DELPHIX MASKING AND HEALTHCARE FILE RECORDS



This article came about after one of my insurance prospects challenged me to mask the multi-record-format medical file that he exchanges with other insurance companies and healthcare facilities (the B2 standard in France).

The hardest part of the work was understanding the B2 standard before starting, but I'm glad I had his help to clarify the file structure, which I will try to explain here.




This picture highlights the structure of a series of records inside the files that healthcare companies exchange. Here is an explanation of each section.

Type 000       : Start of file or record
Type 1         : Start of batch
Type 2         : Start of invoice
         A : Start
         S : Following
         M : invoice supplement
Type 3 and 3n  : Bill for Hospital or clinics (CP)
Type 4 and 4n  : Medical act
         A : Start
         S : Following
Type 5     : Invoice end
Type 6     : Batch end
Type 999   : End of file/record


Now that you've got an example of the French complexity :), let's try to mask a sample healthcare file.

Connect to the masking console and create a connector that points to the file to be masked.



Create the ruleset using the previous connector and select the file to be masked.



Now let's teach our masking engine how to recognize the content of this file.

Start by creating an empty file; I'm saving it under the name "ff_vide.txt".

Once done, import the file into the masking engine under the "Settings" tab, in "File Format", as shown here.



Edit the file in the ruleset "B2_RS" and set its file format to the "ff_vide.txt" we just uploaded.



Now let's populate the inventory of this ruleset with format files that help the masking engine recognize the fields on each line of the file to be masked.

Sample output of the healthcare file record we want to mask.





Let's explain the content above. Note that we are interested in masking the SSN on each line starting with 2, 4 or 5, plus the special type 2 line that has the character S at position 36.

2750010407 15903xxxxxx0786000914329A  1170812   9192105064750010407170812   100000000000000000 5903301          0015050027000000

This line is of type 2 and has the SSN at position 12, for a length of 13 characters.

2750010407 15903xxxxxx0786000914329S             15903xxxxxx0786000000                                         000000000
This special line is of type 2S and has the SSN at both position 12 and position 50, each for a length of 13 characters.

4750010407 15903xxxxxx0786000914329S01S 0002128 0000000                0000000000000000000000000000 00000000750560305010000


5750010407 15903xxxxxx0786000914329   006                000133000000931000003990000039904 T1

These two lines are of types 4 and 5 and have the SSN at position 12, for a length of 13 characters.

Note that all lines of types 4, 4A, 4S, 4M, 5, 5A, 5S and 5M have the SSN at the same position 12, for a length of 13 characters.
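To make the positions concrete, here is a minimal Python sketch that reads the fields by their 1-based position. The helper name `field` is my own, and the two strings are only the beginnings of the sample lines above, so treat this as an illustration rather than anything the masking engine runs.

```python
# 1-based start position and length of the SSN field, as described in the text.
SSN_START = 12
SSN_LENGTH = 13

def field(line: str, start: int, length: int) -> str:
    """Return the fixed-width field that begins at 1-based position `start`."""
    return line[start - 1 : start - 1 + length]

# Beginnings of the sample lines above (trailing columns omitted).
type2 = "2750010407 15903xxxxxx0786000914329A"
# The 2S line carries 13 blanks after the "S", then a second SSN at position 50.
type2s = "2750010407 15903xxxxxx0786000914329S" + " " * 13 + "15903xxxxxx0786000000"

print(field(type2, SSN_START, SSN_LENGTH))   # first SSN of the type 2 line
print(field(type2s, 36, 1))                  # the marker character at position 36
print(field(type2s, 50, SSN_LENGTH))         # second SSN of the type 2S line
```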

Now that you have an idea of the file structure, let's move on and create the corresponding format file for each record type or set of record types.
[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_2.txt
champ1,1
champ2,2
champ3,12
champ4,25

[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_s2.txt
type,1
ch1,2
ssn1,12
comp1,25
typea,36
comp2,37
ssn2,50
eof,63

[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_4.txt
champ1,1
champ2,2
champ3,12
champ4,25

[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_5.txt
champ1,1
champ2,2
champ3,12
champ4,25

[delphixdemo.DESKTOP-6JAVIRB] cat ff_all.txt
ch,1
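To check that a format file cuts a line where we expect, the following Python sketch parses the "name,number" lines and splits a record accordingly. It assumes the number is the 1-based starting position of the field and that each field runs up to the start of the next one; that reading matches the positions quoted in this article, but it is my interpretation and not a statement of how the masking engine parses these files. The function names are hypothetical.

```python
def parse_format(text: str) -> list[tuple[str, int]]:
    """Parse 'name,number' lines into (name, number) pairs, skipping blank lines."""
    fields = []
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        name, number = raw.split(",")
        fields.append((name.strip(), int(number)))
    return fields

def split_record(line: str, fields: list[tuple[str, int]]) -> dict[str, str]:
    """Cut a line into named fields; each field ends where the next one starts."""
    record = {}
    for i, (name, start) in enumerate(fields):
        # Assumption: the number is a 1-based start position, not a length.
        end = fields[i + 1][1] - 1 if i + 1 < len(fields) else len(line)
        record[name] = line[start - 1 : end]
    return record

# Content of ff_b2_s2.txt as listed above.
FF_B2_S2 = """type,1
ch1,2
ssn1,12
comp1,25
typea,36
comp2,37
ssn2,50
eof,63
"""

# Beginning of the sample 2S line (trailing columns omitted).
sample_2s = "2750010407 15903xxxxxx0786000914329S" + " " * 13 + "15903xxxxxx0786000000"

record = split_record(sample_2s, parse_format(FF_B2_S2))
for name, value in record.items():
    print(f"{name:6} -> {value!r}")
```

Under that assumption, `ssn1` and `ssn2` both come out as 13-character fields and `typea` holds the single "S" marker.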



Once done, define the record types in the inventory of the "B2_RS" ruleset to tell the masking engine how to recognize the record lines.

Record type 2 :




Here we look for the value 2 at position 1, for a length of 1 character, on every line in the file.

Record type 2S :




This record type looks for the value 2 at position 1, for a length of 1 character, together with the value "S" at position 36, on every line in the file.

Record type 4 :




This record type looks for the value 4 at position 1, for a length of 1 character, on every line in the file.

Record type 5 :




This record type looks for the value 5 at position 1, for a length of 1 character, on every line in the file.

Record type (all records) : 




This record type handles all the remaining lines that match none of the preceding record types.
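The matching rules above can be summed up in a few lines of Python. This is only a sketch of the logic as I described it, with a hypothetical function name; in particular, testing 2S before plain 2 is a choice made here so that the more specific rule wins, and I am not claiming the engine evaluates record types in that order.

```python
def record_type(line: str) -> str:
    """Classify a line using the record-type rules described in this article."""
    first = line[:1]                      # value at position 1, length 1
    if first == "2":
        # The special 2S record also carries "S" at position 36.
        return "2S" if line[35:36] == "S" else "2"
    if first in ("4", "5"):
        return first
    return "all"                          # every other line

# Beginnings of the sample lines shown earlier (trailing columns omitted).
samples = [
    "2750010407 15903xxxxxx0786000914329A",
    "2750010407 15903xxxxxx0786000914329S",
    "4750010407 15903xxxxxx0786000914329S01S",
    "5750010407 15903xxxxxx0786000914329",
]
for line in samples:
    print(record_type(line), "<-", line)
```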

Voilà, let's define a masking algorithm for the SSN fields (for ease and quick testing, I apply a random algorithm to mask the SSNs).
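For readers who want to see the idea outside the console, here is a rough Python sketch of random masking on fixed-width fields. It is not the algorithm shipped with the masking engine: the digit-for-character replacement, the seeded generator and the function names are all assumptions made for illustration.

```python
import random

DIGITS = "0123456789"

def mask_field(line: str, start: int, length: int, rng: random.Random) -> str:
    """Overwrite `length` characters from 1-based `start` with random digits."""
    if len(line) < start - 1 + length:
        return line                       # line too short to hold the field
    replacement = "".join(rng.choice(DIGITS) for _ in range(length))
    return line[: start - 1] + replacement + line[start - 1 + length :]

def mask_line(line: str, rng: random.Random) -> str:
    """Mask the SSN fields of type 2, 2S, 4 and 5 lines; leave other lines alone."""
    first = line[:1]
    if first not in ("2", "4", "5"):
        return line
    masked = mask_field(line, 12, 13, rng)
    if first == "2" and line[35:36] == "S":
        masked = mask_field(masked, 50, 13, rng)   # second SSN of a 2S line
    return masked

rng = random.Random(2017)                 # fixed seed so a test run is repeatable
# Beginning of the sample 2S line (trailing columns omitted).
line_2s = "2750010407 15903xxxxxx0786000914329S" + " " * 13 + "15903xxxxxx0786000000"
print(mask_line(line_2s, rng))
```

Each field is drawn independently, so the two SSNs on a 2S line will no longer match after masking; a deterministic mapping would be needed if that consistency matters.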



We're ready to run the test: create a masking job to mask the test file.



After running the job, here are the results.

Values before masking the file



Values after applying the masking job



As this output shows, we masked the medical record file by masking the SSN fields on each line.
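If you prefer to check the result with a script rather than by eye, a small Python sketch like the one below can compare a line before and after masking and confirm that only the SSN columns moved. The function names are hypothetical, and the "after" value is invented for the example; it is not the actual output of the job.

```python
def changed_positions(before: str, after: str) -> list[int]:
    """Return the 1-based positions at which two equal-length lines differ."""
    if len(before) != len(after):
        raise ValueError("masking must not change the line length")
    return [i + 1 for i, (a, b) in enumerate(zip(before, after)) if a != b]

def only_ssn_changed(before: str, after: str, ssn_starts=(12,), length=13) -> bool:
    """True when every difference falls inside one of the SSN fields."""
    allowed = {p for start in ssn_starts for p in range(start, start + length)}
    return set(changed_positions(before, after)) <= allowed

# Beginning of the sample type 2 line, and an invented masked version of it.
before = "2750010407 15903xxxxxx0786000914329A"
after = before[:11] + "1234567890123" + before[24:]

print(changed_positions(before, after))
print(only_ssn_changed(before, after))
```

For a 2S line, pass `ssn_starts=(12, 50)` so that the second SSN field is also allowed to change.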











