This article came about after one of my insurance prospects challenged me to mask the multi-record-format medical file that he exchanges with other insurance companies and healthcare facilities (the B2 standard in France).
The hardest part of the work was understanding the B2 standard before starting, but I'm glad I had his help to clarify the file structure, which I will try to explain here.
This picture highlights the structure of a series of records inside the files that healthcare companies exchange. Here is an explanation of each section.
Type 000 : Start of file or record
Type 1 : Start of batch
Type 2 : Start of invoice
    A : Start
    S : Following
    M : Invoice supplement
Type 3 and 3n : Bill for hospital or clinics (CP)
Type 4 and 4n : Medical act
    A : Start
    S : Following
Type 5 : Invoice end
Type 6 : Batch end
Type 999 : End of file/record
Now that you've got an example of the French complexity :), let's try to mask a sample healthcare file.
Connect to the masking console and create a connector pointing to the file to be masked.
Create the ruleset using the previous connector and select the file to be masked.
Now let's teach our masking engine how to recognize the content of this file.
Start by creating an empty file; I'm saving it as "ff_vide.txt".
Once done, import the file into the masking engine under the "Settings" tab, "File Format" section, shown here.
Edit the file in the ruleset "B2_RS" and set its file format to the "ff_vide.txt" we just uploaded.
Let's now populate the inventory of this ruleset with format files that will help the masking engine recognize the fields on each line of the file to be masked.
Sample output of the healthcare file records we want to mask.
Let's explain the content shown above. Note that we are interested in masking the SSN on each line starting with 2, 4 or 5, and on the special line type "2 with the character S at position 36".
2750010407 15903xxxxxx0786000914329A 1170812 9192105064750010407170812 100000000000000000 5903301 0015050027000000
This line is of type 2 and has the SSN at position 12 for a length of 13 characters.
2750010407 15903xxxxxx0786000914329S 15903xxxxxx0786000000 000000000
This special line is of type 2S and has the SSN at both position 12 and position 50, each for a length of 13 characters.
4750010407 15903xxxxxx0786000914329S01S 0002128 0000000 0000000000000000000000000000 00000000750560305010000
5750010407 15903xxxxxx0786000914329 006 000133000000931000003990000039904 T1
These two lines are of types 4 and 5 and have the SSN at position 12 for a length of 13 characters.
Notice that all lines of types 4, 4A, 4S, 4M, 5, 5A, 5S or 5M have the SSN at the same position, 12, for a length of 13 characters.
Now that you have an idea of the file structure, let's move on and create the corresponding format files for each record type or set of record types.
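To make the position arithmetic concrete, here is a minimal Python sketch that reads a field by its 1-based position, the way positions are counted in this post. The helper name `field` is hypothetical, and the sample line is copied as it appears above (spacing in a real B2 file may differ).

```python
def field(line: str, start: int, length: int) -> str:
    """Return `length` characters beginning at the 1-based position `start`."""
    begin = start - 1  # convert the 1-based position to a Python index
    return line[begin:begin + length]


# Positions taken from the text: SSN at position 12, length 13.
SSN_START, SSN_LENGTH = 12, 13

# Beginning of the type 2 sample line, as shown in this post.
sample = "2750010407 15903xxxxxx0786000914329A"

print(field(sample, 1, 1))                    # record type -> '2'
print(field(sample, SSN_START, SSN_LENGTH))   # SSN field -> '15903xxxxxx07'
print(field(sample, 36, 1))                   # sub-type marker -> 'A'
```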
[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_2.txt
champ1,1
champ2,2
champ3,12
champ4,25

[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_s2.txt
type,1
ch1,2
ssn1,12
comp1,25
typea,36
comp2,37
ssn2,50
eof,63

[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_4.txt
champ1,1
champ2,2
champ3,12
champ4,25

[delphixdemo.DESKTOP-6JAVIRB] cat ff_b2_5.txt
champ1,1
champ2,2
champ3,12
champ4,25

[delphixdemo.DESKTOP-6JAVIRB] cat ff_all.txt
ch,1
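As a rough illustration of what these format files describe, the sketch below parses one and splits a record into named fields. It assumes the second column is a 1-based start position and that each field runs up to the start of the next one; that reading matches the positions quoted in this post, but it is an assumption and not taken from the masking engine's documentation. The function names are hypothetical.

```python
def parse_format(text: str) -> list[tuple[str, int]]:
    """Parse 'name,start' lines into (name, 1-based start) pairs."""
    fields = []
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines
        name, start = raw.split(",")
        fields.append((name.strip(), int(start)))
    return fields


def split_record(line: str, fields: list[tuple[str, int]]) -> dict[str, str]:
    """Cut a fixed-width line into fields; the last field runs to end of line."""
    record = {}
    for i, (name, start) in enumerate(fields):
        # Assumed rule: a field ends right before the next field starts.
        end = fields[i + 1][1] - 1 if i + 1 < len(fields) else len(line)
        record[name] = line[start - 1:end]
    return record


# Content of ff_b2_2.txt as listed above.
FF_B2_2 = "champ1,1\nchamp2,2\nchamp3,12\nchamp4,25\n"

fields = parse_format(FF_B2_2)
record = split_record("2750010407 15903xxxxxx0786000914329A", fields)
print(record["champ3"])  # the SSN field
```

Under this reading, `champ3` covers positions 12 to 24, which is the 13-character SSN.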
Once done, define the record types in the inventory of the "B2_RS" ruleset to tell the masking engine how to recognize each record line.
Record type 2 :
Record type 2S :
This record type looks for the value 2 at position 1 (length 1 character) and the value "S" at position 36 on every line of the file.
Record type 4 :
This record type looks for the value 4 at position 1 for a length of 1 character on every line of the file.
Record type 5 :
This record type looks for the value 5 at position 1 for a length of 1 character on every line of the file.
Record type (all records) :
This record type handles all remaining lines that do not match any of the preceding record types.
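The matching rules described above can be sketched in a few lines of Python. This only illustrates the logic (first character, plus the "S" test at position 36 for type 2); it is not how the masking engine implements record types, and the function name and returned labels are hypothetical.

```python
def record_type(line: str) -> str:
    """Classify a B2 line following the record type rules described in this post."""
    first = line[:1]          # value at position 1, length 1
    if first == "2":
        # Position 36 (1-based) is index 35; "S" marks the special 2S record.
        return "2S" if line[35:36] == "S" else "2"
    if first in ("4", "5"):
        return first
    return "all"              # catch-all for every other line


# Beginnings of the sample lines shown earlier in this post.
samples = [
    "2750010407 15903xxxxxx0786000914329A",
    "2750010407 15903xxxxxx0786000914329S",
    "4750010407 15903xxxxxx0786000914329S01S",
    "5750010407 15903xxxxxx0786000914329",
]
for s in samples:
    print(record_type(s), s)
```

Note that the 2S test has to be evaluated before falling back to plain type 2, since both start with the same character.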
Voilà, let's define a masking algorithm for the SSN fields (for quick and easy testing, I apply a random algorithm to mask the SSNs).
Ready to run the test: create a masking job to mask the test file.
After running the job, here are the results.
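To show what a random replacement does to a fixed-width field, here is a small stand-alone sketch. It is not the masking engine's algorithm: it simply overwrites the 13 characters at position 12 with random digits while keeping the line length and all other characters unchanged. The name `mask_field` and the seeded generator are assumptions made for the example.

```python
import random


def mask_field(line: str, start: int, length: int, rng: random.Random) -> str:
    """Replace `length` characters at 1-based `start` with random digits."""
    begin, end = start - 1, start - 1 + length
    if len(line) < end:
        raise ValueError("line is too short to contain the field")
    masked = "".join(rng.choice("0123456789") for _ in range(length))
    # Rebuild the line so everything outside the field stays untouched.
    return line[:begin] + masked + line[end:]


rng = random.Random(0)  # seeded only to make the example repeatable
sample = "5750010407 15903xxxxxx0786000914329"
masked_line = mask_field(sample, 12, 13, rng)
print(sample)
print(masked_line)
```

For the 2S record, the same function would be applied twice, once at position 12 and once at position 50.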
Values before masking file
Values after applying masking job
As this output shows, we masked the medical record file by masking the ssn number fields on each line.
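A quick way to double-check a result like this is to compare each line before and after masking and confirm that only the SSN positions changed. The sketch below does that for one line; the function names are hypothetical, and the `after` value is made up for illustration, not the real output of the job.

```python
def changed_positions(before: str, after: str) -> list[int]:
    """Return the 1-based positions where two equal-length lines differ."""
    if len(before) != len(after):
        raise ValueError("masked line changed length")
    return [i + 1 for i, (b, a) in enumerate(zip(before, after)) if b != a]


def only_ssn_changed(before: str, after: str, ranges=((12, 13),)) -> bool:
    """True if every difference falls inside the (start, length) ranges given."""
    allowed = set()
    for start, length in ranges:
        allowed.update(range(start, start + length))
    return all(pos in allowed for pos in changed_positions(before, after))


before = "5750010407 15903xxxxxx0786000914329"
after = "5750010407 123456789012386000914329"  # invented masked value

print(only_ssn_changed(before, after))
```

For 2S records, passing `ranges=((12, 13), (50, 13))` would cover both SSN fields.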