I am trying to do masking for a column and need to mask based on the number of characters in a single cell i

Time:10-06

In the code below I am trying to do full masking. I achieved full masking, but the problem is that the code uses a static "maskvalue", and it needs to be dynamic, based on the length of each value in the cells of the chosen column.

import pandas as pd

def masking(filename, columnname, value):
    # static mask: always 9 characters, whatever the cell length is
    maskvalue = "XXXXXXXXX"
    column_dataset1 = pd.read_csv(filename)
    print(column_dataset1)

    if value == 0:
        # mask the entire cell value
        maskvalue = "XXXXXXXXX"
        column_dataset1[columnname] = column_dataset1[columnname].astype(str).str[:0] + maskvalue
        # print(column_dataset1)

    elif value == '':
        column_dataset1[columnname] = column_dataset1[columnname].astype(str).str[:0] + maskvalue
        print(column_dataset1)

    else:
        # drop the last `value` characters and append the mask
        column_dataset1[columnname] = column_dataset1[columnname].astype(str).str[:-value] + maskvalue
        print(column_dataset1)

masking("path/to/file", "phonenumber", 0)

For example I am using below data:

sno,Name,Type 1,Type 2,phonenumber
1,Bulbasaur,Grass,Poison,987654321256464684846646464611631646466464
2,Ivysaur,Grass,Poison,98765432121314564645663114646464666432016364
3,Venusaur,Grass,Poison,9876543212
3,VenusaurMega Venusaur,Grass,Poison,9876543212
4,Charmander,Fire,Flying,9876543212

and I am getting output like this:

sno                       Name Type 1  Type 2 phonenumber
0     1                  Bulbasaur  Grass  Poison   XXXXXXXXX
1     2                    Ivysaur  Grass  Poison   XXXXXXXXX
2     3                   Venusaur  Grass  Poison   XXXXXXXXX
3     3      VenusaurMega Venusaur  Grass  Poison   XXXXXXXXX
4     4                 Charmander   Fire  Flying   XXXXXXXXX

Here, if I select the column "Type 1", the mask should be "XXXXX", since "Grass" has 5 characters. Expected output:

sno,Name,Type 1,Type 2,phonenumber
1,Bulbasaur,XXXXX,Poison,987654321256464684846646464611631646466464
2,Ivysaur,XXXXX,Poison,98765432121314564645663114646464666432016364
3,Venusaur,XXXXX,Poison,9876543212
3,VenusaurMega Venusaur,XXXXX,Poison,9876543212
4,Charmander,XXXX,Flying,9876543212

Note: Masking should be done on both string and integer columns

CodePudding user response:

How about doing it this way?

column_dataset1[columnname] = ['X'*len(i) for i in column_dataset1[columnname].astype(str)]
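If you also want to keep the "mask only the last N characters" branch from the question, the same idea can be extended. This is only a rough sketch: the helper names mask_text and mask_column are made up for illustration, and it assumes that value means "number of trailing characters to mask", with 0 or '' meaning "mask everything", as in the question's code.

```python
import pandas as pd

def mask_text(text, value=0):
    # Hypothetical helper: number of characters to mask.
    # 0 or '' means "mask the whole string"; otherwise mask at most
    # `value` trailing characters (never more than the string has).
    n = len(text) if not value else min(value, len(text))
    return text[:len(text) - n] + "X" * n

def mask_column(df, columnname, value=0):
    # Work on a copy so the caller's DataFrame is left untouched.
    out = df.copy()
    # astype(str) lets the same code handle string and integer columns.
    out[columnname] = out[columnname].astype(str).map(lambda s: mask_text(s, value))
    return out

# Small in-memory sample based on the question's data
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Charmander"],
    "Type 1": ["Grass", "Fire"],
    "phonenumber": [9876543212, 9876543212],
})

print(mask_column(df, "Type 1"))          # Type 1 becomes XXXXX and XXXX
print(mask_column(df, "phonenumber", 4))  # phonenumber becomes 987654XXXX
```

Note that missing values would turn into the string "nan" after astype(str) and get masked as "XXX", so you may want to handle them separately.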