Iterate over df and replace values with a dict

Time:10-29

I have a situation similar to this example. I have a pandas dataframe with 5 rows and 5 columns.

The df contains only 0s and 1s:

Sample dataframe

I also have a dict that tells me, for example, that the value in the 1st column is 'G' if it is 0 or 'A' if it is 1, like this one:

dict = {0: {'0': 'G', '1': 'A'},
        1: {'0': 'G', '1': 'A'},
        2: {'0': 'T', '1': 'A'},
        3: {'0': 'G', '1': 'A'},
        4: {'0': 'A', '1': 'C'},
        5: {'0': 'C', '1': 'A'}}

So my question is: how can I iterate over the rows and columns to replace the 0s and 1s in my df with the values from the dict?

Expected result for the first two rows:

| A | A | A | A | A | A |
|:--|:--|:--|:--|:--|--:|
| G | G | T | G | A | C |

CodePudding user response:

You can do this with replace:

>>> df.astype(str).replace(my_dict)

   0  1  2  3  4  5
0  A  A  A  A  A  A
1  G  G  T  G  A  C
2  G  G  T  G  A  C
3  G  G  T  G  A  A
4  A  A  A  A  A  C

As an aside, don't name your dictionary dict, since that shadows the built-in type. I've used my_dict in my example.
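For anyone who wants to run this end to end, here is a minimal sketch. The two-row frame is made-up sample data, chosen so that it reproduces the first two expected rows from the question; it is not the asker's actual df.

```python
import pandas as pd

# Hypothetical sample data: 0/1 values in six integer-labelled columns
df = pd.DataFrame([[1, 1, 1, 1, 0, 1],
                   [0, 0, 0, 0, 0, 0]])

my_dict = {0: {'0': 'G', '1': 'A'},
           1: {'0': 'G', '1': 'A'},
           2: {'0': 'T', '1': 'A'},
           3: {'0': 'G', '1': 'A'},
           4: {'0': 'A', '1': 'C'},
           5: {'0': 'C', '1': 'A'}}

# Cast to str so the cell values match the string keys, then pass the nested
# dict, which replace reads as {column label: {old value: new value}}
result = df.astype(str).replace(my_dict)
print(result)
```

The cast to str matters here: the inner keys are '0' and '1', so integer cells would not match them.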

CodePudding user response:

You can also do it this way:

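for i in range(6):
    # Look up each value of column i (as a string key) in that column's mapping;
    # assigning the whole column lets its dtype change from int to str
    df[df.columns[i]] = df.iloc[:, i].astype(str).map(my_dict[i])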

So you can target only the columns you want.
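To make that concrete, here is a small sketch that converts only a chosen subset of columns. The sample frame and the `target_columns` list are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: 0/1 values in six integer-labelled columns
df = pd.DataFrame([[1, 1, 1, 1, 0, 1],
                   [0, 0, 0, 0, 0, 0]])

my_dict = {0: {'0': 'G', '1': 'A'},
           1: {'0': 'G', '1': 'A'},
           2: {'0': 'T', '1': 'A'},
           3: {'0': 'G', '1': 'A'},
           4: {'0': 'A', '1': 'C'},
           5: {'0': 'C', '1': 'A'}}

# Hypothetical choice: convert only columns 0 and 2, leave the others as 0/1
target_columns = [0, 2]
for col in target_columns:
    # Cast to str so the values match the string keys of the mapping
    df[col] = df[col].astype(str).map(my_dict[col])

print(df)
```

Note that Series.map returns NaN for any value that has no key in the mapping, so unexpected values show up as missing rather than raising an error.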

CodePudding user response:

import random
import pandas as pd

data = [[random.randint(0, 1) for i in range(5)] for j in range(5)]
df = pd.DataFrame(data).astype(str)


mapper = {0: {'0': 'G', '1': 'A'},
          1: {'0': 'G', '1': 'A'},
          2: {'0': 'T', '1': 'A'},
          3: {'0': 'G', '1': 'A'},
          4: {'0': 'A', '1': 'C'},
          5: {'0': 'C', '1': 'A'}}

def maps(col):
    # With axis=0, apply passes one column at a time, so col.name is the
    # column label and selects that column's mapping
    transform = mapper[col.name]
    return col.map(transform)

df = df.apply(maps, axis=0)
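Since the mapping is keyed by column, it matters what `.name` refers to inside the function handed to `apply`. A small sketch on made-up toy data, showing what pandas passes under each axis:

```python
import pandas as pd

# Toy frame for illustration only: 3 rows, 2 columns
df = pd.DataFrame([['1', '0'], ['0', '0'], ['1', '1']])

# axis=0: the function receives each column; .name is the column label
column_names = df.apply(lambda s: s.name, axis=0).tolist()

# axis=1: the function receives each row; .name is the row index
row_names = df.apply(lambda s: s.name, axis=1).tolist()

print(column_names)  # [0, 1]
print(row_names)     # [0, 1, 2]
```

A per-column lookup therefore needs the column label, which is what `.name` gives under axis=0.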

