I have the following dataframe named df:
id | letter |
---|---|
1 | x,y |
2 | z |
3 | a |
The mapping condition is {'x' : 1, 'z' : 2, 'ELSE' : 0}
My desired output dataframe should look like this:
id | letter | map |
---|---|---|
1 | x,y | 1 |
2 | z | 2 |
3 | a | 0 |
In other words, if any of the letters in column letter is x, then column map should be 1.
Without iterating through each row of the dataframe, is there any way to do that?
CodePudding user response:
You can use
pure pandas
cond = {'x': 1, 'z': 2, 'ELSE': 0}
df['map'] = (df['letter']
             .str.split(',').explode()
             .map(lambda x: cond.get(x, cond['ELSE']))
             .groupby(level=0).max()
            )
If a row contains several letters, this keeps the maximum of their mapped values.
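To see what each step of the chain contributes, it can help to run it one stage at a time. The sketch below rebuilds the question's df (the exact construction is an assumption) and stores the intermediate results under the illustrative names `tokens` and `mapped`, which do not appear in the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({"id": [1, 2, 3], "letter": ["x,y", "z", "a"]})
cond = {"x": 1, "z": 2, "ELSE": 0}

# One row per letter; the original row label is repeated in the index.
tokens = df["letter"].str.split(",").explode()

# Look up each letter, falling back to the ELSE value when it is not a key.
mapped = tokens.map(lambda x: cond.get(x, cond["ELSE"]))

# Collapse back to one value per original row (index level 0).
df["map"] = mapped.groupby(level=0).max()
```

Because explode keeps the original index labels, the grouped result aligns with df on assignment without any explicit merge.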
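Alternative that keeps the first valid match instead:
df['map'] = (df['letter']
             .str.split(',').explode()
             .map(cond)                  # letters missing from cond become NaN
             .groupby(level=0).first()   # first non-null value per original row
             .fillna(cond['ELSE'])
             .astype(int)                # replaces downcast='infer', which is deprecated in recent pandas versions
            )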
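The two variants only differ when a row holds more than one mapped letter. A small sketch, using hypothetical rows that are not in the question, shows where the results diverge; `by_max` and `by_first` are illustrative names.

```python
import pandas as pd

cond = {"x": 1, "z": 2, "ELSE": 0}
# Hypothetical rows chosen so that letter order matters.
s = pd.Series(["z,x", "x,z", "a,b"])

tokens = s.str.split(",").explode()

# Variant 1: largest mapped value in the row.
by_max = (tokens.map(lambda t: cond.get(t, cond["ELSE"]))
          .groupby(level=0).max())

# Variant 2: value of the first letter that has a mapping.
by_first = (tokens.map(cond)           # unmapped letters become NaN
            .groupby(level=0).first()  # first non-null value per row
            .fillna(cond["ELSE"])
            .astype(int))

print(pd.DataFrame({"letter": s, "by_max": by_max, "by_first": by_first}))
```

For "x,z" the max variant yields 2 while the first-match variant yields 1, so the choice depends on whether letter order or value size should win.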
list comprehension
Alternatively, use a list comprehension. Here the first valid match is used:
cond = {'x' : 1, 'z' : 2, 'ELSE' : 0}
df['map'] = [next((cond[x] for x in s.split(',') if x in cond),
cond['ELSE']) for s in df['letter']]
   id letter  map
0   1    x,y    1
1   2      z    2
2   3      a    0
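If the one-liner is hard to read, the same logic can be pulled into a small named function. This is only a sketch: `first_match` is a hypothetical helper name, and the frame is rebuilt here under the assumption that it matches the question.

```python
import pandas as pd


def first_match(letters, mapping, default):
    """Return the mapped value of the first comma-separated letter found in mapping."""
    # next() takes the first item the generator yields, or default if it yields none.
    return next((mapping[x] for x in letters.split(",") if x in mapping), default)


df = pd.DataFrame({"id": [1, 2, 3], "letter": ["x,y", "z", "a"]})
cond = {"x": 1, "z": 2, "ELSE": 0}

df["map"] = [first_match(s, cond, cond["ELSE"]) for s in df["letter"]]
```

Note that this still loops over the rows in Python; it just avoids an explicit row-by-row loop over the dataframe.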
CodePudding user response:
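Use np.select:
import numpy as np
# str.contains does a substring (regex) match, which is fine for single-letter values
cond1 = df['letter'].str.contains('x')
cond2 = df['letter'].str.contains('z')
# assign returns a new dataframe; df itself is left unchanged
df.assign(map=np.select([cond1, cond2], [1, 2], 0))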
output:
   id letter  map
0   1    x,y    1
1   2      z    2
2   3      a    0
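One caveat with str.contains is that it matches substrings, so a value such as "xy" would also count as containing x. If the letters can be longer than one character, a whole-item match may be safer. The sketch below assumes comma-separated items without surrounding spaces; `has_token` is a hypothetical helper and the "xy" row is added only for illustration.

```python
import re

import numpy as np
import pandas as pd

# The last row is hypothetical, added to show the substring problem.
df = pd.DataFrame({"id": [1, 2, 3, 4], "letter": ["x,y", "z", "a", "xy"]})


def has_token(series, token):
    """True where token appears as a complete comma-separated item."""
    # Non-capturing groups anchor the token to the string edges or to commas.
    pattern = rf"(?:^|,){re.escape(token)}(?:,|$)"
    return series.str.contains(pattern, regex=True)


# Conditions are checked in order, so x takes priority over z.
out = df.assign(map=np.select(
    [has_token(df["letter"], "x"), has_token(df["letter"], "z")],
    [1, 2],
    default=0,
))
```

With this pattern the "xy" row falls through to the default 0, whereas a plain str.contains('x') would map it to 1.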