I'm trying to iterate over a column in a DataFrame. When a value matches a key in my dictionary, the value in another column should be replaced with the dictionary value for that key.
import pandas as pd

df = pd.DataFrame({'id': ['123', '456', '789'], 'Full': ['Yes', 'No', 'Yes'], 'Cat': ['', '', '']})
cats = {'123': 'A', '456': 'B', '789': 'C'}
for val in df.id:
    for key, cat in cats.items():
        if key == val:
            df.Cat.loc[(df.Full == 'Yes')] = cat
df
    id Full Cat
0  123  Yes   C
1  456   No
2  789  Yes   C
I would expect id 123 to have a Cat of 'A', but instead it ends up with 'C'.
Can anyone explain why it isn't iterating over the keys in the dictionary?
CodePudding user response:
You can use Series.replace, pass it the dictionary, and assign the result to the Cat column:
>>> df['Cat'] = df.id.replace(cats)
#output:
    id Full Cat
0  123  Yes   A
1  456   No   B
2  789  Yes   C
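One thing worth knowing about Series.replace: ids that are not keys in the dictionary are left unchanged, whereas Series.map turns them into NaN. A minimal sketch of the difference, where the id '000' is made up purely for illustration:

```python
import pandas as pd

# '000' is a hypothetical id with no entry in cats (not from the question)
ids = pd.Series(['123', '456', '000'])
cats = {'123': 'A', '456': 'B', '789': 'C'}

replaced = ids.replace(cats)  # unmatched values are kept as they are
mapped = ids.map(cats)        # unmatched values become NaN

print(replaced.tolist())      # ['A', 'B', '000']
print(mapped.isna().tolist()) # [False, False, True]
```

Which behaviour is preferable depends on whether an unmatched id should survive in the Cat column or be flagged as missing.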
Or, if you intend to replace only in the rows where Full is Yes, one way is to apply a function with axis=1 and implement the logic for each row:
>>> df['Cat'] = df.apply(lambda x: cats.get(x.id, '') if x.Full == 'Yes' else '',
...                      axis=1)
    id Full Cat
0  123  Yes   A
1  456   No
2  789  Yes   C
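As for why the loop in the question ends with 'C' everywhere: the loop does visit every key, but the mask df.Full == 'Yes' does not depend on the current id, so each match overwrites all 'Yes' rows and the last match wins. The sketch below records the Cat column after each match. It writes through df.loc instead of the chained df.Cat.loc[...] form (an assumption made here so the assignment reliably reaches the frame), and the names history, fixed, and row_mask are illustrative only:

```python
import pandas as pd

def make_df():
    # Same data as in the question
    return pd.DataFrame({'id': ['123', '456', '789'],
                         'Full': ['Yes', 'No', 'Yes'],
                         'Cat': ['', '', '']})

cats = {'123': 'A', '456': 'B', '789': 'C'}

# Original logic: the mask ignores `val`, so every match hits all 'Yes' rows
df = make_df()
history = []
for val in df['id']:
    for key, cat in cats.items():
        if key == val:
            df.loc[df['Full'] == 'Yes', 'Cat'] = cat
            history.append(df['Cat'].tolist())

print(history)  # [['A', '', 'A'], ['B', '', 'B'], ['C', '', 'C']]

# Loop-based fix: restrict the mask to the row(s) with the current id
fixed = make_df()
for val in fixed['id']:
    if val in cats:
        row_mask = (fixed['id'] == val) & (fixed['Full'] == 'Yes')
        fixed.loc[row_mask, 'Cat'] = cats[val]

print(fixed['Cat'].tolist())  # ['A', '', 'C']
```

The loop-based fix is shown only to make the cause visible; the vectorized approaches in the answers are usually the better choice.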
CodePudding user response:
To update only the filtered rows of the column, use dict.get:
mask = df.Full == 'Yes'
df.loc[mask, 'Cat'] = df.loc[mask, 'id'].apply(lambda x: cats.get(x, ''))
print(df)
    id Full Cat
0  123  Yes   A
1  456   No
2  789  Yes   C
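A possible variation on the masked assignment above, swapping apply with a lambda for Series.map plus fillna. This is a sketch under the assumption that unmatched ids in 'Yes' rows should end up as an empty string:

```python
import pandas as pd

df = pd.DataFrame({'id': ['123', '456', '789'],
                   'Full': ['Yes', 'No', 'Yes'],
                   'Cat': ['', '', '']})
cats = {'123': 'A', '456': 'B', '789': 'C'}

mask = df['Full'] == 'Yes'
# map looks each id up in the dict; ids without a key become NaN,
# which fillna('') then turns into an empty string
df.loc[mask, 'Cat'] = df.loc[mask, 'id'].map(cats).fillna('')

print(df['Cat'].tolist())  # ['A', '', 'C']
```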
If some ids may have no match in the dict and you want None for those rows, use:
cats = {'123':'A', '456':'B', '7890':'C'}
mask = df.Full == 'Yes'
df.loc[mask, 'Cat'] = df.loc[mask, 'id'].apply(cats.get)
print(df)
    id Full  Cat
0  123  Yes    A
1  456   No
2  789  Yes None