I have a dataframe whose columns contain arrays:
id_food1  id_food2
[1]       NaN
[2]       NaN
[2 3]     [1]
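A minimal reconstruction of this frame (here the cells are Python lists of ints; in my real data they may instead be stored as strings like '[2 3]'):

import numpy as np
import pandas as pd

# hypothetical reconstruction of the frame above
data = pd.DataFrame({'id_food1': [[1], [2], [2, 3]],
                     'id_food2': [np.nan, np.nan, [1]]})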
I want to map these columns using a dict with these values:
food_dict = {1: 'cake',
             2: 'choco',
             3: 'cream'}
I want to have something like this:
id_food1  id_food2  id_food1_name   id_food2_name
[1]       NaN       [cake]          0
[2]       NaN       [choco]         0
[2 3]     [1]       [choco, cream]  [cake]
I know how to do it when the column is not an array, like this:
data['id_food1_name'] = data['id_food1'].map(food_dict)
but I am unable to do it when the column holds arrays.
Any help will be highly appreciated.
CodePudding user response:
Use Series.explode to flatten the values, map them, and finally aggregate the mapped values back into lists per index:
data['id_food1_name'] = (data['id_food1'].explode().astype(float)
                                         .map(food_dict)
                                         .groupby(level=0)
                                         .agg(list))
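This works because explode repeats the original row index once per list element, so groupby(level=0) can collect the mapped names back into one list per row. A quick sketch (assuming the cells already hold lists of ints, in which case the astype(float) step can be dropped):

import pandas as pd

food_dict = {1: 'cake', 2: 'choco', 3: 'cream'}
data = pd.DataFrame({'id_food1': [[1], [2], [2, 3]]})

s = data['id_food1'].explode()   # index becomes 0, 1, 2, 2 with values 1, 2, 2, 3
print(s.map(food_dict).groupby(level=0).agg(list))
# 0            [cake]
# 1           [choco]
# 2    [choco, cream]
# Name: id_food1, dtype: object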
For all columns:
# converting strings to lists
import ast
import numpy as np

c = ['id_food1', 'id_food2']

def f(x):
    try:
        return ast.literal_eval(x)
    except (ValueError, SyntaxError, TypeError):
        return np.nan

data[c] = data[c].applymap(f)
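Note: on pandas 2.1+ DataFrame.applymap is deprecated in favour of the element-wise DataFrame.map, so on recent versions the last line would be:

data[c] = data[c].map(f)   # element-wise map, pandas >= 2.1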
Alternative solution for converting to lists (unlike ast.literal_eval, this also parses the space-separated form [2 3]):
data[c] = data[c].stack().str.strip('[]').str.split().unstack()
And then map the ids to names:

for x in c:
    f = lambda lst: [food_dict.get(int(y)) for y in lst if int(y) in food_dict]
    data[f'{x}_name'] = data[x].dropna().apply(f)
    data[f'{x}_name'] = data[f'{x}_name'].fillna(0)
print(data)

  id_food1 id_food2   id_food1_name id_food2_name
0      [1]      NaN          [cake]             0
1      [2]      NaN         [choco]             0
2   [2, 3]      [1]  [choco, cream]        [cake]
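Putting it all together, a self-contained sketch of this answer (assuming the ids arrive as the space-separated strings from the question, and using the split-based conversion; data and food_dict are reconstructed from the question):

import numpy as np
import pandas as pd

data = pd.DataFrame({'id_food1': ['[1]', '[2]', '[2 3]'],
                     'id_food2': [np.nan, np.nan, '[1]']})
food_dict = {1: 'cake', 2: 'choco', 3: 'cream'}

c = ['id_food1', 'id_food2']
# strip the brackets and split on whitespace -> lists of id *strings*
data[c] = data[c].stack().str.strip('[]').str.split().unstack()

for x in c:
    # int(y) converts the string ids before the dict lookup
    f = lambda lst: [food_dict.get(int(y)) for y in lst if int(y) in food_dict]
    data[f'{x}_name'] = data[x].dropna().apply(f)
    data[f'{x}_name'] = data[f'{x}_name'].fillna(0)

print(data)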
CodePudding user response:
You could use a dict comprehension in which you explode, map, groupby and agg(list) (with an if-else to convert NaN to 0); then assign it back to df.
df = (df.assign(**{f'id_food{i}_name': df[f'id_food{i}']
                       .explode()
                       .map(food_dict)
                       .groupby(level=0)
                       .agg(lambda x: x.tolist() if x.notna().all() else 0)
                   for i in range(1, df[['id_food1', 'id_food2']].shape[1] + 1)}))
Output:
  id_food1 id_food2   id_food1_name id_food2_name
0      [1]      NaN          [cake]             0
1      [2]      NaN         [choco]             0
2   [2, 3]      [1]  [choco, cream]        [cake]
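The range-based keys work because the columns follow the id_food{i} naming pattern; a variant of the same comprehension (my rewording, not from the answer) that iterates over the column names directly is a bit more general:

cols = ['id_food1', 'id_food2']
df = df.assign(**{f'{col}_name': df[col]
                      .explode()
                      .map(food_dict)
                      .groupby(level=0)
                      .agg(lambda x: x.tolist() if x.notna().all() else 0)
                  for col in cols})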