I have two DataFrames, shown below. I want to compare a column from each DataFrame and, where the values match, add the value of another column to the other DataFrame as a new column. When I run the for loop below, it throws the error shown further down. The same code works fine in a Jupyter notebook but raises an error when executed in VS Code. I don't understand the issue. What is the best way to achieve this?
df1
   id  ids_list
0   1  [126, 238]
1   2  [126, 355]
2   3  [1265, 152, 238]
3   4  [1265, 1529, 2384, 17235]
df2
   from_id  to_id
0        1      2
1        3      1
2        2      1
3        4      2
4        2      3
for y, z in zip(df1['id'], df1['ids_list']):
    df2.loc[df2.from_id == y, 'from_ids'] = z
    df2.loc[df2.to_id == y, 'to_ids'] = z
When executed in a Jupyter notebook, the expected output is produced:
   from_id  to_id  from_ids                   to_ids
0        1      2  [126, 238]                 [126, 355]
1        3      1  [1265, 152, 238]           [126, 238]
2        2      1  [126, 355]                 [126, 238]
3        4      2  [1265, 1529, 2384, 17235]  [126, 355]
4        2      3  [126, 355]                 [1265, 152, 238]
But when running the same code in VS Code, I get the following error:
Must have equal len keys and value when setting with an iterable
CodePudding user response:
Use a mapping with DataFrame.applymap and dict.get, which returns an empty list when there is no match (in pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map):
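# Build a lookup from id to its list of ids
d = dict(zip(df1['id'], df1['ids_list']))
# Fall back to an empty list for ids missing from df1
f = lambda x: d.get(x, [])
df2[['from_ids', 'to_ids']] = df2[['from_id', 'to_id']].applymap(f)
print(df2)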
   from_id  to_id  from_ids                   to_ids
0        1      2  [126, 238]                 [126, 355]
1        3      1  [1265, 152, 238]           [126, 238]
2        2      1  [126, 355]                 [126, 238]
3        4      2  [1265, 1529, 2384, 17235]  [126, 355]
4        2      3  [126, 355]                 [1265, 152, 238]
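As a self-contained sketch of the same idea, the sample frames from the question can be rebuilt and mapped one column at a time. Note that this is a substitution, not the code above: it uses Series.map with a callable instead of applymap, and the way the frames are constructed is an assumption based on the printed data in the question.

```python
import pandas as pd

# Sample data reconstructed from the question's printed output
# (how the original frames were built is an assumption).
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "ids_list": [[126, 238], [126, 355], [1265, 152, 238],
                 [1265, 1529, 2384, 17235]],
})
df2 = pd.DataFrame({"from_id": [1, 3, 2, 4, 2], "to_id": [2, 1, 1, 2, 3]})

# Lookup table: id -> list of ids.
d = dict(zip(df1["id"], df1["ids_list"]))

# Map each id column separately; ids missing from d fall back to an empty list.
df2["from_ids"] = df2["from_id"].map(lambda x: d.get(x, []))
df2["to_ids"] = df2["to_id"].map(lambda x: d.get(x, []))

print(df2)
```

Because the lookup returns the stored list itself, rows that share an id also share the same list object; copy the lists first if they will be modified in place later.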
If missing values (NaN) are acceptable where there is no match, use:
# Series.map with a dict leaves NaN where an id has no entry in d
df2[['from_ids', 'to_ids']] = df2[['from_id', 'to_id']].apply(lambda x: x.map(d))
print(df2)
   from_id  to_id  from_ids                   to_ids
0        1      2  [126, 238]                 [126, 355]
1        3      1  [1265, 152, 238]           [126, 238]
2        2      1  [126, 355]                 [126, 238]
3        4      2  [1265, 1529, 2384, 17235]  [126, 355]
4        2      3  [126, 355]                 [1265, 152, 238]
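The sample data has a match for every id, so the output above does not show what an unmatched id looks like. The following minimal sketch adds a hypothetical id (9, not present in the question's data) to illustrate the missing value, and assigns each column separately for simplicity.

```python
import pandas as pd

# Reduced sample; the id 9 in from_id is hypothetical and has no entry in df1.
df1 = pd.DataFrame({"id": [1, 2], "ids_list": [[126, 238], [126, 355]]})
df2 = pd.DataFrame({"from_id": [1, 9], "to_id": [2, 1]})

# Lookup table: id -> list of ids.
d = dict(zip(df1["id"], df1["ids_list"]))

# Series.map with a dict yields NaN for keys that are not in the dict.
df2["from_ids"] = df2["from_id"].map(d)
df2["to_ids"] = df2["to_id"].map(d)

# True marks the rows whose from_id had no match.
print(df2["from_ids"].isna())
```

Keeping NaN makes unmatched rows easy to find with isna(), whereas the dict.get variant gives every row a list, which can be simpler if later code iterates over the values.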