I'm looking for an alternative for 'lookup' function in pandas dataframe.
I've seen several examples including this documentation, but I can't apply this to on my own.
I have 2 DataFrames described as below.
A
| name |
| ---- |
|foo |
|bar |
|hello |
|world |
B
|keyword|frequency|
| ----- |---------|
|hello | 3 |
|foo | 10 |
And I wanna add a new column on A
dataframe like this
A
| name |frequency|
| ---- |---------|
|foo |10 |
|bar |0 |
|hello |3 |
|world |0 |
As you see, if a value of A
dataframe's name
column is in B
dataframe's keyword
column, add frequency from the same row.
If it doesn't exist, just add 0
.
I can achieve this with for-loop, but this is not looks good and also slow.
Is there any elegant and efficient way to do this ? (I hope to utilize vectorization of pands, if possible)
Thanks.
CodePudding user response:
If need add one or multiple columns use DataFrame.merge
with rename
and DataFrame.fillna
df = df1.merge(df2.rename(columns={'keyword': 'name'}), how='left').fillna(0)
print (df)
name frequency
0 foo 10.0
1 bar 0.0
2 hello 3.0
3 world 0.0
Or if need add only one column use Series.map
with Series.fillna
:
df1['frequency'] = (df1['name'].map(df2.set_index('keyword')['frequency'])
.fillna(0, downcast='int'))
print (df1)
name frequency
0 foo 10
1 bar 0
2 hello 3
3 world 0