I have dataframe a:
TagID Genre
0 0 rock
1 1 pop
2 2 favorites
3 3 alternative
4 4 love
and dataframe b:
Tags
0 154
1 20 35 40 65
I want the result like this:
Genre
0 wjlb-fm
1 chill, rnb, loved, hip hop
CodePudding user response:
Explode your Tags
column before joining to first dataframe:
df2['Genre'] = (df2['Tags'].str.split().explode().astype(df1['TagID'].dtype)
.map(df1.set_index('TagID')['Genre'])
.groupby(level=0).agg(', '.join))
print(df2)
# Output
Tags Genre
0 3 alternative
1 1 4 2 pop, love, favorites
Step by step:
# 1. Explode your column
>>> out = df2['Tags'].str.split().explode().astype(df1['TagID'].dtype)
0 3
1 1
1 4
1 2
Name: Tags, dtype: int64
# 2. Match genre by tag id
>>> out = out.map(df1.set_index('TagID')['Genre'])
0 alternative
1 pop
1 love
1 favorites
Name: Tags, dtype: object
# 3. Reshape your dataframe
>>> out = out.groupby(level=0).agg(', '.join)
0 alternative
1 pop, love, favorites
Name: Tags, dtype: object
CodePudding user response:
You can create dictionary by TagID
and Genre
and then in generator split values by space with mapping numbers converted to dict:
print (df2)
Tags
0 1
1 2 3 4 5
d = df1.set_index('TagID')['Genre'].to_dict()
f = lambda x: ' '.join(d[int(y)] for y in x.split() if int(y) in d)
df2['Genre'] = df2['Tags'].apply(f)
print (df2)
Tags Genre
0 1 pop
1 2 3 4 5 favorites alternative love