How to add a column with categories based on another categorical column?

Time:10-16

I want to assign a music style to each genre, so I created a few lists that define which genres belong to which music style. Now I need to assign the right style to each row. My code below doesn't work as expected: it always assigns 'Techno' to every genre. Why?

Rap = ['Dark Trap', 'Hiphop', 'Rap', 'Underground Rap', 'RnB', 'trap', 'Trap Metal']
Techno = ['dnb', 'hardstyle', 'psytrance', 'techhouse', 'techno', 'trance']
Pop = ['pop']

for r in df['genre']:
    if r in Rap:
        df['genre_cat'] = 'Rap'
    elif r in Techno:
        df['genre_cat'] = 'Techno'
    else:
        df['genre_cat'] = 'Pop'
        
df[['genre', 'genre_cat']]


CodePudding user response:

As written, each iteration assigns 'Rap', 'Techno', or 'Pop' to the entire df['genre_cat'] column, not to a single row. Every pass overwrites the previous one, so after the loop the whole column holds the category of the last value in df['genre'].
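To see the overwrite in isolation, here is a minimal sketch. The three-row frame and the shortened lists are made up for illustration, since the question's real data is not shown.

```python
import pandas as pd

# Hypothetical sample data; the last genre belongs to Techno.
df = pd.DataFrame({'genre': ['Rap', 'pop', 'hardstyle']})
Rap = ['Rap']
Techno = ['hardstyle']

for r in df['genre']:
    if r in Rap:
        df['genre_cat'] = 'Rap'      # scalar is broadcast to every row
    elif r in Techno:
        df['genre_cat'] = 'Techno'   # scalar is broadcast to every row
    else:
        df['genre_cat'] = 'Pop'      # scalar is broadcast to every row

# Only the last iteration survives, so all rows read 'Techno'.
print(df['genre_cat'].tolist())
```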

To fix your loop, assign to one row at a time:

for index, row in df.iterrows():
    r = row['genre']
    if r in Rap:
        df.loc[index, 'genre_cat'] = 'Rap'
    elif r in Techno:
        df.loc[index, 'genre_cat'] = 'Techno'
    else:
        df.loc[index, 'genre_cat'] = 'Pop'

A second method, which I would prefer, uses boolean masks instead of a loop:

df.loc[df['genre'].isin(Rap), 'genre_cat'] = 'Rap'
df.loc[df['genre'].isin(Techno), 'genre_cat'] = 'Techno'
df.loc[df['genre'].isin(Pop), 'genre_cat'] = 'Pop'
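One difference worth noting: with the three isin lines, a genre that appears in none of the lists is left as NaN, whereas the loop's else branch labels it 'Pop'. If the loop's behaviour is what you want, numpy.select with a default is one possible sketch. The sample frame, including the 'Emo' row, is made up for illustration.

```python
import numpy as np
import pandas as pd

Rap = ['Dark Trap', 'Hiphop', 'Rap', 'Underground Rap', 'RnB', 'trap', 'Trap Metal']
Techno = ['dnb', 'hardstyle', 'psytrance', 'techhouse', 'techno', 'trance']

# Hypothetical sample data; 'Emo' is in none of the lists.
df = pd.DataFrame({'genre': ['Hiphop', 'techno', 'pop', 'Emo']})

# Conditions are checked in order; the first match wins.
conditions = [df['genre'].isin(Rap), df['genre'].isin(Techno)]
choices = ['Rap', 'Techno']

# Anything that matches no condition falls back to 'Pop',
# mirroring the else branch of the original loop.
df['genre_cat'] = np.select(conditions, choices, default='Pop')
print(df)
```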

CodePudding user response:

You can use apply with a lambda:

df['genre_cat'] = df['genre'].apply(lambda x: 'Rap' if x in Rap else 'Techno' if x in Techno else 'Pop' if x in Pop else '')
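If the chained conditional gets hard to read as more styles are added, a dictionary lookup with Series.map is a possible alternative. This is only a sketch: the names styles and genre_to_style and the sample frame are hypothetical, and unmatched genres get '' to match the lambda's fallback.

```python
import pandas as pd

# Style -> list of genres, as defined in the question.
styles = {
    'Rap': ['Dark Trap', 'Hiphop', 'Rap', 'Underground Rap', 'RnB', 'trap', 'Trap Metal'],
    'Techno': ['dnb', 'hardstyle', 'psytrance', 'techhouse', 'techno', 'trance'],
    'Pop': ['pop'],
}

# Invert to genre -> style so each genre is a single dictionary lookup.
genre_to_style = {genre: style for style, genres in styles.items() for genre in genres}

# Hypothetical sample data; 'Emo' is in none of the lists.
df = pd.DataFrame({'genre': ['trap', 'dnb', 'pop', 'Emo']})

# map() yields NaN for genres missing from the dictionary; fillna('') replaces those.
df['genre_cat'] = df['genre'].map(genre_to_style).fillna('')
print(df)
```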

CodePudding user response:

The issue with your code is:

df['genre_cat'] = 'Techno'

which sets every value in the column to 'Techno', not just the current row.

Check the last row of the dataframe: it is hardstyle, so at the end of the for loop the whole genre_cat column is set to 'Techno'.

Collecting the results in a list and assigning the list once should solve your problem:

res = []
for r in df['genre']:
    if r in Rap:
        res.append('Rap')
    elif r in Techno:
        res.append('Techno')
    else:
        res.append('Pop')
df['genre_cat'] = res
df[['genre','genre_cat']]
