How do I change value names to numbers in dataframe?

Time:07-31

I have a dataframe which has 10k movie names and 40k actor names.

I'm trying to build a graph with nx, but the plot becomes unreadable because of the actor names, so I want to replace the names with numbers. Some of these actors played in multiple movies, which means they appear more than once. I want to map every actor to a number, like 'Leslie Howard' = '1' and so on. I tried some loops and lists, but I failed. I also want a dictionary so I can check which number corresponds to which actor. Can you help me?

CodePudding user response:

You could get all unique names in the column, build a dictionary from them, and then use map to replace the names with numbers. You also keep the dictionary, so you can check which actor a number refers to.

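# Unique actor names, in order of first appearance
all_names = df['Actor_Name'].unique()
# Map each name to its position: {name: number}
dic = {name: i for i, name in enumerate(all_names)}

# Replace every name with its number
df['Actor_Name'] = df['Actor_Name'].map(dic)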
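
A minimal self-contained sketch of this approach, in case it helps. The sample rows, the Movie_Name column, and the new Actor_Id column are made up for illustration, and numbering from 1 (via start=1) is just one choice to match the 'Leslie Howard' = '1' example in the question. Inverting the dictionary gives the number-to-actor lookup:

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({
    "Movie_Name": ["Movie A", "Movie A", "Movie B", "Movie C"],
    "Actor_Name": ["Leslie Howard", "Actor X", "Leslie Howard", "Actor Y"],
})

# Number the actors in order of first appearance, starting at 1.
all_names = df["Actor_Name"].unique()
name_to_id = {name: i for i, name in enumerate(all_names, start=1)}

# Reverse lookup: number -> actor name.
id_to_name = {i: name for name, i in name_to_id.items()}

# Write the numbers to a new column so the original names are kept.
df["Actor_Id"] = df["Actor_Name"].map(name_to_id)
```

Writing to a separate column is optional; overwriting Actor_Name as in the answer works the same way.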

CodePudding user response:

You can just use factorize. It returns a tuple of (codes, uniques), and [0] keeps only the integer codes:

df['Movie_name'] = df['Movie_name'].factorize()[0]
df['Actor_name'] = df['Actor_name'].factorize()[0]
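
If the number-to-actor dictionary is also needed, one possible sketch is to keep the second element that factorize returns instead of discarding it. The sample names and the Actor_id column below are hypothetical:

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({
    "Actor_name": ["Leslie Howard", "Actor X", "Leslie Howard", "Actor Y"],
})

# codes: one integer per row; uniques: the distinct names in order of first appearance.
codes, uniques = pd.factorize(df["Actor_name"])
df["Actor_id"] = codes

# Position in uniques is the code, so enumerate gives number -> actor name.
id_to_name = dict(enumerate(uniques))
```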

CodePudding user response:

Convert the column to the category dtype and get the integer code of each value with .cat.codes:

df['Actor_Name'] = df['Actor_Name'].astype('category').cat.codes
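
A small sketch of the category approach that also recovers the lookup dictionary, assuming made-up sample names. Note that the categories of a string column are sorted, so the codes follow alphabetical order rather than order of first appearance:

```python
import pandas as pd

# Hypothetical sample data standing in for the real column.
actors = pd.Series(
    ["Leslie Howard", "Actor X", "Leslie Howard", "Actor Y"],
    name="Actor_Name",
)

as_category = actors.astype("category")

# One integer code per row; the code is the position of the name in cat.categories.
codes = as_category.cat.codes

# Number -> actor name, built from the sorted categories.
id_to_name = dict(enumerate(as_category.cat.categories))
```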