Replace string values in pandas rows with numbers with the help of a for loop

I have 10 unique values in one column of my dataframe. For example:

df['categories'].unique()

Output:

Electronic
Computers
Mobile Phone
Router
Food

I want to replace 'Electronic' with 1, 'Computers' with 2, 'Mobile Phone' with 3, 'Router' with 4, and 'Food' with 5, so that the same call returns numbers:

df['categories'].unique()

Expected output:

1
2
3
4
5

I tried looping over df['categories'].unique(), but I couldn't get it to work. Can anyone help me with this?

CodePudding user response:

Build a dictionary of replacements and pass it to replace, keyed by the column name:

new_vals = {'Electronic': 1, 'Computers' : 2, 'Mobile Phone' : 3, 'Router' : 4 , 'Food' : 5}
df = df.replace({'categories': new_vals})
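Since the question asks about a loop, here is a minimal sketch that builds the same dictionary with a for loop over unique() and applies it with Series.map. The sample DataFrame is made up for illustration. The numbering follows the order in which values first appear in the column, so it matches the list in the question only if the data happens to appear in that order.

```python
import pandas as pd

# Hypothetical sample data; the real df comes from the question.
df = pd.DataFrame({'categories': ['Electronic', 'Computers', 'Mobile Phone',
                                  'Router', 'Food', 'Computers', 'Food']})

# Build {value: number}, numbering from 1.
# unique() returns the values in order of first appearance.
new_vals = {}
for number, value in enumerate(df['categories'].unique(), start=1):
    new_vals[value] = number

# map() looks each value up in the dict; a value missing from the dict would become NaN.
df['categories'] = df['categories'].map(new_vals)
```

If the numbers must follow a fixed order regardless of the data, writing the dictionary by hand as in the answer above is the safer choice.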

CodePudding user response:

You could try this:

df['categories'] = df['categories'].astype('category').cat.codes
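One caveat, sketched below with made-up data: cat.codes starts at 0 and, by default, orders the categories alphabetically, so it will not reproduce the 1 to 5 numbering from the question. Passing the categories explicitly to pd.Categorical and adding 1 is one way to get that numbering. The order list is copied from the question; the sample rows are an assumption.

```python
import pandas as pd

# Hypothetical sample data, deliberately not in the desired order.
df = pd.DataFrame({'categories': ['Food', 'Electronic', 'Router',
                                  'Computers', 'Mobile Phone']})

# Desired numbering from the question: position in this list + 1.
order = ['Electronic', 'Computers', 'Mobile Phone', 'Router', 'Food']

# Default behaviour: 0-based codes over alphabetically sorted categories.
default_codes = df['categories'].astype('category').cat.codes

# Explicit category order, shifted to start at 1.
# A value not listed in `order` gets code -1, i.e. 0 after the shift.
cat = pd.Categorical(df['categories'], categories=order)
df['categories'] = cat.codes + 1
```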

CodePudding user response:

scikit-learn provides similar functionality.

This approach works well when you are building a predictive model and the specific code assigned to each category does not matter. For example, you do not care whether the "Computers" category gets a code of 1, 2, or 5.

from sklearn.preprocessing import OrdinalEncoder

enc = OrdinalEncoder()
df['categories'] = enc.fit_transform(X=df[['categories']]).astype('int')