Home > Software design >  Encoding in Python such that numbering starts with 1
Encoding in Python such that numbering starts with 1

Time:01-13

I have a dataframe, wherein the column 'team' needs to be encoded.

These are my codes:

#Load the required libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder


#Create dictionary
data = {'team': ['A', 'A', 'B', 'B', 'C'],
        'Income': [5849, 4583, 3000, 2583, 6000],
        'Coapplicant Income': [0, 1508, 0, 2358, 0],
        'LoanAmount': [123, 128, 66, 120, 141]}

#Convert dictionary to dataframe
df = pd.DataFrame(data)
print("\n df",df)


# Initiate label encoder
le = LabelEncoder()
 
# return encoded label
label = le.fit_transform(df['team'])
 
# printing label
print("\n label =",label )


# removing the column 'team' from df
df.drop("team", axis=1, inplace=True)

 
# Appending the array to our dataFrame
df["team"] = label
 
# printing Dataframe
print("\n df",df)

I am getting the below result after encoding:

enter image description here

However, I wish to ensure following two things:

  1. Encoding starts with 1 and not 0
  2. The location of column 'team' should remain the same as original i.e. I wish to have following result:

enter image description here

Can somebody please help me out how to do this ?

CodePudding user response:

Do not drop the column and increment the label on assignment:

le = LabelEncoder()
 
# return encoded label
label = le.fit_transform(df['team'])
 
# Replacing the column
df["team"] = label   1

Output:

df team Income Coapplicant Income LoanAmount
0 1 5849 0 123
1 1 4583 1508 128
2 2 3000 0 66
3 2 2583 2358 120
4 3 6000 0 141
  • Related