Replacing values to convert a dataset column to integer

Time:11-25

   House Number  Street        First Name  Surname   Age  Relationship to Head of House  Marital Status  Gender  Occupation                         Infirmity  Religion
0  1             Smith Radial  Grace       Patel     46   Head                           Widowed         Female  Petroleum engineer                 None       Catholic
1  1             Smith Radial  Ian         Nixon     24   Lodger                         Single          Male    Publishing rights manager          None       Christian
2  2             Smith Radial  Frederick   Read      87   Head                           Divorced        Male    Retired TEFL teacher               None       Catholic
3  3             Smith Radial  Daniel      Adams     58   Head                           Divorced        Male    Therapist, music                   None       Catholic
4  3             Smith Radial  Matthew     Hall      13   Grandson                       NaN             Male    Student                            None       NaN
5  3             Smith Radial  Steven      Fletcher  9    Grandson                       NaN             Male    Student                            None       NaN
6  4             Smith Radial  Alison      Jenkins   38   Head                           Single          Female  Physiotherapist                    None       Catholic
7  4             Smith Radial  Kelly       Jenkins   12   Daughter                       NaN             Female  Student                            None       NaN
8  5             Smith Radial  Kim         Browne    69   Head                           Married         Female  Retired Estate manager/land agent  None       Christian
9  5             Smith Radial  Oliver      Browne    69   Husband                        Married         Male    Retired Merchandiser, retail       None       None

Hello,

I have the dataset shown above. When I tried to convert the Age column to int, I got this error: ValueError: invalid literal for int() with base 10: '43.54302670766108'

This means some Age values are stored as strings that contain a decimal number. I tried replacing '.' with '0' and then converting, but it failed. Could you help me with this?

df['Age'] = df['Age'].replace('.','0')
df['Age'] = df['Age'].astype('int')

I still got the same error. I think the replace line is not working. Do you know why?

Thanks

CodePudding user response:

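Try stripping everything from the decimal point onward before converting:

df['Age'] = df['Age'].replace(r'\..*$', '', regex=True).astype(int)

Or, more drastic, replace every value that is empty or contains a non-digit character with '0':

df['Age'] = df['Age'].replace(r'^(?:.*\D.*)?$', '0', regex=True).astype(int)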
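As for why the original replace line had no effect: without regex=True, Series.replace compares whole cell values, so '.' would only match a cell that is exactly '.'. Here is a minimal sketch of the difference, assuming the Age column holds strings; the `ages` Series and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample mimicking an Age column stored as strings
ages = pd.Series(["46", "24", "43.54302670766108"])

# Without regex=True, replace matches whole values only, so nothing changes
unchanged = ages.replace(".", "0")

# With regex=True, the pattern matches inside each string and
# everything from the decimal point onward is removed
truncated = ages.replace(r"\..*$", "", regex=True).astype(int)

print(unchanged.tolist())
print(truncated.tolist())
```

If substring replacement on strings is what you want, the .str.replace accessor is the other common option.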

CodePudding user response:

You do not need to manipulate the strings; you can convert the values to float first and then to int, which drops the fractional part:

df["Age"] = df["Age"].astype('float').astype('int')
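Note that the float-then-int route raises an error if the column contains missing or non-numeric entries. The excerpt in the question does not show any in Age, so this is only a sketch for that assumed case: parse with pd.to_numeric, truncate, and store the result in the nullable Int64 dtype. The sample DataFrame below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: string ages, one decimal value, one missing value
df = pd.DataFrame({"Age": ["46", "43.54302670766108", None, "9"]})

# Parse to float; entries that cannot be parsed become NaN instead of raising
age_float = pd.to_numeric(df["Age"], errors="coerce")

# Drop the fractional part, then use the nullable Int64 dtype
# so that missing values are kept as <NA> rather than causing an error
df["Age"] = np.trunc(age_float).astype("Int64")

print(df["Age"])
```

Whether missing ages should stay missing or be filled (for example with 0) depends on the analysis, so that choice is left open here.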