I have a DataFrame with a column of dtype object. When I check for null values, pandas reports that there are none, but one row has ' ' (a single space) for age. I want to convert the column from object to int, but that one value is giving me a hard time.
Here is what I have tried:
df['perpetrator_age'].replace('', '0', regex=True)
df.head()
Which does not replace the value.
df['perpetrator_age'].astype(int)
ValueError: invalid literal for int() with base 10: ' '
After some searching I thought I might try converting it to a float first, but:
df['perpetrator_age'].astype(float).astype(int)
ValueError: could not convert string to float: ''
Any help appreciated!
CodePudding user response:
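As I mentioned in my comment, I think the assignment is missing: replace returns a new Series, so the result has to be assigned back (or inplace=True has to be passed).
Here is a very basic example with an empty string ''.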
import pandas as pd
df = pd.DataFrame({'a':[1,2,3,'',4,5]})
df['a'].replace('',0, inplace=True)
# equivalent: df['a'] = df['a'].replace('', 0)
>>> df
   a
0  1
1  2
2  3
3  0
4  4
5  5
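To make the missing-assignment point concrete for the column in the question, here is a minimal sketch. The sample values are made up for illustration, and dtype=object is only there to mirror the object column described in the question.

```python
import pandas as pd

# Hypothetical stand-in for the real data: ages stored as strings,
# one of them a single space. dtype=object mirrors the question.
df = pd.DataFrame({"perpetrator_age": ["15", "42", " ", "30"]}, dtype=object)

# Without assignment the returned Series is discarded and df is unchanged.
df["perpetrator_age"].replace(" ", "0")
unchanged = df["perpetrator_age"].tolist()

# Assigning the result back updates the column, and the cast then succeeds.
df["perpetrator_age"] = df["perpetrator_age"].replace(" ", "0")
ages = df["perpetrator_age"].astype(int)
print(ages.tolist())
```

Note that the value being replaced here is ' ' (one space), matching the error message in the question, rather than the empty string used in the basic example.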
CodePudding user response:
Try:
df['perpetrator_age'].astype('int')
Don't forget the ' '
CodePudding user response:
Based on this error message: ValueError: invalid literal for int() with base 10: ' '
the issue is that the field contains a single space, not an empty string. This is why the replace is failing. If you run df['perpetrator_age'] = df['perpetrator_age'].replace(' ', '0', regex=True)
this will work (note that the result is assigned back to the column).
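If it is not clear whether the offending cells are empty strings, single spaces, or longer runs of whitespace, one option is to inspect them with repr and then replace anything that is blank after stripping. This is a sketch on made-up sample values, not the asker's data; the pattern ^\s*$ is my own choice and matches strings that consist only of whitespace (including the empty string).

```python
import pandas as pd

# Hypothetical sample: an empty string, one space and three spaces among the ages.
df = pd.DataFrame({"perpetrator_age": ["15", "", " ", "   ", "30"]}, dtype=object)

# Flag cells that are empty once surrounding whitespace is removed,
# and print them with repr so spaces become visible.
blank = df["perpetrator_age"].str.strip() == ""
print(df.loc[blank, "perpetrator_age"].map(repr).tolist())

# Replace whitespace-only cells with "0", then cast to int.
cleaned = df["perpetrator_age"].replace(r"^\s*$", "0", regex=True)
ages = cleaned.astype(int)
print(ages.tolist())
```

Anchoring the pattern with ^ and $ keeps the replacement from touching spaces inside otherwise valid values.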
CodePudding user response:
Well friends, it turns out a simple inplace=True solved all of my problems!
df['perpetrator_age'].replace(' ', 0, regex=False, inplace=True)
df['perpetrator_age'] = df['perpetrator_age'].astype(int)
Thank you everyone for your help :)
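For anyone landing here later: newer pandas versions warn about calling an inplace method on a selected column, so assigning the result back is the safer habit. An alternative that avoids hunting for the exact bad string is pd.to_numeric with errors='coerce', which turns anything unparseable into NaN. The sketch below uses made-up values, and filling with 0 simply follows the choice made in this thread; whether 0 is a sensible age placeholder depends on the analysis.

```python
import pandas as pd

# Hypothetical sample data with one blank age, stored as object like in the question.
df = pd.DataFrame({"perpetrator_age": ["15", "42", " ", "30"]}, dtype=object)

# Unparseable entries such as " " become NaN instead of raising ValueError.
numeric = pd.to_numeric(df["perpetrator_age"], errors="coerce")

# Option 1: fill the gaps with 0 (as in this thread) and cast to a plain int column.
df["perpetrator_age"] = numeric.fillna(0).astype(int)
ages = df["perpetrator_age"]

# Option 2: keep the value missing by using the nullable integer dtype.
nullable = numeric.astype("Int64")

print(ages.tolist())
print(nullable.isna().sum())
```

The nullable Int64 variant keeps the unknown age distinguishable from a real value of 0, at the cost of a non-default dtype.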