I have a data-frame:
import pandas as pd
df = pd.DataFrame({"ID": [1,2,3,4,5,6,7], "value": [10,10.00,"123JK",20,"10-11",11.00,12.00]})
ID value
1 10
2 10.00
3 123JK
4 20
5 10-11
6 11.00
7 12.00
I want to convert only the floating-point values to integers, so that:
ID value
1 10
2 10
3 123JK
4 20
5 10-11
6 11
7 12
I tried the following code:
df['ID'] = df['ID'].apply(pd.to_numeric, errors='ignore')
df['ID'].astype(np.int64,errors='ignore')
But it does not convert all the floating-point values to integers.
CodePudding user response:
If you need integers only for whole-number floats such as 10.0, use a custom function:
def f(x):
    try:
        x = pd.to_numeric(x, errors='ignore')
        if int(x) == x:
            return int(x)
        else:
            return x
    except:
        return x
df['value'] = df['value'].apply(f)
print (df)
ID value
0 1 10
1 2 10
2 3 123JK
3 4 20
4 5 10-11
5 6 11
6 7 12
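A variant of the same idea that does not call `pd.to_numeric` inside the function can be sketched as follows (`errors='ignore'` is deprecated in recent pandas releases). This is only a sketch: `to_int_if_whole` is a hypothetical helper name, and it assumes each cell is either a Python number or a string.

```python
def to_int_if_whole(x):
    """Return an int for whole numbers such as 10.0 or "10.00"; else return x unchanged."""
    if isinstance(x, int):
        # Already an integer, nothing to convert.
        return x
    try:
        as_float = float(x)
    except (TypeError, ValueError):
        # Not parseable as a number, e.g. "123JK" or "10-11".
        return x
    if as_float.is_integer():
        return int(as_float)
    # A genuine fraction such as 10.5 is left untouched.
    return x


# Sample values from the question (hypothetical standalone list).
values = [10, 10.00, "123JK", 20, "10-11", 11.00, 12.00]
converted = [to_int_if_whole(v) for v in values]
print(converted)
```

With the DataFrame from the question, this would be applied as `df['value'] = df['value'].apply(to_int_if_whole)`.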
CodePudding user response:
Assuming that the value column is text, you could just do a regex replacement here:
df["value"] = df["value"].str.replace(r'^(\d+)\.\d+$', r'\1', regex=True)
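To see what such a pattern does, here is a small sketch using Python's standard `re` module instead of pandas. It assumes the sample values are all strings (the column in the question holds mixed types), and the pattern shown is digits, a literal dot, then more digits. Note that it drops any decimal part, so `"10.50"` would also become `"10"`.

```python
import re

# Digits, a literal dot, then more digits, anchored to the whole string.
pattern = re.compile(r'^(\d+)\.\d+$')

# Sample values from the question, written as strings (an assumption).
values = ["10", "10.00", "123JK", "20", "10-11", "11.00", "12.00"]

# Keep only the captured integer part; non-matching strings are unchanged.
stripped = [pattern.sub(r'\1', v) for v in values]
print(stripped)
```

The result is still a list of strings; converting the numeric-looking ones to `int` would be a separate step.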
CodePudding user response:
I prefer pd.to_numeric with fillna:
df['value'] = pd.to_numeric(df['value'], errors='coerce').fillna(df['value'])
This converts the column to a numeric dtype and replaces non-numeric data with NaN. I then fill the NaNs with the strings from the original column, which keeps the numeric data numeric.