I have a pandas DataFrame, df:
A B C D
0 1,598 65.79 79 asdf
1 -300 46.90 90 qwer
How can I apply this: `
dff = dff.astype('str').apply(lambda x: pd.to_numeric(x.str.replace(',', '')))
` only to the columns that contain commas?
Keep in mind that I don't know the column names or their order in advance.
I'm not sure how to approach this.
CodePudding user response:
You can check which columns, whether numeric or object, contain only values that could be converted to numbers:
df.astype('str').apply(lambda x: x.str.contains(r'^[0-9.,-]+$')).all()
output:
A True
B True
C True
D False
dtype: bool
Columns A, B and C contain only digits, signs, dots and commas, so they are candidates for conversion.
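A character-class regex of this kind only verifies which characters appear, so a string such as `1.2.3` would still pass. As a hedged alternative (not part of the original answer), the sketch below lets pandas attempt the conversion itself and treats a column as convertible only if every value parses. The helper name `is_convertible` and the sample frame are assumptions made for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed dtypes: A and D hold strings).
df = pd.DataFrame({
    "A": ["1,598", "-300"],
    "B": [65.79, 46.90],
    "C": [79, 90],
    "D": ["asdf", "qwer"],
})

def is_convertible(col):
    """Return True if every value parses as a number once commas are removed."""
    # Strip thousands separators as plain text, not as a regex.
    cleaned = col.astype(str).str.replace(",", "", regex=False)
    # errors="coerce" turns unparseable values into NaN instead of raising.
    return pd.to_numeric(cleaned, errors="coerce").notna().all()

# DataFrame.apply calls the helper once per column.
convertible = df.apply(is_convertible)
print(convertible)
```

Note that a column with genuinely missing values would be reported as not convertible under this check, since NaN counts as a failed parse here.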
You can then select those columns with boolean indexing:
cond1 = df.astype('str').apply(lambda x: x.str.contains(r'^[0-9.,-]+$')).all()
df.loc[:, cond1]
output:
A B C
0 1,598 65.79 79
1 -300 46.90 90
You can then convert only those columns to numeric:
cols = df.loc[:, cond1].columns
df[cols] = df[cols].astype('str').apply(lambda x: pd.to_numeric(x.str.replace(',', '')))
df
output:
A B C D
0 1598 65.79 79 asdf
1 -300 46.90 90 qwer
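If the goal is literally to touch only the columns that contain a comma, as the question asks, one possible variation is to build the mask from a plain substring test. This is a minimal sketch that assumes every value in a comma-containing column is a number with thousands separators; a text column like "hello, world" would make `pd.to_numeric` raise. The names `has_comma` and `comma_cols` are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed dtypes: A and D hold strings).
df = pd.DataFrame({
    "A": ["1,598", "-300"],
    "B": [65.79, 46.90],
    "C": [79, 90],
    "D": ["asdf", "qwer"],
})

# True for each column where at least one value contains a literal comma.
has_comma = df.astype(str).apply(lambda s: s.str.contains(",", regex=False).any())
comma_cols = has_comma[has_comma].index

# Convert column by column so each one gets its own numeric dtype.
for c in comma_cols:
    df[c] = pd.to_numeric(df[c].astype(str).str.replace(",", "", regex=False))

print(list(comma_cols))
print(df.dtypes)
```

Column names and order are never hard-coded, so this works without knowing them in advance.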
You can check the data types of the result:
df.dtypes
A int64
B float64
C int64
D object
dtype: object
Columns A, B and C have been converted to numeric types, while column D is still object.