I have a CSV file in which two columns are in exponential notation.
I want to convert them to float or int, and I am doing this:
import pandas as pd
df = pd.read_csv('clm_data.csv', sep=';')
pd.set_option('display.float_format', '{:.0f}'.format)
df.dtypes
The resulting dtypes:
NLS object
CAMPID object
RESPONSE_TYPE int64
RESPONSE_DATE object
SYSTEM_LK object
The transformation function:
def unfloater(x):
    # Swap the decimal comma for a dot, then parse the string as a float
    value = x.replace(',', '.')
    return float(value)
When I try to use apply on multiple columns like this:
df[["NLS", "RESPONSE_DATE"]] = df[["NLS", "RESPONSE_DATE"]].apply(unfloater)
I get TypeError: cannot convert the series to <class 'float'>
But when I use apply on a single Series, everything works.
What is wrong with apply on multiple columns? Passing axis=1 does not help.
CodePudding user response:
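DataFrame.apply passes a whole column (a Series) to your function instead of a
single value, so float() does not work. With axis=1 it passes a row, which is
also a Series, so the error is the same.
You can replace the comma through the .str accessor and then use .astype(float)
instead, like this: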
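df = pd.DataFrame({"a": ["1,1e1", "2,2e2", "3,3e3", "4,4e4"],
                   "b": ["5,5e5", "6,6e6", "7,7e7", "8,8e8"]})
# Each column arrives as a Series, so use the vectorized .str accessor,
# then assign the converted result back to the same columns
df[["a", "b"]] = df[["a", "b"]].apply(lambda col: col.str.replace(",", ".")).astype(float)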
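If you would rather keep the original unfloater function, one option is to call it once per cell inside each column with Series.map. This is only a minimal sketch: the frame below is a shortened version of the sample data, and the column names a and b are illustrative, not the asker's real columns.

```python
import pandas as pd


def unfloater(x):
    # Works on a single string such as "1,1e1"
    return float(x.replace(",", "."))


df = pd.DataFrame({"a": ["1,1e1", "2,2e2"],
                   "b": ["5,5e5", "6,6e6"]})

# DataFrame.apply hands each column to the lambda as a Series;
# Series.map then calls unfloater on every individual value.
converted = df[["a", "b"]].apply(lambda col: col.map(unfloater))
print(converted.dtypes)
```

The vectorized .str.replace version is usually the faster choice on large frames; the map version is mainly useful when the per-value function does more than a simple replacement.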
CodePudding user response:
You can use one of the following methods. Both expect a dot as the decimal
separator, so replace any decimal commas first:
Method 1: Use astype()
df['NLS'] = df['NLS'].astype(float)
Method 2: Use to_numeric() to convert object to float (with errors='coerce', values that cannot be parsed become NaN)
df['NLS'] = pd.to_numeric(df['NLS'], errors='coerce')
To convert all columns at once (this raises an error if any column holds non-numeric values):
df = df.apply(pd.to_numeric)
df.dtypes
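Since the values in the question use a comma as the decimal separator, the methods above can be combined with a comma replacement and limited to the columns that actually hold numbers. The sketch below is an assumption-laden example: the frame, its values, and the cols list are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical sample: one numeric-looking column with decimal commas
# (including one unparseable value) and one plain text column.
df = pd.DataFrame({"NLS": ["1,1e1", "2,2e2", "bad"],
                   "SYSTEM_LK": ["x", "y", "z"]})

cols = ["NLS"]  # hypothetical list of columns to convert
for c in cols:
    # Literal replacement of the decimal comma with a dot
    cleaned = df[c].str.replace(",", ".", regex=False)
    # errors="coerce" turns values that cannot be parsed into NaN
    df[c] = pd.to_numeric(cleaned, errors="coerce")

print(df.dtypes)
```

Converting only the listed columns leaves text columns such as SYSTEM_LK untouched, which avoids the error that df.apply(pd.to_numeric) would raise on them.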