I am trying to convert a set of columns to float, but I keep getting a ValueError. I have tried using .astype('float')
and I still end up with the same error. Below is the code I am using right now.
for column in pct_columns:
    df[column] = df[column].apply(lambda X: float(X.replace('%',''))/100)
This is the error I am getting:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [127], in <cell line: 3>()
1 pct_columns = ['R_SIG_STR_pct', 'B_SIG_STR_pct', 'R_TD_pct', 'B_TD_pct']
3 for column in pct_columns:
----> 4 df[column] = df[column].apply(lambda X: float(X.replace('%',''))/100)
File ~/opt/anaconda3/envs/book_env/lib/python3.8/site-packages/pandas/core/series.py:4108, in Series.apply(self, func, convert_dtype, args, **kwds)
4106 else:
4107 values = self.astype(object)._values
-> 4108 mapped = lib.map_infer(values, f, convert=convert_dtype)
4110 if len(mapped) and isinstance(mapped[0], Series):
4111 # GH 25959 use pd.array instead of tolist
4112 # so extension arrays can be used
4113 return self._constructor_expanddim(pd_array(mapped), index=self.index)
File pandas/_libs/lib.pyx:2467, in pandas._libs.lib.map_infer()
Input In [127], in <lambda>(X)
1 pct_columns = ['R_SIG_STR_pct', 'B_SIG_STR_pct', 'R_TD_pct', 'B_TD_pct']
3 for column in pct_columns:
----> 4 df[column] = df[column].apply(lambda X: float(X.replace('%',''))/100)
ValueError: could not convert string to float: '---'
CodePudding user response:
You should focus on this exact problem:
ValueError: could not convert string to float: '---'
Two possible approaches would be:
- Remove rows containing the value '---' before doing the string-to-float conversion.
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', '---', 'foo', 'bar', 'foo', 'bar'],
    'B': ['one', '---', 'two', 'three', 'two', 'two', 'one', 'three', 'one', 'three'],
    'C': ['---', '25.84%', '---', '54.32%', '---', '42.73%', '---', '---', '---', '---']
})
# Keep only the rows where no column equals '---'.
# axis is passed by keyword; the positional form any(1) is rejected by newer pandas versions.
df = df[~df.eq('---').any(axis=1)]
- Replace '---' values with a value that can be converted to float.
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', '---', 'foo', 'bar', 'foo', 'bar'],
    'B': ['one', '---', 'two', 'three', 'two', 'two', 'one', 'three', 'one', 'three'],
    'C': ['---', '25.84%', '---', '54.32%', '---', '42.73%', '---', '---', '---', '---']
})
# Swap every '---' for a percent string that the conversion can parse.
df.replace('---', '0.00%', inplace=True)
Of course, you can also replace it with a float value such as 0.00,
but whether that is appropriate really depends on your needs.
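To tie this back to the loop in the question, here is a minimal sketch of a related option: let pandas turn anything unparseable into NaN during the conversion and decide afterwards whether to drop or fill those rows. The sample data is made up for illustration (only the column names are taken from the question), and pd.to_numeric with errors='coerce' is used here in place of the lambda from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'R_SIG_STR_pct': ['25%', '---', '54%'],
    'R_TD_pct': ['---', '50%', '100%'],
})
pct_columns = ['R_SIG_STR_pct', 'R_TD_pct']

for column in pct_columns:
    # Strip the percent sign as a literal character (not a regex).
    stripped = df[column].str.replace('%', '', regex=False)
    # errors='coerce' turns unparseable strings such as '---' into NaN
    # instead of raising ValueError.
    df[column] = pd.to_numeric(stripped, errors='coerce') / 100
```

After this, df.dropna(subset=pct_columns) would correspond to the first approach above (removing the rows), and df.fillna(0.0) to the second (substituting a value).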