I have a large CSV file with lots of data. The main column I am interested in, 'Result', contains integers, floats, NaN values, and also numbers stored as text. I need to aggregate 'Result', but first I want to convert the column to a float data type. The text values have trailing spaces, like the following: ["1.07 ", "8.22 ", "8.6 ", "11.41 ", "7.93 "]
The error I get is...
AttributeError: Can only use .str accessor with string values!
import pandas as pd
import os
import numpy as np
csv_file = 'c:/path/to/file/big.csv'
# ... more lines of code ...
df = pd.read_csv(csv_file, usecols=my_cols, parse_dates=['Date'])
df = df[df['Company ID'].str.contains(my_company)]
print('df of csv created')
# Above code works great.
# the below 2 tries did not work for me
# df['Result'] = pd.to_numeric(df['Result'].str.replace(' ', ''), errors='ignore')
# df['Result'] = df['Result'].str.strip() # causes an error
# now let's try np.where...
# the below causes AttributeError: Can only use .str accessor with string values!
df['Result'] = np.where(df['Result'].dtype == np.str, df['Result'].str.strip(),
                        df['Result'])
df['Result'] = pd.to_numeric(df['Result'], downcast="float", errors='raise')
How should I resolve this?
CodePudding user response:
Why don't you try explicitly converting all the values to strings with astype(str) before stripping? Note that strip has to be called through the .str accessor:
df['Result'] = df['Result'].astype(str).str.strip()
I sometimes use this when a Series contains NaN or numbers alongside strings, to avoid getting that error message.
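Putting it together, here is a minimal sketch of the whole conversion, assuming the column holds a mix of ints, floats, NaN, and space-padded numeric strings as described in the question. The sample Series is made up for illustration, and errors="coerce" is my own choice (it turns anything unparseable into NaN); swap it for "raise" if you would rather fail on bad values. As for why the np.where attempt still raised: Python evaluates both arguments before np.where picks between them, so .str.strip() runs no matter what the dtype check returns. Also, np.str was removed in NumPy 1.24, so that comparison fails outright on recent versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for df['Result']:
# ints, floats, NaN, and numeric strings with trailing spaces.
result = pd.Series([1, 2.5, np.nan, "1.07 ", "8.22 "], dtype=object)

# Cast everything to str so the .str accessor is always valid,
# strip the whitespace, then convert back to numbers.
# astype(str) turns NaN into the string 'nan'; with errors="coerce"
# it ends up as NaN again after the conversion.
cleaned = pd.to_numeric(result.astype(str).str.strip(), errors="coerce")

print(cleaned.dtype)
print(cleaned.tolist())
```

If you only want to touch the values that really are strings, something like result.map(lambda v: v.strip() if isinstance(v, str) else v) before pd.to_numeric should also work and leaves the other values untouched.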