Python - Pandas CSV file with error when converting mixed data types string to number

Time:02-12

I have a large CSV file with lots of data. The main column I am interested in, 'Result', contains integers, floats, NaN values, and also numbers stored as text. I need to aggregate 'Result', but before I do I want to convert the column to a float data type. The text values have trailing spaces, like the following: ["1.07 ", "8.22 ", "8.6 ", "11.41 ", "7.93 "]

The error I get is...

AttributeError: Can only use .str accessor with string values!

import pandas as pd
import os
import numpy as np

csv_file = 'c:/path/to/file/big.csv'
# ... more lines of code ...

df = pd.read_csv(csv_file, usecols=my_cols, parse_dates=['Date'])
df = df[df['Company ID'].str.contains(my_company)]
print('df of csv created')
# Above code works great. 

# the below 2 tries did not work for me
# df['Result'] = pd.to_numeric(df['Result'].str.replace(' ', ''), errors='ignore')
# df['Result'] = df['Result'].str.strip() # causes an error 

# now let's try np.where...
# the below causes AttributeError: Can only use .str accessor with string values! 
df['Result'] = np.where(df['Result'].dtype == np.str, df['Result'].str.strip(), 
df['Result'])
df['Result'] = pd.to_numeric(df['Result'], downcast="float", errors='raise')

How should I resolve this?

CodePudding user response:

Why don't you try explicitly converting all the values to strings with astype(str) before stripping them?

df['Result'] = df['Result'].astype(str).str.strip()

I sometimes use this when a Series contains NaN or numbers alongside strings, to avoid getting the error message.
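To make the suggestion concrete, here is a minimal sketch of the full cleanup on a small made-up sample (the sample values and the `cleaned` variable are assumptions for illustration, not the asker's data). The `.str` accessor raises that AttributeError when the Series does not hold strings, for instance a column pandas has already parsed as float64, so casting with astype(str) first makes `.str.strip()` safe. Note also that `np.where` evaluates both branches eagerly, so the `.str.strip()` call in the question runs regardless of the condition.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mimicking the mixed 'Result' column from the question:
# integers, floats, NaN, and numbers stored as text with stray spaces.
df = pd.DataFrame({"Result": [1, 2.5, np.nan, "1.07 ", "8.22 ", " 11.41"]})

# Cast every value to a string so the .str accessor is always valid,
# then strip leading and trailing whitespace.
cleaned = df["Result"].astype(str).str.strip()

# Parse the cleaned text as numbers. errors="coerce" turns anything that
# cannot be parsed into NaN instead of raising.
df["Result"] = pd.to_numeric(cleaned, errors="coerce")

print(df["Result"].dtype)
print(df["Result"].sum())
```

Using errors="coerce" is a design choice here: it silently turns bad values into NaN, so if you would rather be told about unparseable entries, errors="raise" (the default) may suit you better.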
