Python: problem converting negative numbers to floats, issues with hyphen encoding

Time:10-27

I have a Pandas dataframe that I've read from a file - pd.read_csv() - and I'm having trouble converting a column with string values to float.

Firstly, I'm not entirely sure why pandas is even reading the column as strings to begin with, since all the values look numeric. The problem seems to be with the hyphen-minus sign on the negative numbers. There are other threads on this topic that mention how an en dash or em dash in place of the minus sign can mess things up.

However, when I try converting the hyphen type, it still gives me an error. For example,

df['Verified_m'] = df['Verified_m'].str.replace("\U00002013", "-").astype(float)

doesn't change anything; all the values start with the '-' hyphen, so it's not actually replacing anything. It still gives me the error:

ValueError: could not convert string to float: '-'

I've tried replacing all of the hyphens with a numeric value to see if that would work, and then I'm able to convert to float (example: df['Verified_m'] = df['Verified_m'].str.replace("-", "0").astype(float)). But I'd like to retain the negative values in the dataset. Does anyone know what's wrong with my hyphens?

User response:

Try this:

df['Verified_m'] = df['Verified_m'].str.replace("\U00002013", "-").str.replace(r'^-$', '0', regex=True).astype(float)

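This first converts any en dashes (U+2013) to plain hyphens, then replaces any value that consists of a lone - with 0. The error message shows that the string that failed was just '-', with no digits after it, so the column most likely contains bare hyphens used as placeholders, rather than a wrongly encoded minus sign. That would also explain why pandas read the column as strings.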
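If you would rather not turn those placeholders into real zeros, here is a minimal sketch of an alternative that marks them as missing instead. The sample values are made up for illustration, and it assumes a lone - means "no value" in your file, which you would need to confirm. It also prints the values that fail to parse, so you can see exactly what is blocking the conversion.

```python
import pandas as pd

# Hypothetical sample data standing in for the real 'Verified_m' column.
df = pd.DataFrame({"Verified_m": ["-1.5", "\u20132.25", "-", "3"]})

# Normalise en dashes (U+2013) to the ASCII hyphen-minus.
cleaned = df["Verified_m"].str.replace("\u2013", "-", regex=False)

# Diagnostic: list the values that still cannot be parsed as numbers.
bad = cleaned[pd.to_numeric(cleaned, errors="coerce").isna()]
print(list(bad))

# Convert, turning unparseable values (the lone '-') into NaN instead of 0.
df["Verified_m"] = pd.to_numeric(cleaned, errors="coerce")
print(df)
```

Negative values are kept, and the bare hyphens become NaN, which you can fill or drop later. If the file has no en dashes at all, passing na_values=["-"] to pd.read_csv() may be enough to get the column read as float directly.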
