I have a dataframe with two columns, "value1" and "value2", for example.
Both are strings (e.g. '100,50' and '100,10'), but there is a part of my code where I have to check whether the difference between the two is less than or equal to 1.0.
df = df.assign(difference = df['value1'] - df['value2'])
new_df = df [(df.difference <= 1.0)]
There are ways to convert a string to a float, but they are not working for me.
I tried using pandas.astype(float) and locale.atof in a loop, but the terminal says: ValueError: could not convert string to float: '482,43'
With
pds.to_numeric(valor_false_df['VR CMS'], downcast="float")
it says ValueError: Unable to parse string "482,43" at position 0.
How can I do this subtraction? Where am I going wrong?
CodePudding user response:
I believe pandas is treating the comma as an ordinary character rather than as the decimal separator, so the string cannot be parsed as a number. If you replace the comma with a period and then cast to float, it should work (from a similar StackOverflow answer):
df['value1'] = df['value1'].str.replace(',', '.').astype(float)
df['value2'] = df['value2'].str.replace(',', '.').astype(float)
df = df.assign(difference = df['value1'] - df['value2'])
new_df = df [(df.difference <= 1.0)]
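For reference, here is a small self-contained sketch of the same idea that you can run as-is. The helper name comma_decimal_to_float and the value '480,00' are made up for illustration, and it assumes the strings contain no thousands separators (something like '1.234,56' would need extra handling):

```python
import pandas as pd


def comma_decimal_to_float(series: pd.Series) -> pd.Series:
    """Hypothetical helper: turn strings like '100,50' into floats like 100.5."""
    # regex=False makes this a plain literal substitution of ',' with '.'.
    return series.str.replace(',', '.', regex=False).astype(float)


# Sample data: the first row and '482,43' come from the question,
# '480,00' is an invented value used only for this example.
df = pd.DataFrame({
    'value1': ['100,50', '482,43'],
    'value2': ['100,10', '480,00'],
})

# Convert both columns from comma-decimal strings to floats.
df['value1'] = comma_decimal_to_float(df['value1'])
df['value2'] = comma_decimal_to_float(df['value2'])

# Now the subtraction and the filter work on numeric columns.
df = df.assign(difference=df['value1'] - df['value2'])
new_df = df[df['difference'] <= 1.0]

print(new_df)
```

With this data only the first row is kept, since 100.50 - 100.10 is about 0.40 while 482.43 - 480.00 is about 2.43.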