Hello, I have the CSV file below, imported with pandas using data = pd.read_csv("1.csv"):
x1,x2,xb,y
−2,1,1,1
I need to convert the negative number (-2) to an integer with int(), but I get a ValueError:
print(data.iloc[1-1]['x1'])
> -2 # str
print(int(data.iloc[1-1]['x1']))
> ValueError: invalid literal for int() with base 10: '−2'
I don't get the error when I try to convert a positive number:
print(data.iloc[1-1]['x2'])
> 1 # str
print(int(data.iloc[1-1]['x2']))
> 1 # int
CodePudding user response:
The "−" in "−2" is not the ASCII minus sign that int() expects. It looks the same but is a different character.
Your print would work like this:
print(int(data.iloc[1-1]['x1'].replace("−", "-")))
And if you don't want to replace the problematic minus signs one value at a time, you can apply the replacement to the whole column:
data['x1'] = data['x1'].str.replace("−", "-").astype("int")
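If more than one column might contain the same character, the replacement can be applied to every text column before converting. A minimal sketch, assuming the data from the question (read here from an in-memory string instead of 1.csv); fix_minus_signs is a hypothetical helper name, not part of pandas:

```python
import io

import pandas as pd

# Sample data from the question; "\u2212" is the Unicode MINUS SIGN.
csv_text = "x1,x2,xb,y\n\u22122,1,1,1\n"
data = pd.read_csv(io.StringIO(csv_text))


def fix_minus_signs(df):
    """Return a copy of df with U+2212 replaced by "-" in text columns.

    Hypothetical helper for illustration. Text columns that are fully
    numeric after the replacement are converted with pd.to_numeric.
    """
    out = df.copy()
    for col in out.columns:
        # Columns that pandas already parsed as numbers need no cleaning.
        if pd.api.types.is_numeric_dtype(out[col]):
            continue
        cleaned = out[col].astype(str).str.replace("\u2212", "-", regex=False)
        try:
            out[col] = pd.to_numeric(cleaned)
        except (ValueError, TypeError):
            # Genuinely textual column: keep the cleaned text.
            out[col] = cleaned
    return out


clean = fix_minus_signs(data)
print(int(clean.iloc[0]["x1"]))
```

Passing regex=False makes the replacement a literal string substitution, which is all that is needed for a single character.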
CodePudding user response:
The problem is that many Unicode characters look like a minus sign...
The character that you are showing in your question is U+2212 MINUS SIGN. The character that is used for negative numbers is the ASCII U+002D HYPHEN-MINUS. While they print the same, they are different characters. You will have to clean up your data file...
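To check which character a value actually contains, the standard library's unicodedata module can report each character's code point and name. A minimal sketch; the helper names describe and to_int are hypothetical, and the list of lookalike characters is an illustrative assumption, not an exhaustive set:

```python
import unicodedata

# A few characters that are commonly mistaken for the ASCII hyphen-minus.
# This list is an assumption for illustration and is not exhaustive.
LOOKALIKES = {
    "\u2212": "-",  # MINUS SIGN
    "\u2010": "-",  # HYPHEN
    "\u2013": "-",  # EN DASH
    "\ufe63": "-",  # SMALL HYPHEN-MINUS
    "\uff0d": "-",  # FULLWIDTH HYPHEN-MINUS
}
TRANSLATION = str.maketrans(LOOKALIKES)


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


def to_int(text):
    """Convert text to int after mapping lookalike signs to "-"."""
    return int(text.translate(TRANSLATION))


for row in describe("\u22122"):
    print(row)
print(to_int("\u22122"))
```

The same translation table could be applied to the raw text of the file before it is handed to pandas, which fixes the data once instead of at every conversion.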