I have a CSV dataset with 2 columns that looks like the following:
Date | Open |
---|---|
25/2/21 | 7541.85 |
26/2/21 | 7562.32 |
27/2/21 | 7521.65 |
28/2/21 | 7509.14 |
Data columns (total 2 columns):
# | Column | Non-Null Count | Dtype |
---|---|---|---|
0 | Open | 1280 non-null | object |
1 | Date | 1280 non-null | datetime64[ns] |
dtypes: datetime64[ns](1), object(1)
When I try to pass this to a time series model, I get the following error:
ftse_open = TimeSeries.from_dataframe(ftse_open, time_col='Date', value_cols='Open')
ValueError: could not convert string to float: '7,541.85'
Then I try a different route using the following code:
ftse_open["Open"] = ftse_open["Open"].astype('Int64')
Yielding:
TypeError: object cannot be converted to an IntegerDtype
I have tried several other approaches to resolve this, but I can't find a solution that works.
(There are no NAs in the dataset - I have checked).
Any help is appreciated, thank you.
CodePudding user response:
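Based on the comments, the Open column holds strings that use a comma as the thousands separator (for example '7,541.85'), so remove the commas before converting to float: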
df["Open"] = df["Open"].str.replace(",", "").astype(float)
print(df)
Prints:
      Date     Open
0  25/2/21  7541.85
1  26/2/21  7562.32
2  27/2/21  7521.65
3  28/2/21  7509.14
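If there is a chance that some rows contain something other than a comma-formatted number, a slightly more defensive variant may help locate them. This is only a sketch: the sample frame, the 'n/a' value and the names cleaned, as_float and bad_rows are made up for illustration and are not from the question's data.

```python
import pandas as pd

# Hypothetical sample data; the last value is deliberately unparseable
# to show how problem rows surface.
df = pd.DataFrame(
    {
        "Date": ["25/2/21", "26/2/21", "27/2/21"],
        "Open": ["7,541.85", "7,562.32", "n/a"],
    }
)

# Strip the thousands separator as a literal string, not a regex.
cleaned = df["Open"].str.replace(",", "", regex=False)

# errors="coerce" turns anything that still cannot be parsed into NaN
# instead of raising ValueError.
as_float = pd.to_numeric(cleaned, errors="coerce")

# Rows whose original value could not be converted.
bad_rows = df[as_float.isna()]
print(bad_rows)

df["Open"] = as_float
print(df.dtypes)
```

If bad_rows comes back empty, the plain .astype(float) version is enough.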
df used:
      Date      Open
0  25/2/21  7,541.85
1  26/2/21  7,562.32
2  27/2/21  7,521.65
3  28/2/21  7,509.14
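Alternatively, the separator can be handled when the file is loaded, so the column never ends up as object in the first place. The sketch below assumes the values are quoted in the CSV (otherwise the comma would split them into extra columns); csv_text is a stand-in for the real file, which the question does not show.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file. Values that
# contain a comma must be quoted for the row to parse as two columns.
csv_text = '''Date,Open
25/2/21,"7,541.85"
26/2/21,"7,562.32"
27/2/21,"7,521.65"
28/2/21,"7,509.14"
'''

# thousands="," tells the parser to treat the comma inside numbers
# as a thousands separator while reading.
df = pd.read_csv(io.StringIO(csv_text), thousands=",")

# Parse the day/month/year strings explicitly.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%y")

print(df.dtypes)
```

With a real file, the io.StringIO(csv_text) argument would be replaced by the file path.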