I am trying to analyze Futures data using a data file I downloaded from Ninja Trader. I imported the file in Python using PyCharm IDE.
The text file format is this:
20211031 220000 0000000;4608;4608;4608;1
20211031 220000 0000000;4608;4608;4608;4
20211031 220000 0000000;4608;4608;4608;1
20211031 220000 0000000;4608;4608;4608;1
These are tick data with timestamp, price, price, price, volume.
I imported them using:
data = pd.read_csv(r"C:\__dir__test.txt",sep=';')
(I have deleted the actual dir address)
Assigned column names:
data.columns = ["date","P1","P2","P3","V"]
Then I used the pd.to_datetime unsuccessfully:
data["date"] = pd.to_datetime(data ["date"])
and
data['Dates'] = pd.to_datetime(data['date']).dt.date
data['Time'] = pd.to_datetime(data['date']).dt.time
Then I am getting an error:
File "C:\..dir..venv\lib\site-packages\pandas\core\arrays\datetimes.py", line 2187, in objects_to_datetime64ns
values, tz_parsed = conversion.datetime_to_datetime64(data.ravel("K"))
File "pandas\_libs\tslibs\conversion.pyx", line 359, in pandas._libs.tslibs.conversion.datetime_to_datetime64
TypeError: Unrecognized value type: <class 'str'>
During handling of the above exception, another exception occurred: Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.2.1\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2021.2.1\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents "\n", file, 'exec'), glob, loc)
File "C:/__dir__/Data from NT.py", line 10, in <module>
data["date"] = pd.to_datetime(data ["date"])
File "C:\__dir__\venv\lib\site-packages\pandas\core\tools\datetimes.py", line 883, in to_datetime
cache_array = _maybe_cache(arg, format, cache, convert_listlike)
File "C:\__dir__\venv\lib\site-packages\pandas\core\tools\datetimes.py", line 195, in _maybe_cache
cache_dates = convert_listlike(unique_dates, format)
File "C:\__dir__\venv\lib\site-packages\pandas\core\tools\datetimes.py", line 401, in _convert_listlike_datetimes
result, tz_parsed = objects_to_datetime64ns(
File "C:\__dir__\venv\lib\site-packages\pandas\core\arrays\datetimes.py", line 2193, in objects_to_datetime64ns
raise err
File "C:\__dir__\venv\lib\site-packages\pandas\core\arrays\datetimes.py", line 2175, in objects_to_datetime64ns result, tz_parsed = tslib.array_to_datetime(
File "pandas\_libs\tslib.pyx", line 379, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 611, in pandas._libs.tslib.array_to_datetime
File "pandas\_libs\tslib.pyx", line 749, in pandas._libs.tslib._array_to_datetime_object
File "pandas\_libs\tslib.pyx", line 740, in pandas._libs.tslib._array_to_datetime_object
File "pandas\_libs\tslibs\parsing.pyx", line 257, in pandas._libs.tslibs.parsing.parse_datetime_string
File "C:\__dir__\venv\lib\site-packages\dateutil\parser\_parser.py", line 1368, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "C:\__dir__\venv\lib\site-packages\dateutil\parser\_parser.py", line 643, in parse
raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: 20211031 220000 0000000
The purpose is to split date and time to do analysis in specific time intervals (by slicing the data). I used the same set up in a data set I downloaded from Interactive Brokers using the ib_insync framework and worked fine.
CodePudding user response:
import pandas as pd
df = pd.DataFrame({"date_raw":["20211031 220000 0000000","20211031 220000 0000000", "20211031 220000 0000000", "20211031 220000 0000000"]})
# parse date_raw to datetime field
df["date"] = pd.to_datetime(df["date_raw"],format="%Y%m%d %H%M%S %f")
# get Date and Time fields
df['Date'] = df['date'].dt.date
df['Time'] = df['date'].dt.time
print(df)
Result:
date_raw date Date Time
0 20211031 220000 0000000 2021-10-31 22:00:00 2021-10-31 22:00:00
1 20211031 220000 0000000 2021-10-31 22:00:00 2021-10-31 22:00:00
2 20211031 220000 0000000 2021-10-31 22:00:00 2021-10-31 22:00:00
3 20211031 220000 0000000 2021-10-31 22:00:00 2021-10-31 22:00:00