I'm using the fields parameter of the python-elasticsearch API to retrieve data from Elasticsearch, and I'm trying to parse @timestamp in ISO format for use in a pandas DataFrame.
fields = \
[{
"field": "@timestamp",
"format": "strict_date_optional_time"
}]
By default, Elasticsearch returns the results as arrays, as described in the docs:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-fields.html
The fields response always returns an array of values for each field, even when there is a single value in the _source.
Because of this, the resulting DataFrame contains an object Series of lists that can't be parsed into a datetime Series by the usual methods.
Name: fields.@timestamp, Length: 18707, dtype: object
0 [2021-11-04T01:30:00.263Z]
1 [2021-11-04T01:30:00.385Z]
2 [2021-11-04T01:30:00.406Z]
3 [2021-11-04T01:30:00.996Z]
4 [2021-11-04T01:30:01.001Z]
...
8368 [2021-11-04T02:00:00.846Z]
8369 [2021-11-04T02:00:00.894Z]
8370 [2021-11-04T02:00:00.895Z]
8371 [2021-11-04T02:00:00.984Z]
8372 [2021-11-04T02:00:00.988Z]
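For reference, a Series of this shape can be reproduced without a cluster. The sketch below is only an illustration: the response dict is a hand-written stand-in for a search response (the index name and ids are made up), and it assumes the DataFrame is built with pd.json_normalize, which the question does not state.

```python
import pandas as pd

# Hand-written stand-in for a search response (hypothetical index and ids);
# only the parts relevant to the fields parameter are included.
response = {
    "hits": {
        "hits": [
            {"_index": "logs", "_id": "1",
             "fields": {"@timestamp": ["2021-11-04T01:30:00.263Z"]}},
            {"_index": "logs", "_id": "2",
             "fields": {"@timestamp": ["2021-11-04T01:30:00.385Z"]}},
        ]
    }
}

# json_normalize flattens the nested "fields" dict into dotted column names
# but leaves the list values untouched.
df = pd.json_normalize(response["hits"]["hits"])

print(df["fields.@timestamp"])
print(type(df["fields.@timestamp"].iloc[0]))  # each cell is a Python list
```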
When I try to parse the Series into a datetime Series:
pd.to_datetime(df["fields.@timestamp"])  # df: the DataFrame built from the search hits
That results in:
TypeError: <class 'list'> is not convertible to datetime
My use case requires many datetime formats, and the fields parameter is well suited to querying in multiple formats, but having each datetime string wrapped in a list makes things difficult.
CodePudding user response:
The issue here is that each item of fields.@timestamp is a list, not a string.
So you could do:
fields['timestamp'] = fields['timestamp'].str[0]
to extract the date from each list, and then use pd.to_datetime:
fields['timestamp'] = pd.to_datetime(fields['timestamp'])
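Put together with the column name from the question, the two steps might look like the sketch below. The DataFrame name df and the sample rows are assumptions for illustration (the values are copied from the question's output); .str[0] takes the first element of each list, so it fits when every list holds exactly one value.

```python
import pandas as pd

# Sample data shaped like the question's column (values copied from its output).
df = pd.DataFrame({
    "fields.@timestamp": [
        ["2021-11-04T01:30:00.263Z"],
        ["2021-11-04T01:30:00.385Z"],
        ["2021-11-04T01:30:00.406Z"],
    ]
})

# Step 1: unwrap each single-element list into the string it contains.
unwrapped = df["fields.@timestamp"].str[0]

# Step 2: parse the ISO 8601 strings; the trailing "Z" yields UTC-aware timestamps.
df["fields.@timestamp"] = pd.to_datetime(unwrapped)

print(df.dtypes)
```

If a field can hold several values per document, Series.explode() would give one row per value instead of keeping only the first one.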