I am parsing a pandas column of strings that contain JSON, such as the following:
kafka_data["MESSAGE_DATA__C"].iloc[0]
Out[20]: '{"userId":"af33f42e","trackingCategory":"ACTION","trackedItem":{"id":"PERSONAL_IDENTIFICATION_STARTED","category":"PERSONAL_IDENTIFICATION","title":"Personal Identification Started"}}'
When I parse a single row, everything works:
json.loads(kafka_data["MESSAGE_DATA__C"].iloc[0])
Out[25]:
{'userId': 'af33f42e',
'trackingCategory': 'ACTION',
'trackedItem': {'id': 'PERSONAL_IDENTIFICATION_STARTED',
'category': 'PERSONAL_IDENTIFICATION',
'title': 'Personal Identification Started'}}
But when I try to parse the whole column at once, an error is raised:
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 212 (char 211)
Am I missing anything? I need to read this column into a new dataframe.
CodePudding user response:
To apply a function to each row of a DataFrame, pass the axis=1 parameter to apply.
Please try:
kafka_data[["MESSAGE_DATA__C"]].apply(lambda row: json.loads(str(row["MESSAGE_DATA__C"])), axis=1)
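If the error persists, note that the sample string shown in the question is shorter than 212 characters, so the "Expecting ',' delimiter" message may be coming from a different row that is not valid JSON. Below is a minimal sketch, under that assumption, of how to find such rows and expand the valid ones into a new dataframe. The small kafka_data frame, its malformed second row, and the safe_loads helper are hypothetical stand-ins for illustration, not part of the original data.

```python
import json

import pandas as pd

# Hypothetical stand-in for kafka_data; the second row is deliberately
# malformed (missing comma) to reproduce an "Expecting ',' delimiter" error.
kafka_data = pd.DataFrame({
    "MESSAGE_DATA__C": [
        '{"userId":"af33f42e","trackingCategory":"ACTION",'
        '"trackedItem":{"id":"PERSONAL_IDENTIFICATION_STARTED",'
        '"category":"PERSONAL_IDENTIFICATION",'
        '"title":"Personal Identification Started"}}',
        '{"userId":"b1" "trackingCategory":"ACTION"}',
    ]
})


def safe_loads(text):
    """Return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except (TypeError, json.JSONDecodeError):
        # TypeError covers non-string values such as NaN.
        return None


# Series.apply calls the function once per value, so no axis argument is needed.
parsed = kafka_data["MESSAGE_DATA__C"].apply(safe_loads)

# Rows that failed to parse, kept for inspection.
bad_rows = kafka_data[parsed.isna()]

# Flatten the valid dicts into columns; nested keys become "trackedItem.id" etc.
good = parsed.dropna()
flat = pd.json_normalize(good.tolist())
flat.index = good.index  # keep the original row labels

print(bad_rows)
print(flat.columns.tolist())
```

Inspecting bad_rows should show which messages need to be fixed or dropped before parsing the full column.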