Parsing a Pandas column in JSON format

Time:09-10

I am parsing a Pandas column of strings, where each string is a JSON document such as the following:

kafka_data["MESSAGE_DATA__C"].iloc[0]
Out[20]: '{"userId":"af33f42e","trackingCategory":"ACTION","trackedItem":{"id":"PERSONAL_IDENTIFICATION_STARTED","category":"PERSONAL_IDENTIFICATION","title":"Personal Identification Started"}}'

When I parse a single row, everything works:

json.loads(kafka_data["MESSAGE_DATA__C"].iloc[0])
Out[25]: 
{'userId': 'af33f42e',
 'trackingCategory': 'ACTION',
 'trackedItem': {'id': 'PERSONAL_IDENTIFICATION_STARTED',
  'category': 'PERSONAL_IDENTIFICATION',
  'title': 'Personal Identification Started'}}

But when I try to parse the whole column at once, an error is raised:

json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 212 (char 211)

Am I missing anything? I need to read this column into a new dataframe.

CodePudding user response:

To apply a function to each row of a DataFrame, pass the axis=1 parameter to apply.

Please try:

kafka_data[["MESSAGE_DATA__C"]].apply(lambda row: json.loads(str(row["MESSAGE_DATA__C"])), axis=1)
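The error message points at character 211 of a string, which suggests that at least one row holds malformed JSON even though the first row parses fine. The following is a minimal sketch, not a definitive fix, of how one might locate such rows and still load the valid ones into a new dataframe. The sample kafka_data, the try_parse helper, and the names bad_rows and messages are all hypothetical and only serve the illustration.

```python
import json

import pandas as pd

# Hypothetical stand-in for kafka_data; the second row is deliberately malformed.
kafka_data = pd.DataFrame({
    "MESSAGE_DATA__C": [
        '{"userId":"af33f42e","trackingCategory":"ACTION",'
        '"trackedItem":{"id":"PERSONAL_IDENTIFICATION_STARTED",'
        '"category":"PERSONAL_IDENTIFICATION",'
        '"title":"Personal Identification Started"}}',
        '{"userId":"broken" "trackingCategory":"ACTION"}',
    ]
})


def try_parse(text):
    """Return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError subclass
        return None


# Parse every row; rows that fail to parse become None.
parsed = kafka_data["MESSAGE_DATA__C"].apply(try_parse)

# Rows whose JSON could not be parsed, kept for inspection.
bad_rows = kafka_data[parsed.isna()]

# Flatten the successfully parsed rows into a new dataframe,
# preserving the original row labels.
good = parsed.dropna()
messages = pd.json_normalize(good.tolist())
messages.index = good.index
```

Nested keys end up as dotted column names such as trackedItem.id, and bad_rows shows which messages need to be repaired or dropped before the whole column can be parsed.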