Extract a value from a JSON string stored in a pandas data frame column

Time:12-04

I have a pandas DataFrame with a column named json2 that contains a JSON string returned by an API call:

"{'obj': [{'timestp': '2022-12-03', 'followers': 281475, 'avg_likes_per_post': 7557, 'avg_comments_per_post': 182, 'avg_views_per_post': 57148, 'engagement_rate': 2.6848}, {'timestp': '2022-12-02', 'followers': 281475, 'avg_likes_per_post': 7557, 'avg_comments_per_post': 182, 'avg_views_per_post': 57148, 'engagement_rate': 2.6848}]}"

I want to write a function that iterates over the column and extracts the number of followers when timestp matches a given date:

def get_followers(x):
    if x['obj']['timestp']=='2022-12-03':
        return x['obj']['followers']

df['date'] = df['json2'].apply(get_followers)

I should get 281475 as the value in the date column, but I get an error instead: "list indices must be integers or slices, not str"

What am I doing wrong? Thank you in advance.

CodePudding user response:

The value stored under the key obj is a list of dictionaries. Before you access a key such as timestp, you must first specify the index of the list element.

import ast
df['json2'] = df['json2'].apply(ast.literal_eval)  # if the values are strings, convert them to dictionaries

def get_followers(x):
    if x['obj'][0]['timestp']=='2022-12-03':
        return x['obj'][0]['followers']

df['date'] = df['json2'].apply(get_followers)

You can also use a lambda, which does the same job as the function above:

df['date'] = df['json2'].apply(lambda x: x['obj'][0]['followers'] if x['obj'][0]['timestp']=='2022-12-03' else None)
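The error from the question and the need for ast.literal_eval can be reproduced without pandas. The sketch below is only an illustration: the string raw is a shortened copy of the value in the question, and the variable names are made up for this example. It shows that json.loads rejects the string (JSON requires double quotes, while this string is a Python literal with single quotes), that indexing the list with a string key raises the reported error, and that adding the list index fixes it.

```python
import ast
import json

# Shortened copy of the string from the question (single-quoted, so not valid JSON)
raw = "{'obj': [{'timestp': '2022-12-03', 'followers': 281475}]}"

# json.loads fails here because JSON requires double-quoted keys and strings
try:
    json.loads(raw)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# ast.literal_eval safely evaluates the Python literal into a dict
data = ast.literal_eval(raw)

# data['obj'] is a list, so a string key reproduces the error from the question
try:
    data['obj']['timestp']
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Selecting a list element first makes the lookup work
first_followers = data['obj'][0]['followers']

print(parsed_as_json)
print(error_message)
print(first_followers)
```

If the API actually returns valid JSON with double quotes, json.loads would be the more usual choice for the conversion step.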

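To check every dictionary in the list instead of only the first one:

def get_followers(x):
    # Return the followers of the first entry whose date matches
    for i in x['obj']:
        if i['timestp'] == '2022-12-03':
            return i['followers']
    return None  # no entry matched the date

df['date'] = df['json2'].apply(get_followers)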
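
The loop above can be generalised so that the date is not hard-coded. This is a minimal sketch, not the answer's own code: the target_date parameter, the string check, and the second and third sample cells are assumptions added for illustration. It parses the cell only if it is still a string and returns None when no entry matches or the list is empty.

```python
import ast

def get_followers(cell, target_date):
    """Return 'followers' for the entry whose 'timestp' equals target_date, else None."""
    # Parse only when the cell is still a string; already-parsed dicts pass through
    data = ast.literal_eval(cell) if isinstance(cell, str) else cell
    for entry in data.get('obj', []):
        if entry.get('timestp') == target_date:
            return entry.get('followers')
    return None  # no matching date, or an empty list

# First cell is shortened from the question; the other two are hypothetical
cells = [
    "{'obj': [{'timestp': '2022-12-03', 'followers': 281475}, "
    "{'timestp': '2022-12-02', 'followers': 281475}]}",
    "{'obj': [{'timestp': '2022-12-01', 'followers': 100}]}",
    "{'obj': []}",
]

results = [get_followers(cell, '2022-12-03') for cell in cells]
print(results)
```

With a DataFrame, the same function can be applied as df['json2'].apply(get_followers, target_date='2022-12-03'), since Series.apply forwards extra keyword arguments to the function.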