How to get a value from Pandas DataFrame column with dictionary

Time:10-13

I have a pandas DataFrame:

import pandas as pd
lead_comments = [{'lead_id':1, 'note_id': 1, 'params': {'text': 'first_comment'}}, {'lead_id':1, 'note_id': 2, 'params': {'text': 'second_comment'}}, {'lead_id':1, 'note_id': 3, 'params': {'text': 'third_comment'}}]
df = pd.DataFrame(lead_comments)
print df

Output:

   lead_id  note_id   params
0  1        1         {'text': 'first_comment'}
1  1        2         {'text': 'second_comment'}
2  1        3         {'text': 'third_comment'}

I'm trying to extract the 'text' value from df['params']. The result I want looks like:

   lead_id  note_id   params
0  1        1         first_comment
1  1        2         second_comment
2  1        3         third_comment

I've tried to use DataFrame.apply() like this:

def get_comment_text(x):
    for value in x["params"]["text"]:
        return value

df['comment_text'] = df.apply(get_comment_text, axis=1)

but it gave me the wrong output:

   lead_id  note_id   params
0  1        1         f
1  1        2         s
2  1        3         t

Where did I go wrong, and how do I get the right values? Thanks!

CodePudding user response:

Use assign together with the str accessor's indexing (__getitem__):

>>> df.assign(params=df['params'].str['text'])
   lead_id  note_id          params
0        1        1   first_comment
1        1        2  second_comment
2        1        3   third_comment

It looks like you are still using Python 2 (print df is a Python 2 statement). Switch to Python 3!

If the dictionaries in your DataFrame are stored as strings, parse them before running the code above:

import ast
df["params"] = df["params"].apply(ast.literal_eval)
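A minimal sketch of that string case, assuming each cell holds a Python-literal string such as "{'text': 'first_comment'}". The sample frame below is made up for illustration and mirrors the question's data:

```python
import ast

import pandas as pd

# Hypothetical data: the dictionaries arrive as strings, e.g. after a CSV round trip.
df = pd.DataFrame({
    "lead_id": [1, 1, 1],
    "note_id": [1, 2, 3],
    "params": [
        "{'text': 'first_comment'}",
        "{'text': 'second_comment'}",
        "{'text': 'third_comment'}",
    ],
})

# Parse each string into a real dict, then pull out the 'text' value.
df["params"] = df["params"].apply(ast.literal_eval)
df = df.assign(params=df["params"].str["text"])

print(df)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for this kind of parsing.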

CodePudding user response:

Use Series.str.get:

# If the dictionaries are strings, parse them first:
# import ast
# df["params"] = df["params"].apply(ast.literal_eval)

df['comment_text'] = df["params"].str.get("text")
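One thing worth knowing about this approach is how it treats a dictionary that lacks the key. In the sketch below, the second row (without 'text') is a hypothetical addition to the question's data. As far as I can tell, that row comes back as a missing value rather than raising a KeyError:

```python
import pandas as pd

# Hypothetical data: the second dict has no 'text' key.
df = pd.DataFrame({
    "note_id": [1, 2],
    "params": [{"text": "first_comment"}, {"other": "no text here"}],
})

# .str.get looks up the key in each dict; rows without it yield a missing value.
df["comment_text"] = df["params"].str.get("text")

# Flag the rows where no text could be extracted.
print(df["comment_text"].isna().tolist())
```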

CodePudding user response:

You can use apply with a lambda (this approach can be slower on large frames):

>>> df['params'].apply(lambda x: x.get('text'))
0     first_comment
1    second_comment
2     third_comment
Name: params, dtype: object
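As for where the original attempt went wrong: x["params"]["text"] is already the string, so the for loop iterates over its characters and return exits on the first one. That is why only 'f', 's' and 't' came back. A minimal corrected sketch, reusing the question's data:

```python
import pandas as pd

lead_comments = [
    {"lead_id": 1, "note_id": 1, "params": {"text": "first_comment"}},
    {"lead_id": 1, "note_id": 2, "params": {"text": "second_comment"}},
    {"lead_id": 1, "note_id": 3, "params": {"text": "third_comment"}},
]
df = pd.DataFrame(lead_comments)


def get_comment_text(row):
    # Return the whole string; looping over it would yield single characters.
    return row["params"]["text"]


df["comment_text"] = df.apply(get_comment_text, axis=1)

print(df["comment_text"].tolist())
```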

CodePudding user response:

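You can also expand the dictionaries with pd.Series and assign the result back (this relies on each dictionary having only the 'text' key):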

df['params'] = df['params'].apply(pd.Series)

Output:

   lead_id  note_id          params
0        1        1   first_comment
1        1        2  second_comment
2        1        3   third_comment
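When the dictionaries carry more than one key, the same expansion produces one column per key, and those columns can be joined back onto the frame. A sketch under that assumption; the 'author' key and its values are hypothetical and not part of the question's data:

```python
import pandas as pd

# Hypothetical data: each dict has an extra 'author' key.
df = pd.DataFrame({
    "lead_id": [1, 1],
    "note_id": [1, 2],
    "params": [
        {"text": "first_comment", "author": "alice"},
        {"text": "second_comment", "author": "bob"},
    ],
})

# Expand each dict into its own columns (one per key).
expanded = df["params"].apply(pd.Series)

# Replace the dict column with the expanded columns.
result = df.drop(columns="params").join(expanded)

print(result)
```

Building a Series per row is convenient but does extra work for every row, so for a single key the str accessor shown in the other answers is likely the lighter option.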