I have a pandas DataFrame:
import pandas as pd
lead_comments = [{'lead_id':1, 'note_id': 1, 'params': {'text': 'first_comment'}}, {'lead_id':1, 'note_id': 2, 'params': {'text': 'second_comment'}}, {'lead_id':1, 'note_id': 3, 'params': {'text': 'third_comment'}}]
df = pd.DataFrame(lead_comments)
print df
Output:
lead_id note_id params
0 1 1 {'text': 'first_comment'}
1 1 2 {'text': 'second_comment'}
2 1 3 {'text': 'third_comment'}
I'm trying to extract the 'text' value from df['params']. The result I want looks like:
lead_id note_id params
0 1 1 first_comment
1 1 2 second_comment
2 1 3 third_comment
I tried using DataFrame.apply(), like this:
def get_comment_text(x):
    for value in x["params"]["text"]:
        return value
df['comment_text'] = df.apply(get_comment_text, axis=1)
but it gave me the wrong output:
lead_id note_id params
0 1 1 f
1 1 3 s
2 1 3 t
Where did I go wrong, and how do I get the right values? Thanks!
CodePudding user response:
Just use assign and the str accessor's __getitem__:
>>> df.assign(params=df['params'].str['text'])
lead_id note_id params
0 1 1 first_comment
1 1 2 second_comment
2 1 3 third_comment
>>>
It looks like you are still using Python 2... Switch to Python 3!
If the dictionaries in your DataFrame are stored as strings, run this first:
import ast
df["params"] = df["params"].apply(ast.literal_eval)
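As a rough sketch of that conversion step, assuming the params column holds string representations of dictionaries (the sample data and the names raw and result are made up for illustration):

```python
import ast

import pandas as pd

# Hypothetical data: 'params' holds dict literals stored as plain strings.
raw = pd.DataFrame({
    "lead_id": [1, 1],
    "note_id": [1, 2],
    "params": ["{'text': 'first_comment'}", "{'text': 'second_comment'}"],
})

# Parse each string into a real dict; literal_eval only accepts Python literals.
raw["params"] = raw["params"].apply(ast.literal_eval)

# With real dicts in place, the str accessor can look up the 'text' key.
result = raw.assign(params=raw["params"].str["text"])
print(result)
```

ast.literal_eval is used here rather than eval because it refuses anything that is not a plain Python literal.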
CodePudding user response:
Use Series.str.get:
#if dictionaries are strings
#import ast
#df["params"] = df["params"].apply(ast.literal_eval)
df['comment_text'] = df["params"].str.get("text")
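One thing that may be worth knowing: as far as I can tell, for dictionary values str.get behaves like dict.get, so a row whose dictionary has no 'text' key gives a missing value instead of raising a KeyError. A small sketch with hypothetical data (the 'author' key and the name texts are not from the question):

```python
import pandas as pd

# Hypothetical data: the second dict has no 'text' key.
params = pd.Series([{"text": "first_comment"}, {"author": "someone"}])

# Missing keys come back as a missing value rather than an exception.
texts = params.str.get("text")
print(texts)
```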
CodePudding user response:
You can use apply and a lambda (this approach is slower):
>>> df['params'].apply(lambda x: x.get('text'))
0 first_comment
1 second_comment
2 third_comment
Name: params, dtype: object
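As for where the original get_comment_text went wrong: x["params"]["text"] is already a string, so the for loop walks over its characters and return exits on the very first one, which is why you got 'f', 's' and 't'. A minimal sketch of a corrected version, reusing the question's function name and sample data:

```python
import pandas as pd

lead_comments = [
    {"lead_id": 1, "note_id": 1, "params": {"text": "first_comment"}},
    {"lead_id": 1, "note_id": 2, "params": {"text": "second_comment"}},
    {"lead_id": 1, "note_id": 3, "params": {"text": "third_comment"}},
]
df = pd.DataFrame(lead_comments)


def get_comment_text(row):
    # Return the whole string; looping over it yields one character at a time.
    return row["params"]["text"]


# axis=1 passes each row to the function, as in the original attempt.
df["comment_text"] = df.apply(get_comment_text, axis=1)
print(df["comment_text"].tolist())
```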
CodePudding user response:
You can also use:
df['params'] = df['params'].apply(pd.Series)
Output:
lead_id note_id params
0 1 1 first_comment
1 1 2 second_comment
2 1 3 third_comment