How do I get answerId into a separate column in a pandas dataframe?
0 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
1 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
2 {'answerText': {'es': 'Sí'}, 'answerId': 'Q2A1', 'freetextAnswer': 'Parancetamol 1g.', 'includeFreeText': True}
3 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
4 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
As a df, it now looks like this:
responses1_answer
0 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
1 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
2 {'answerText': {'es': 'Sí'}, 'answerId': 'Q2A1...
3 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
4 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
I tried json_normalize, but I get the answers (Q2A2 and so on) as columns instead. Any help would be highly appreciated!
The output I want is a dataframe where answerId is in a separate column, like this:
answerId
Q2A2
Q2A2
Q2A1
I also tried:
variables = df[0].keys()
df1 = pd.DataFrame([[getattr(i,j) for j in variables] for i in df], columns = variables)
but I get: AttributeError: 'dict' object has no attribute 'answerText'
CodePudding user response:
Assuming lst is your list of dicts, you can do:
pd.DataFrame(data=[d['answerId'] for d in lst], columns=['answerId'])
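As a minimal sketch of the same idea, here is a self-contained version. The sample data is modelled on the question, and using dict.get instead of d['answerId'] is my own addition so that a dict without the key yields None rather than a KeyError:

```python
import pandas as pd

# Sample data modelled on the question; `lst` is an assumed name for the list of dicts.
lst = [
    {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'},
    {'answerText': {'es': 'Sí'}, 'answerId': 'Q2A1',
     'freetextAnswer': 'Parancetamol 1g.', 'includeFreeText': True},
    {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'},
]

# d.get returns None instead of raising KeyError when a dict lacks 'answerId'.
answers = pd.DataFrame({'answerId': [d.get('answerId') for d in lst]})
print(answers)
```

If every dict is guaranteed to have the key, the plain d['answerId'] lookup above is just as good.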
CodePudding user response:
Your comment:
it is a pandas series. I will post it in my original question
Then you can use the apply() function. Something like this should work (I have not tried it, as I don't have your original data):
new_series = original_series.apply(lambda d: d['answerId'])
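A slightly more defensive sketch of the apply() approach, in case some rows are not dicts. The sample series, the None row, and the helper name extract_answer_id are all made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the series from the question.
original_series = pd.Series(
    [
        {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'},
        {'answerText': {'es': 'Sí'}, 'answerId': 'Q2A1'},
        None,  # assumed example of a missing response
    ],
    name='responses1_answer',
)

def extract_answer_id(value):
    """Return the 'answerId' entry for dict rows, None for anything else."""
    return value.get('answerId') if isinstance(value, dict) else None

# apply() calls the helper once per element; rename() labels the result.
new_series = original_series.apply(extract_answer_id).rename('answerId')
print(new_series)
```

The isinstance check only matters if the column can contain missing or malformed rows; with clean data the one-line lambda is enough.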
CodePudding user response:
Get the values by the key answerId with the str accessor; it returns NaN if there is no match:
print (df)
responses1_answer
0 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
1 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
2 {'answerText': {'es': 'Sí'}, 'answerId': 'Q2A1'}
3 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
4 {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'}
s = df['responses1_answer'].str['answerId']
print (s)
0 Q2A2
1 Q2A2
2 Q2A1
3 Q2A2
4 Q2A2
Name: responses1_answer, dtype: object
To flatten every key into its own column instead, use json_normalize:
df1 = pd.json_normalize(df['responses1_answer'])
print (df1)
answerId answerText.es
0 Q2A2 No
1 Q2A2 No
2 Q2A1 Sí
3 Q2A2 No
4 Q2A2 No
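To put the two options side by side, here is a runnable sketch that builds a small frame like the one in the question (including the row with the extra free-text keys) and attaches the result back to it. The frame contents are sample data, and passing a plain list to json_normalize via tolist() is just a cautious choice on my part:

```python
import pandas as pd

# Sample frame modelled on the question's data.
df = pd.DataFrame({'responses1_answer': [
    {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'},
    {'answerText': {'es': 'Sí'}, 'answerId': 'Q2A1',
     'freetextAnswer': 'Parancetamol 1g.', 'includeFreeText': True},
    {'answerText': {'es': 'No'}, 'answerId': 'Q2A2'},
]})

# Option 1: add only answerId as a new column next to the original dicts.
df['answerId'] = df['responses1_answer'].str['answerId']

# Option 2: flatten every key; rows lacking a key get NaN in that column.
flat = pd.json_normalize(df['responses1_answer'].tolist())
flat.index = df.index  # keep the flattened rows aligned with df

print(df[['answerId']])
print(flat.columns.tolist())
```

Option 1 is the lighter choice when only answerId is needed; option 2 is handy when the nested answerText.es or the free-text fields are wanted as columns too.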