I am trying to parse a dataframe column called 'tags' that contains a list of dicts and, as output, create a list of the values of the key var1:
Dataframe column 'tags' example value:
[{'var1': 'blue','var2': 123,'var3': 888},{'var1': 'red','var2': 123,'var3': 888},{'var1': 'green','var2': 123,'var3': 888}]
desired output:
['blue', 'red', 'green']
code:
d = [{f'{k}{i}': v for i, y in enumerate(x, 1) for k, v in y.items() if k == 'var1'} for x in df['tags']]
df = pd.DataFrame(d, index=df.index).sort_index(axis=1)
However, this produces the following error:
TypeError: 'float' object is not iterable
I have tried converting both i and v to a string using str(i) and str(v), however I am still getting the same error.
What am I doing wrong?
CodePudding user response:
The error means that some rows in tags contain missing values (NaN, which is a float) instead of lists. To avoid it, add an if-else that outputs an empty list wherever the value in tags is missing:
df['new'] = [[] if isinstance(x, float) else [y.get('var1') for y in x] for x in df['tags']]
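To see it end to end, here is a minimal sketch on made-up data. The two-row frame and the NaN in the second row are assumptions for illustration, not the asker's real data. It also swaps the float check for an isinstance(x, list) check, which is a small variation on the line above that additionally treats None and other non-list values as missing:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the second row has a missing value (NaN) in 'tags'.
df = pd.DataFrame({
    'tags': [
        [{'var1': 'blue', 'var2': 123, 'var3': 888},
         {'var1': 'red', 'var2': 123, 'var3': 888},
         {'var1': 'green', 'var2': 123, 'var3': 888}],
        np.nan,
    ]
})

# NaN is a float, so iterating over it raises the TypeError from the question.
# Keep only real lists; anything else (NaN, None, ...) becomes an empty list.
df['new'] = [
    [y.get('var1') for y in x] if isinstance(x, list) else []
    for x in df['tags']
]

print(df['new'].tolist())  # [['blue', 'red', 'green'], []]
```

Using y.get('var1') rather than y['var1'] means a dict without that key yields None instead of raising a KeyError.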