How to split a dictionary column in a dataframe and make a new column for each key

Time:03-25

I have a dataframe with a column that holds dictionary-like values, with the key-value pairs separated by ",".

id data
0   {'1':A, '2':B, '3':C}
1   {'1':A}
2   {'0':0}

How can I split the key-value pairs of the 'data' column into a new column for each key present in it, without removing the original 'data' column?

Desired output:

id data                   1   2   3   0
0   {'1':A, '2':B, '3':C} A   B   C   NaN
1   {'1':A}               A   NaN NaN NaN
2   {'0':0}               NaN NaN NaN 0

Thank you in advance :).

CodePudding user response:

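You'll need a regular expression to put quotes around the bare values, so that each string becomes a valid Python literal that ast.literal_eval can parse into a dictionary. Then, pd.json_normalize will expand those dictionaries into one column per key: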

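import ast
import pandas as pd

# Wrap each bare value in single quotes, e.g. {'1':A} -> {'1':'A'}
df['data'] = df['data'].str.replace(r'(["\'])\s*:(.+?)\s*(,?\s*["\'}])', '\\1:\'\\2\'\\3', regex=True)

# Parse each quoted string into a real dict
df['data'] = df['data'].apply(ast.literal_eval)

# Expand the dicts into one column per key and append them to df
df = pd.concat([df, pd.json_normalize(df['data'])], axis=1)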

Output:

>>> df
                             data    1    2    3    0
0  {'1': 'A', '2': 'B', '3': 'C'}    A    B    C  NaN
1                      {'1': 'A'}    A  NaN  NaN  NaN
2                      {'0': '0'}  NaN  NaN  NaN    0
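
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (assuming 'data' really holds strings like {'1':A}, which the question doesn't state outright) and applies the same quoting pattern through re.sub. Two things differ from the snippet above and are my own choices, not part of the original answer: the helper name parse_dict_string is made up for illustration, and the key columns are built with pd.DataFrame(..., index=df.index) instead of pd.json_normalize, so that the join still lines up if df does not have a default 0..n-1 index. The original 'data' strings are left untouched.

```python
import ast
import re

import pandas as pd

# Sample frame rebuilt from the question. Assumption: 'data' holds
# dict-like strings whose values are unquoted.
df = pd.DataFrame({
    "id": [0, 1, 2],
    "data": ["{'1':A, '2':B, '3':C}", "{'1':A}", "{'0':0}"],
})

# Same pattern as in the answer: capture the closing quote of a key,
# the bare value, and whatever follows it (comma + next quote, or brace).
VALUE_PATTERN = re.compile(r"""(["'])\s*:(.+?)\s*(,?\s*["'}])""")


def parse_dict_string(text):
    """Quote the bare values, then parse the string into a dict.

    Hypothetical helper, named here for illustration only.
    """
    quoted = VALUE_PATTERN.sub(r"\1:'\2'\3", text)
    return ast.literal_eval(quoted)


# Series of real dicts; df['data'] itself keeps the original strings.
parsed = df["data"].apply(parse_dict_string)

# One column per key, built on df's own index so the join aligns by row.
expanded = pd.DataFrame(parsed.tolist(), index=df.index)
result = df.join(expanded)

print(result)
```

Note that every value comes back as a string (the 0 in the last row becomes '0'), since the pattern quotes everything. It also assumes the values themselves contain no quotes, colons or commas; anything more complex would need a different parsing approach.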