Convert a dataframe column (coded in dictionary format) to another dataframe

Time:10-05

In pandas, how do I convert a dataframe with a column that holds lists of dictionaries, like this:

id data
1 [{'name': 'aaa', 'clusterName': 'AAA'}, {'name': 'bbb', 'clusterName': 'BBB'}]
2 [{'name': 'ccc', 'clusterName': 'CCC'}, {'name': 'ddd', 'clusterName': 'DDD'}]
3 [{'name': 'ccc', 'clusterName': 'CCC'}]

To this?

id name clusterName
1 aaa AAA
1 bbb BBB
2 ccc CCC
2 ddd DDD
3 ccc CCC

Thanks very much.

CodePudding user response:

Use DataFrame.explode with json_normalize:

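import ast
import pandas as pd

# only needed if the 'data' column holds strings rather than lists of dicts
#df['data'] = df['data'].apply(ast.literal_eval)

# one row per dictionary; reset the index so the join below aligns row by row
df1 = df.explode('data').reset_index(drop=True)
# expand the dictionaries into columns and attach them to the remaining columns
df1 = df1.join(pd.json_normalize(df1.pop('data')))
print(df1)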
   id name clusterName
0   1  aaa         AAA
1   1  bbb         BBB
2   2  ccc         CCC
3   2  ddd         DDD
4   3  ccc         CCC
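
To try this end to end, here is a minimal self-contained sketch. It assumes the data column arrives as strings (for example after reading a CSV), which is the case the commented-out ast.literal_eval line covers; the sample frame is rebuilt from the question. The reset_index(drop=True) step matters because explode repeats the original index labels (0, 0, 1, 1, 2), and join aligns on the index.

```python
import ast
import pandas as pd

# Sample frame from the question; 'data' is held as strings here (assumption).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "data": [
        "[{'name': 'aaa', 'clusterName': 'AAA'}, {'name': 'bbb', 'clusterName': 'BBB'}]",
        "[{'name': 'ccc', 'clusterName': 'CCC'}, {'name': 'ddd', 'clusterName': 'DDD'}]",
        "[{'name': 'ccc', 'clusterName': 'CCC'}]",
    ],
})

# Turn each string into a real list of dictionaries.
df["data"] = df["data"].apply(ast.literal_eval)

# One row per dictionary, with a clean 0..n-1 index.
df1 = df.explode("data").reset_index(drop=True)

# Expand the dictionaries into columns and attach them by index.
df1 = df1.join(pd.json_normalize(df1.pop("data")))
print(df1)
```

If the column already holds Python lists, the literal_eval line can be dropped and the rest stays the same.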

Another solution:

df1 = pd.DataFrame([{'id': a, **x} for a, b in zip(df['id'], df['data']) for x in b])
print(df1)
   id name clusterName
0   1  aaa         AAA
1   1  bbb         BBB
2   2  ccc         CCC
3   2  ddd         DDD
4   3  ccc         CCC

CodePudding user response:

Rudimentary Approach:

data = [
[{'name': 'aaa', 'clusterName': 'AAA'}, {'name': 'bbb', 'clusterName': 'BBB'}],
[{'name': 'ccc', 'clusterName': 'CCC'}, {'name': 'ddd', 'clusterName': 'DDD'}],
[{'name': 'ccc', 'clusterName': 'CCC'}]
]

# flatten the list of lists into a single list of dictionaries
newArr = []
for lists in data:
    for dicts in lists:
        newArr.append(dicts)

import pandas as pd
df = pd.DataFrame(newArr)

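The resulting df has the same name and clusterName columns as the answer above. The id column is not included, because data here holds only the lists of dictionaries.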
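
If the id column from the question is also needed, the same loop can carry it along. A minimal sketch, assuming the ids are available as a parallel list (the names ids and rows are introduced here for illustration and are not part of the original answer):

```python
import pandas as pd

# Hypothetical parallel list of ids, one per entry in data.
ids = [1, 2, 3]
data = [
    [{'name': 'aaa', 'clusterName': 'AAA'}, {'name': 'bbb', 'clusterName': 'BBB'}],
    [{'name': 'ccc', 'clusterName': 'CCC'}, {'name': 'ddd', 'clusterName': 'DDD'}],
    [{'name': 'ccc', 'clusterName': 'CCC'}],
]

rows = []
for row_id, dicts in zip(ids, data):
    for d in dicts:
        # Copy the dictionary and put the id of its source row in front.
        rows.append({'id': row_id, **d})

df = pd.DataFrame(rows)
print(df)
```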

CodePudding user response:

import itertools
import pandas as pd

data = [
[{'name': 'aaa', 'clusterName': 'AAA'}, {'name': 'bbb', 'clusterName': 'BBB'}],
[{'name': 'ccc', 'clusterName': 'CCC'}, {'name': 'ddd', 'clusterName': 'DDD'}],
[{'name': 'ccc', 'clusterName': 'CCC'}]
]

pd.DataFrame(itertools.chain(*data))
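
The chain call flattens the bare lists, so the id column from the question is not part of the result. A sketch that applies the same idea to the question's dataframe and repeats each id once per dictionary (the names flat and counts are illustrative, and the use of numpy.repeat is an addition, not part of the original answer):

```python
from itertools import chain

import numpy as np
import pandas as pd

# Sample frame from the question, with real lists of dictionaries.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "data": [
        [{'name': 'aaa', 'clusterName': 'AAA'}, {'name': 'bbb', 'clusterName': 'BBB'}],
        [{'name': 'ccc', 'clusterName': 'CCC'}, {'name': 'ddd', 'clusterName': 'DDD'}],
        [{'name': 'ccc', 'clusterName': 'CCC'}],
    ],
})

# Flatten all lists into one sequence of dictionaries.
flat = pd.DataFrame(list(chain.from_iterable(df["data"])))

# Repeat each id as many times as its list has entries.
counts = df["data"].map(len).to_numpy()
flat.insert(0, "id", np.repeat(df["id"].to_numpy(), counts))
print(flat)
```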