I pulled a json file from web, a list of browser details. I am trying to create separate columns of version_list but for some reason it is not putting it correctly.
Currently this is how the dataframe looks like.
Browser | Version_list |
---|---|
IE | [{'version': '5.5', 'global_usage': 0.009298, 'release_date': 962323200, 'era': -6, 'prefix': ''}, {'version': '6', 'global_usage': 0.0131217, 'release_date': 998870400, 'era': -5, 'prefix': ''}, {'version': '7', 'global_usage': 0.00621152, 'release_date': 1161129600, 'era': -4, 'prefix': ''}, {'version': '8', 'global_usage': 0.0395479, 'release_date': 1237420800, 'era': -3, 'prefix': ''}, {'version': '9', 'global_usage': 0.0593219, 'release_date': 1300060800, 'era': -2, 'prefix': ''}, {'version': '10', 'global_usage': 0.0263653, 'release_date': 1346716800, 'era': -1, 'prefix': ''}, {'version': '11', 'global_usage': 0.619584, 'release_date': 1381968000, 'era': 0, 'prefix': ''}] |
Edge | [{'version': '12', 'global_usage': 0.008322, 'release_date': 1438128000, 'era': -26, 'prefix': 'ms'}, {'version': '13', 'global_usage': 0.004267, 'release_date': 1447286400, 'era': -25, 'prefix': 'ms'}, {'version': '14', 'global_usage': 0.004268, 'release_date': 1470096000, 'era': -24, 'prefix': 'ms'}, {'version': '15', 'global_usage': 0.012483, 'release_date': 1491868800, 'era': -23, 'prefix': 'ms'}, {'version': '16', 'global_usage': 0.004161, 'release_date': 1508198400, 'era': -22, 'prefix': 'ms'}, {'version': '17', 'global_usage': 0.012483, 'release_date': 1525046400, 'era': -21, 'prefix': 'ms'}, {'version': '18', 'global_usage': 0.074898, 'release_date': 1542067200, 'era': -20, 'prefix': 'ms'}, {'version': '79', 'global_usage': 0, 'release_date': 1579046400, 'era': -19, 'prefix': ''}, {'version': '80', 'global_usage': 0.004298, 'release_date': 1581033600, 'era': -18, 'prefix': ''}, {'version': '81', 'global_usage': 0.00944, 'release_date': 1586736000, 'era': -17, 'prefix': ''}, {'version': '83', 'global_usage': 0.004043, 'release_date': 1590019200, 'era': -16, 'prefix': ''}, {'version': '84', 'global_usage': 0.004161, 'release_date': 1594857600, 'era': -15, 'prefix': ''}, {'version': '85', 'global_usage': 0.008322, 'release_date': 1598486400, 'era': -14, 'prefix': ''}, {'version': '86', 'global_usage': 0.004161, 'release_date': 1602201600, 'era': -13, 'prefix': ''}, {'version': '87', 'global_usage': 0.008322, 'release_date': 1605830400, 'era': -12, 'prefix': ''}, {'version': '88', 'global_usage': 0.004318, 'release_date': 1611360000, 'era': -11, 'prefix': ''}, {'version': '89', 'global_usage': 0.008322, 'release_date': 1614816000, 'era': -10, 'prefix': ''}, {'version': '90', 'global_usage': 0.004161, 'release_date': 1618358400, 'era': -9, 'prefix': ''}, {'version': '91', 'global_usage': 0.004161, 'release_date': 1622073600, 'era': -8, 'prefix': ''}, {'version': '92', 'global_usage': 0.012483, 'release_date': 1626912000, 'era': -7, 'prefix': ''}, {'version': '93', 'global_usage': 0.004161, 'release_date': 1630627200, 'era': -6, 'prefix': ''}, {'version': '94', 'global_usage': 0.008322, 'release_date': 1632441600, 'era': -5, 'prefix': ''}, {'version': '95', 'global_usage': 0.029127, 'release_date': 1634774400, 'era': -4, 'prefix': ''}, {'version': '96', 'global_usage': 0.049932, 'release_date': 1637539200, 'era': -3, 'prefix': ''}, {'version': '97', 'global_usage': 0.853005, 'release_date': 1641427200, 'era': -2, 'prefix': ''}, {'version': '98', 'global_usage': 2.98344, 'release_date': 1643932800, 'era': -1, 'prefix': ''}, {'version': '99', 'global_usage': 0, 'release_date': 1646265600, 'era': 0, 'prefix': ''}] |
The version_list is too nested, I have tried below code to separate it different column but it is not working
df = df["version_list"].apply(pd.Series)
CodePudding user response:
You can explode
on Version_list
column to convert list of dict to rows. Then apply pd.Series
to turn dict row to columns. At last, concat
them with original Browser
column.
df = df.explode('Version_list')
df_ = pd.concat([df.drop(['Version_list'], axis=1), df['Version_list'].apply(pd.Series)], axis=1)
print(df_)
Browser version global_usage release_date era prefix
0 IE 5.5 0.009298 962323200 -6
0 IE 6 0.013122 998870400 -5
0 IE 7 0.006212 1161129600 -4
0 IE 8 0.039548 1237420800 -3
0 IE 9 0.059322 1300060800 -2
0 IE 10 0.026365 1346716800 -1
0 IE 11 0.619584 1381968000 0
1 Edge 12 0.008322 1438128000 -26 ms
1 Edge 13 0.004267 1447286400 -25 ms
1 Edge 14 0.004268 1470096000 -24 ms
1 Edge 15 0.012483 1491868800 -23 ms
1 Edge 16 0.004161 1508198400 -22 ms
1 Edge 17 0.012483 1525046400 -21 ms
1 Edge 18 0.074898 1542067200 -20 ms
1 Edge 79 0.000000 1579046400 -19
1 Edge 80 0.004298 1581033600 -18
1 Edge 81 0.009440 1586736000 -17
1 Edge 83 0.004043 1590019200 -16
1 Edge 84 0.004161 1594857600 -15
1 Edge 85 0.008322 1598486400 -14
1 Edge 86 0.004161 1602201600 -13
1 Edge 87 0.008322 1605830400 -12
1 Edge 88 0.004318 1611360000 -11
1 Edge 89 0.008322 1614816000 -10
1 Edge 90 0.004161 1618358400 -9
1 Edge 91 0.004161 1622073600 -8
1 Edge 92 0.012483 1626912000 -7
1 Edge 93 0.004161 1630627200 -6
1 Edge 94 0.008322 1632441600 -5
1 Edge 95 0.029127 1634774400 -4
1 Edge 96 0.049932 1637539200 -3
1 Edge 97 0.853005 1641427200 -2
1 Edge 98 2.983440 1643932800 -1
1 Edge 99 0.000000 1646265600 0