How to split one row into multiple rows in python-CodePudding

I have a pandas dataframe that has one long row as a result of a flattened json list. I want to go from the example: {'0_id': 1, '0_name': a, '0_address': USA, '1_id': 2, '1_name': b, '1_address': UK, '1_hobby': ski} to a table like the following:

id	name	address	hobby
1	a	USA
2	b	UK	ski

Any help is greatly appreciated :)

CodePudding user response：

There you go:

import json


json_data = '{"0_id": 1, "0_name": "a", "0_address": "USA", "1_id": 2, "1_name": "b", "1_address": "UK", "1_hobby": "ski"}'
arr = json.loads(json_data)

result = {}


for k in arr:
    kk = k.split("_") 
    if int(kk[0]) not in result: 
        result[int(kk[0])] =  {"id":"", "name":"", "hobby":""}
    result[int(kk[0])][kk[1]] = arr[k]


for key in result:
    print("%s %s %s" % (key, result[key]["name"], result[key]["address"]))

if you want to have field more dynamic, you have two choices - either go through all array and gather all possible names and then build template associated empty array, or just check if key exist in result when you returning results :)

CodePudding user response：

This way only works if every column follows this pattern, but should otherwise be pretty robust.

data = {'0_id': '1', '0_name': 'a', '0_address': 'USA', '1_id': '2', '1_name': 'b', '1_address': 'UK', '1_hobby': 'ski'}

df = pd.DataFrame(data, index=[0])

indexes = set(x.split('_')[0] for x in df.columns)

to_concat = []
for i in indexes:
    target_columns = [col for col in df.columns if col.startswith(i)]
    df_slice = df[target_columns]
    df_slice.columns = [x.split('_')[1] for x in df_slice.columns]

    to_concat.append(df_slice)

new_df = pd.concat(to_concat)