I have a pandas dataframe that has one long row as a result of a flattened json list.
I want to go from the example:
{'0_id': 1, '0_name': a, '0_address': USA, '1_id': 2, '1_name': b, '1_address': UK, '1_hobby': ski}
to a table like the following:
id | name | address | hobby |
---|---|---|---|
1 | a | USA | |
2 | b | UK | ski |
Any help is greatly appreciated :)
CodePudding user response:
There you go:
import json
json_data = '{"0_id": 1, "0_name": "a", "0_address": "USA", "1_id": 2, "1_name": "b", "1_address": "UK", "1_hobby": "ski"}'
arr = json.loads(json_data)
result = {}
for k in arr:
kk = k.split("_")
if int(kk[0]) not in result:
result[int(kk[0])] = {"id":"", "name":"", "hobby":""}
result[int(kk[0])][kk[1]] = arr[k]
for key in result:
print("%s %s %s" % (key, result[key]["name"], result[key]["address"]))
if you want to have field more dynamic, you have two choices - either go through all array and gather all possible names and then build template associated empty array, or just check if key exist in result when you returning results :)
CodePudding user response:
This way only works if every column follows this pattern, but should otherwise be pretty robust.
data = {'0_id': '1', '0_name': 'a', '0_address': 'USA', '1_id': '2', '1_name': 'b', '1_address': 'UK', '1_hobby': 'ski'}
df = pd.DataFrame(data, index=[0])
indexes = set(x.split('_')[0] for x in df.columns)
to_concat = []
for i in indexes:
target_columns = [col for col in df.columns if col.startswith(i)]
df_slice = df[target_columns]
df_slice.columns = [x.split('_')[1] for x in df_slice.columns]
to_concat.append(df_slice)
new_df = pd.concat(to_concat)