I have a category list that already been flattened but not sure how I can convert it to table that looks like below example. So if a has children is equal to false then that will be my id and then if has children is true I'll get the name and so on. I think it can be done by looping the category list
"response": {
"category_list": [
{
"category_id": 100017,
"parent_category_id": 0,
"name": "Women Clothes",
"has_children": true
},
{
"category_id": 100118,
"parent_category_id": 100017,
"name": "Socks & Stockings",
"has_children": true
},
{
"category_id": 100419,
"parent_category_id": 100118,
"name": "Others",
"has_children": false
},
{
"category_id": 100418,
"parent_category_id": 100118,
"name": "Pantyhose",
"has_children": false
},
{
"category_id": 100417,
"parent_category_id": 100118,
"name": "Socks",
"has_children": false
}
]
}
So result should be like this:
Id | Category 1 | Category 2 | Category 3 | Category 4 |
---|---|---|---|---|
100419 | Women Clothes | Socks & Stockings | Others | |
100417 | Women Clothes | Socks & Stockings | Pantyhose | |
100418 | Women Clothes | Socks & Stockings | Socks |
CodePudding user response:
There's a great deal you haven't showed us. Your sample data all has exactly two parents. Is that universal? Will there be some with one parent? Three?
This produces the results you want and should handle arbitrary ancestry.
import json
from pprint import pprint
rawinp = """\
{
"response": {
"category_list": [
{
"category_id": 100017,
"parent_category_id": 0,
"name": "Women Clothes",
"has_children": true
},
{
"category_id": 100118,
"parent_category_id": 100017,
"name": "Socks & Stockings",
"has_children": true
},
{
"category_id": 100419,
"parent_category_id": 100118,
"name": "Others",
"has_children": false
},
{
"category_id": 100418,
"parent_category_id": 100118,
"name": "Pantyhose",
"has_children": false
},
{
"category_id": 100417,
"parent_category_id": 100118,
"name": "Socks",
"has_children": false
}
]
}
}"""
data = json.loads(rawinp)
parents = {}
rows = []
for row in data["response"]["category_list"]:
if row["has_children"]:
if row["parent_category_id"]:
parents[row["category_id"]] = parents[row["parent_category_id"]] [row["name"]]
else:
parents[row["category_id"]] = [row["name"]]
else:
rows.append([row["category_id"]] parents[row["parent_category_id"]] [row['name']])
pprint(rows)
Output:
[[100419, 'Women Clothes', 'Socks & Stockings', 'Others'],
[100418, 'Women Clothes', 'Socks & Stockings', 'Pantyhose'],
[100417, 'Women Clothes', 'Socks & Stockings', 'Socks']]
CodePudding user response:
Here's an incredibly ugly pandas solution. Basically, just creates a dictionary and maps almost every row in parents
one by one to children
to get the correct categories.
df = pd.DataFrame(json_data['response']['category_list'])
parents = df[df['has_children']]
children = df[~df['has_children']].copy()
mapper = dict(parents[['category_id', 'name']].to_dict('split')['data'])
parent_relationship = dict(parents[['category_id','parent_category_id']].to_dict('split')['data'])
children['Category 1'] = parents['category_id'].map(parent_relationship).map(mapper).dropna().tolist()[0]
children['Category 2'] = children['parent_category_id'].map(mapper)
out = children.drop(['parent_category_id','has_children'], axis=1).rename({'category_id':'Id','name':'Category 3'}, axis=1)[['Id','Category 1','Category 2','Category 3']]
Output:
Id Category 1 Category 2 Category 3
2 100419 Women Clothes Socks & Stockings Others
3 100418 Women Clothes Socks & Stockings Pantyhose
4 100417 Women Clothes Socks & Stockings Socks