I have a list of categories objects like the list below, the problem with this list, is that i have categories and brands mixed, and i only need to get the brands from this list.
I know which ones are the brands, because if i navigate in the parentCategoryIds, i will got the root parent (which is id: brands, parentCategoryId: None)
categories = [
#brands
{ "id": "brands", "parentCategoryId": None },
{ "id": "ls", "parentCategoryId": "brands" },
{ "id": "bleed", "parentCategoryId": "brands" },
{ "id": "shape", "parentCategoryId": "brands" },
{ "id": "graze", "parentCategoryId": "brands" },
{ "id": "item", "parentCategoryId": "brands" },
{ "id": "install", "parentCategoryId": "brands" },
{ "id": "horror", "parentCategoryId": "brands" },
{ "id": "thanks", "parentCategoryId": "brands" },
{ "id": "scrape", "parentCategoryId": "brands" },
{ "id": "shelter", "parentCategoryId": "brands" },
{ "id": "dynamic", "parentCategoryId": "brands" },
{ "id": "under", "parentCategoryId": "shape" },
{ "id": "right", "parentCategoryId": "shape" },
{ "id": "base", "parentCategoryId": "shape" },
{ "id": "scrap", "parentCategoryId": "shape" },
# categories
{ "id": "root", "parentCategoryId": None },
{ "id": "bark", "parentCategoryId": "rich" },
{ "id": "rich", "parentCategoryId": "sting" },
{ "id": "rich", "parentCategoryId": "sting" },
{ "id": "sting", "parentCategoryId": "root" },
]
To solve this issue, i wrote the function below, but i think it will be very slow, since this list is only an example, the original list has hundred of records. In this function, i'm navigating through the parents until i find the root (if the root == brand, i know i have a brand, so i can add to a separated list, if not, i just ignore). I'd would like to know if i can do it better, so if i pass a bigger list, it would not be a problem.
brands = []
def getParent(id):
for obj in categories:
if obj['id'] == id:
return obj
def is_brand(obj):
if obj['id'] == 'brands' and obj['parentCategoryId'] == None:
return True
if obj['id'] == 'root':
return False
if not obj['parentCategoryId'] == None:
return is_brand(getParent(obj['parentCategoryId']))
for obj in categories:
if is_brand(obj):
brands.append(obj)
print(brands)
CodePudding user response:
This has a better time complexity O(n * m) where m is the maximum depth of this graph, yours is O(n^2*m) if I calculated correctly.
def get_brands():
brands = []
lookup_categories = dict([(category['id'], category['parentCategoryId']) for category in categories])
for id, parent in lookup_categories.items():
if id == 'brands':
brands.append(id)
continue
while parent is not None and parent != 'brands':
parent = lookup_categories[parent]
if parent:
brands.append(id)
return brands
Another approach I was thinking about is using a disjoint set but I couldn't find a builtin library that has that data structure in python.