I have a huge list of items and I need to regroup the items that have repeated values. Within each series list, I need to find the dictionaries that have the same title and year values and merge them into a single dictionary, collecting their episode and Cast values into new lists.
So all dicts with the same title and year should become a single dict, with their different values put into the lists Cast-list and episode-list. Dicts with no duplicates should be kept too, with their values also wrapped in lists.
I have tried a lot of things (two nested for loops, filter, ...), but I was not able to do it.
If anyone can help with this, I would really appreciate it.
[
    {
        "class": "TV Shows With Five or More Seasons",
        "location": "usa",
        "series": [
            {
                "title": "Mad Men",
                "year": "2015",
                "episode": 10,
                "Cast": "Elisabeth Moss",
            },
            {
                "title": "Mad Men",
                "year": "2015",
                "episode": 14,
                "Cast": "January Jones",
            },
            {
                "title": "Mad Men vostfr",
                "year": "2017",
                "episode": 20,
                "Cast": "Jon Hamm",
                "Type": "Drama"
            }
        ],
        "producer": "Matthew Weine",
    }
]
I want to group it like this as output:
[
    {
        "class": "TV Shows With Five or More Seasons",
        "location": "usa",
        "series": [
            {
                "title": "Mad Men",
                "year": "2015",
                "episode-list": [10, 14],
                "Cast-list": ["Elisabeth Moss", "January Jones"],
            },
            {
                "title": "Mad Men vostfr",
                "year": "2017",
                "episode-list": [20],
                "Cast-list": ["Jon Hamm"],
                "Type": "Drama"
            }
        ],
        "producer": "Matthew Weine",
    }
]
Note: I have to keep the Type key, which exists in only one dict!
CodePudding user response:
Try:
lst = [
    {
        "class": "TV Shows With Five or More Seasons",
        "location": "usa",
        "series": [
            {
                "title": "Mad Men",
                "year": "2015",
                "episode": 10,
                "Cast": "Elisabeth Moss",
            },
            {
                "title": "Mad Men",
                "year": "2015",
                "episode": 14,
                "Cast": "January Jones",
            },
            {
                "title": "Mad Men vostfr",
                "year": "2017",
                "episode": 20,
                "Cast": "Jon Hamm",
                "Type": "Drama",
            },
        ],
        "producer": "Matthew Weine",
    }
]
for d in lst:
    # Group the series dicts by their (title, year) pair.
    tmp = {}
    for s in d["series"]:
        tmp.setdefault((s["title"], s["year"]), []).append(s)

    # Rebuild the series list with one merged dict per group.
    d["series"] = []
    for (title, year), v in tmp.items():
        d["series"].append(
            {
                "title": title,
                "year": year,
                "episode": [s["episode"] for s in v],
                "Cast": [s["Cast"] for s in v],
                "Type": [s["Type"] for s in v if "Type" in s],
            }
        )
        # Keep the first "Type" found in the group, or drop the key if none had it.
        if d["series"][-1]["Type"]:
            d["series"][-1]["Type"] = d["series"][-1]["Type"][0]
        else:
            del d["series"][-1]["Type"]

print(lst)
Prints:
[
    {
        "class": "TV Shows With Five or More Seasons",
        "location": "usa",
        "series": [
            {
                "title": "Mad Men",
                "year": "2015",
                "episode": [10, 14],
                "Cast": ["Elisabeth Moss", "January Jones"],
            },
            {
                "title": "Mad Men vostfr",
                "year": "2017",
                "episode": [20],
                "Cast": ["Jon Hamm"],
                "Type": "Drama",
            },
        ],
        "producer": "Matthew Weine",
    }
]
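The code above keeps the original key names (episode, Cast) and handles Type by name. If you want the key names from the desired output in the question (episode-list, Cast-list), a variation along these lines might work. This is only a sketch: the function name group_series and its parameters are made up for illustration, and it assumes that when an extra key such as Type shows up in several merged dicts, keeping the first value seen is acceptable.

```python
def group_series(series, key_fields=("title", "year"), list_fields=("episode", "Cast")):
    """Hypothetical helper: merge dicts that share key_fields.

    Values of list_fields are collected under '<name>-list' keys;
    any other key (e.g. "Type") keeps the first value seen (an assumption).
    """
    grouped = {}  # plain dicts preserve insertion order in Python 3.7+
    for item in series:
        key = tuple(item[f] for f in key_fields)
        if key not in grouped:
            # Start the merged entry with the key fields and empty lists.
            grouped[key] = {f: item[f] for f in key_fields}
            for f in list_fields:
                grouped[key][f + "-list"] = []
        merged = grouped[key]
        for name, value in item.items():
            if name in key_fields:
                continue
            if name in list_fields:
                merged[name + "-list"].append(value)
            else:
                # Extra keys: keep the first value, ignore later ones.
                merged.setdefault(name, value)
    return list(grouped.values())


# Same sample data as in the question.
data = [
    {
        "class": "TV Shows With Five or More Seasons",
        "location": "usa",
        "series": [
            {"title": "Mad Men", "year": "2015", "episode": 10, "Cast": "Elisabeth Moss"},
            {"title": "Mad Men", "year": "2015", "episode": 14, "Cast": "January Jones"},
            {"title": "Mad Men vostfr", "year": "2017", "episode": 20,
             "Cast": "Jon Hamm", "Type": "Drama"},
        ],
        "producer": "Matthew Weine",
    }
]

# Replace each series list with its grouped version; other keys are untouched.
for d in data:
    d["series"] = group_series(d["series"])

print(data)
```

Because the extra keys are copied generically, a new key other than Type would be carried over as well without changing the function.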