Home > OS >  Python list: print unique entries based on value
Python list: print unique entries based on value

Time:04-12

I'm making a script which fetches movies and shows from different services. I need a functionality where if a movie is available on a platform (e.g paramount) in both 4k and hd, then I want to only show 4k result. This is what I've tried so far.


offers = [ # fetches from server, this is hardcoded for the sake of this question
    {
        "monetization_type": "flatrate",
        "package_short_name": "pmp",
        "presentation_type": "4k",
    },
    {
        "monetization_type": "flatrate",
        "package_short_name": "pmp",
        "presentation_type": "hd",
    },
    {
        "monetization_type": "flatrate",
        "package_short_name": "fxn",
        "presentation_type": "hd",
    }
]
    
result = []

def streaminfo():
    for i in range(len(offers)):
        monetize = offers[i]["monetization_type"]
        service = offers[i]["package_short_name"]  # short name
        qty = offers[i]["presentation_type"]
        if monetize == 'flatrate' and qty in ['4k', 'hd']:
            result.append(f"Stream {[i]}: US - {service} - {qty}")
    return "\n".join(result)

print(streaminfo())

Result:

    Stream [0]: US - pmp - 4k # ParamountPlus
    Stream [1]: US - pmp - hd # ParamountPlus
    Stream [2]: US - fxn - hd # FoxNow

Now 4k and hd are available on ParamountPlus but I only want to print 4k.

CodePudding user response:

What if you create a dictionary that rates the qualities? Could be good if you later have streams that are SD or other formats. That way you are always showing only the best quality stream from each service with minimal code:

qty_ratings = {
    '4k': 1,
    'hd': 2,
    'sd': 3
}

Append the highest quality stream:

if monetize == 'flatrate':  
            # Check if there are more than one stream from the same service
            if len(qty) > 1:
                qty = min([qty_ratings[x] for x in qty])
            result.append(f"Stream {[i]}: US - {service} - {qty}")

    return "\n".join(result)

CodePudding user response:

Personally I would explicitly sort the streams, not because it's more efficient, but because it makes it clear what I'm doing.

Assuming your offers are defined as in the question:

ranking = ("sd", "hd", "4k")

def get_best_stream(streams):
    return max(streams, key=lambda x: ranking.index(x["presentation_type"]))


get_best_stream(offers)

As @Gustaf notes, this is forward compatible. I've used a tuple rather than a dict since we really only care about order (perhaps even more explicit would be an enum).

If you want to keep one offer from every source, I would encode this explicitly:

def get_streams(streams):
    sources = set(x["package_short_name"] for x in streams)
    return [
        get_best_stream(s for s in streams if s["package_short_name"] == source)
        for source in sources
    ]



get_streams(offers)

Of course if you have billions of streams it would be more efficient to build a mapping between source and offers, but for a few dozen the cost of an iterator is trivial.

  • Related