Show only the elements with the highest counts in seaborn

Time:03-31

Good afternoon everyone, I am a beginner in Python and I hope someone can help me! I have a dataframe with a list of Netflix movies and the number of votes each movie received. For example:

Title : The100 Votes : 1500
Title : Marania Votes : 2000 

My question is simple:

Using seaborn and matplotlib, I want to plot in a figure the 5 movies that received the highest number of votes (with their number of votes).

What I tried:

import seaborn as sns

...

sns.catplot(x='title', y='votes_number', data=top5_series)

But I don't really understand how to plot only the "5 best".

Thank you in advance!

CodePudding user response:

You can do everything with pandas, including the plot:

import pandas as pd
import numpy as np
import string

np.random.seed(1)
df = pd.DataFrame(
    {
        "movie": [i for i in string.ascii_uppercase],
        "vote": np.random.randint(low=10, high=500, size=len(string.ascii_uppercase))
    }
)

# Change n_most to plot a different number of movies
n_most = 5
df.nlargest(n_most, "vote").plot(kind="bar", x="movie", y="vote", figsize=(15, 6), rot=45)
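
Since the question asks for seaborn and for the vote counts to appear in the figure, here is a minimal sketch of the same `nlargest` idea fed into `sns.catplot`. It assumes a dataframe with the question's column names (`title`, `votes_number`); every row except the first two is made up for illustration, and `Axes.bar_label` assumes matplotlib 3.4 or newer.

```python
import pandas as pd
import seaborn as sns

# Hypothetical sample data: only the first two rows come from the question.
movies = pd.DataFrame(
    {
        "title": ["The100", "Marania", "Movie C", "Movie D", "Movie E", "Movie F", "Movie G"],
        "votes_number": [1500, 2000, 300, 2500, 1200, 900, 1800],
    }
)

# Keep only the five rows with the most votes, highest first.
top5 = movies.nlargest(5, "votes_number")

# One bar per title; for string categories seaborn keeps the row order of `top5`.
g = sns.catplot(data=top5, x="title", y="votes_number", kind="bar")

# Write each vote count above its bar.
ax = g.ax
for container in ax.containers:
    ax.bar_label(container)
```

Call `matplotlib.pyplot.show()` or `g.savefig(...)` afterwards to display or save the figure, depending on how you run the script.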


CodePudding user response:

I'm going to use one of the sample datasets from Seaborn here, as I don't have yours. This alternative sorts the dataframe and then plots only a subset of it.

I've pulled a few pieces out of the plotting call and defined them as variables first to make it easier to read, but the values could also be passed directly to the sns.catplot() function.

import seaborn as sns
sns.set_theme(style="whitegrid")

# Get data
penguins = sns.load_dataset("penguins")

# Sort descending so the largest values come first and can be selected as a subset.
# pandas puts `NaN` values last by default, so they stay out of the top rows.
penguins.sort_values("bill_length_mm", ascending=False, inplace=True)


# Pull out only the first five rows
subset = penguins.iloc[:5]

# Get index values to use for `x`, so it doesn't group them
# as it would if `x='species'` or `island`
idx = subset.index.to_list()

# Plot
g = sns.catplot(
    data=subset, kind="bar",
    x=idx, y="body_mass_g",
)
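
To make the point about `NaN` placement concrete, here is a small sketch with made-up data (the `name` and `score` columns are hypothetical). It compares a descending sort followed by `head`, `nlargest`, and an ascending sort followed by `tail`, which is the variant that would let a missing value slip into the selection.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame(
    {
        "name": list("abcdefg"),
        "score": [3.0, np.nan, 9.0, 1.0, 7.0, 5.0, 8.0],
    }
)

# Descending sort: NaN goes last by default, so head(5) never sees it.
by_sort = df.sort_values("score", ascending=False).head(5)

# nlargest selects the same rows without sorting the whole frame first.
by_nlargest = df.nlargest(5, "score")

# Ascending sort also puts NaN last, so tail(5) picks it up.
by_tail = df.sort_values("score").tail(5)

print(by_sort["name"].tolist())
print(by_tail["score"].isna().any())
```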