Pandas / Matplotlib bar plot with multi index dataframe

Time:01-15

I have a sorted MultiIndex pandas DataFrame that I need to plot as a grouped bar chart.

CodePudding user response:

ChatGPT answered my question with the following code:

import pandas as pd
import matplotlib.pyplot as plt

# create a dictionary of data for the DataFrame
data = {
    'app_name': ['Google Maps', 'Uber', 'Waze', 'Spotify', 'Pandora'],
    'category': ['Navigation', 'Transportation', 'Navigation', 'Music', 'Music'],
    'rating': [4.5, 4.0, 4.5, 4.5, 4.0],
    'reviews': [1000000, 50000, 100000, 500000, 250000]
}

# create the DataFrame
df = pd.DataFrame(data)

# set the 'app_name' and 'category' columns as the index
df = df.set_index(['app_name', 'category'])

# add a new column called "content_rating" to the DataFrame, and assign a content rating to each app
df['content_rating'] = ['Everyone', 'Teen', 'Everyone', 'Everyone', 'Teen']

# Grouping the Data by category and content_rating and getting the mean of reviews
df_grouped = df.groupby(['category','content_rating']).agg({'reviews':'mean'})

# Reset the index to make it easier to plot
df_grouped = df_grouped.reset_index()

# Plotting the stacked bar chart
df_grouped.pivot(index='category', columns='content_rating', values='reviews').plot(kind='bar', stacked=True)

The code above uses a sample data set.

For my own data, I added a sum column and sorted by it:

# flatten the MultiIndex, then pivot so each content rating becomes a column
piv = qw1.reset_index()
piv = piv.pivot_table(index='Category', columns='Content', values='per')
# add a row total and keep the 10 categories with the largest totals
piv["Sum"] = piv.sum(axis=1)
piv_10 = piv.sort_values(by="Sum", ascending=False)[["Adult", "Everyone", "Mature", "Teen"]].head(10)

where qw1 is the multi-index data frame.

Then all I had to do was plot it:

piv_10.plot.bar(stacked = True, logy = False)
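
Since qw1 is not shown in the thread, here is a minimal self-contained sketch of the same pivot, sum, sort, and plot steps. The data is made up for illustration: the category names and the "per" values are assumptions, and only the index names (Category, Content) and the four content-rating columns are taken from the snippet above. The Agg backend is also my choice, so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import pandas as pd

# Hypothetical stand-in for qw1: one "per" value for each (Category, Content) pair
index = pd.MultiIndex.from_product(
    [["Game", "Tools", "Dating"], ["Adult", "Everyone", "Mature", "Teen"]],
    names=["Category", "Content"],
)
qw1 = pd.DataFrame(
    {"per": [1.0, 40.0, 10.0, 25.0,   # Game
             0.5, 30.0, 1.0, 5.0,     # Tools
             2.0, 1.0, 8.0, 3.0]},    # Dating
    index=index,
)

# Pivot so that each content rating becomes its own column
piv = qw1.reset_index().pivot_table(index="Category", columns="Content", values="per")
content_cols = list(piv.columns)  # remember the rating columns before adding the total

# Add a row total, sort by it, then drop it again by selecting only the rating columns
piv["Sum"] = piv.sum(axis=1)
piv_10 = piv.sort_values(by="Sum", ascending=False)[content_cols].head(10)

# One stacked bar per category, ordered from largest to smallest total
ax = piv_10.plot.bar(stacked=True)
ax.set_ylabel("per")
```

Passing stacked=False (the default) to plot.bar would draw the ratings side by side within each category instead of stacking them.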