bar plot a multiheader dataframe in a desired format

Time:02-04

I have the following DataFrame:

data = {('Case1', 'A'): {'One': 0.96396415, 'Two': 0.832049574, 'Three': 0.636568627, 'Four': 0.765846157},
        ('Case1', 'B'): {'One': 0.257496625, 'Two': 0.984418254, 'Three': 0.018891398, 'Four': 0.440278509},
        ('Case1', 'C'): {'One': 0.512732941, 'Two': 0.622697929, 'Three': 0.731555346, 'Four': 0.031419349},
        ('Case2', 'A'): {'One': 0.736783294, 'Two': 0.460765675, 'Three': 0.078558864, 'Four': 0.566186283},
        ('Case2', 'B'): {'One': 0.921473211, 'Two': 0.274749932, 'Three': 0.312766018, 'Four': 0.159229808},
        ('Case2', 'C'): {'One': 0.146389032, 'Two': 0.893299471, 'Three': 0.536288712, 'Four': 0.775763286},
        ('Case3', 'A'): {'One': 0.351607026, 'Two': 0.041402396, 'Three': 0.924265706, 'Four': 0.639154727},
        ('Case3', 'B'): {'One': 0.966538215, 'Two': 0.658236148, 'Three': 0.473447279, 'Four': 0.545974617},
        ('Case3', 'C'): {'One': 0.036585457, 'Two': 0.279443317, 'Three': 0.407991168, 'Four': 0.101083315}}

import pandas as pd
df = pd.DataFrame(data=data)
          Case1                         Case2                         Case3
              A         B         C         A         B         C         A         B         C
One    0.963964  0.257497  0.512733  0.736783  0.921473  0.146389  0.351607  0.966538  0.036585
Two    0.832050  0.984418  0.622698  0.460766  0.274750  0.893299  0.041402  0.658236  0.279443
Three  0.636569  0.018891  0.731555  0.078559  0.312766  0.536289  0.924266  0.473447  0.407991
Four   0.765846  0.440279  0.031419  0.566186  0.159230  0.775763  0.639155  0.545975  0.101083

There are 2 header rows.

In the end I need a plot like the following (which I created in Excel). An alternative would be a separate plot for every Case, instead of all in one.

[Image: the desired bar plot, created in Excel]

What I have tried so far is:

df.T.melt(ignore_index=False)

to get the DataFrame into a format like the one I used in Excel. But from there I could not figure out how to get the right plot. Maybe the transpose/melt is not even necessary.

Can anyone give me a hint on how to achieve the desired plot?

Thank you!

CodePudding user response:

Reshape your DataFrame first so that the two levels of the x-axis (case and row label) form a MultiIndex on the rows, with the sub-cases A, B and C as columns:

df = df.T.stack().unstack(level=1)

Then make the plot from the reshaped DataFrame (highly inspired by @gyx-hh).

[Image: resulting plot]
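The plotting code for this answer is not included here. As a rough sketch of one way to draw the reshaped frame with a two-level x-axis (not necessarily the approach this answer used), the example below rebuilds the DataFrame from values rounded from the question, applies the reshape, and writes the case names under each group of bars. The figure size, the label offset of -0.12 and the Agg backend are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); drop for interactive use
import pandas as pd

# Values rounded to three decimals from the question's table
cols = pd.MultiIndex.from_product([["Case1", "Case2", "Case3"], ["A", "B", "C"]])
values = [
    [0.964, 0.257, 0.513, 0.737, 0.921, 0.146, 0.352, 0.967, 0.037],
    [0.832, 0.984, 0.623, 0.461, 0.275, 0.893, 0.041, 0.658, 0.279],
    [0.637, 0.019, 0.732, 0.079, 0.313, 0.536, 0.924, 0.473, 0.408],
    [0.766, 0.440, 0.031, 0.566, 0.159, 0.776, 0.639, 0.546, 0.101],
]
df = pd.DataFrame(values, index=["One", "Two", "Three", "Four"], columns=cols)

# Rows become (case, row label); columns become the sub-cases A, B, C
wide = df.T.stack().unstack(level=1)

ax = wide.plot(kind="bar", figsize=(10, 4), width=0.8)

# Inner x-axis level: show only the row labels on the ticks
ax.set_xticklabels(wide.index.get_level_values(1), rotation=0)
ax.set_xlabel("")

# Outer x-axis level: one case name centered under each group of ticks
cases = wide.index.get_level_values(0)
for case in cases.unique():
    positions = [i for i, c in enumerate(cases) if c == case]
    center = sum(positions) / len(positions)
    # x is in data coordinates, y in axes coordinates (below the axis)
    ax.text(center, -0.12, case, ha="center", va="top",
            transform=ax.get_xaxis_transform())

fig = ax.get_figure()
fig.subplots_adjust(bottom=0.2)  # leave room for the second label row
# Call fig.savefig(...) or matplotlib.pyplot.show() to view the result
```

pandas draws the bar groups at integer positions 0, 1, 2, ..., which is why the group centers can be computed from the row positions alone.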

CodePudding user response:

Given your data, if it is in Excel, one would read it like this:

import pandas as pd
df = pd.read_excel('values.xlsx', index_col=0, header=[0,1])

To plot something very similar to what you would like:

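# Transpose, then stack to get one row per (case, subcase, observation)
long_df = df.T.stack().reset_index()
long_df.columns = ['cases', 'subcases', 'observations', 'value']
# Pivot so that each subcase becomes a column, i.e. one bar per subcase in every group
transformed = long_df.pivot_table(
    index=['cases', 'observations'],
    columns='subcases',
    values='value',
    sort=False)  # do not sort the result
transformed.plot(kind='bar')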

[Image: view of the created plot]

Explaining how it works:

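  1. long_df is the DataFrame normalized to a 'long' format using the stack function.
  2. pivot_table puts the data into the orientation needed for simple plotting: one row per (case, observation) pair and one column per subcase.
  3. The .plot method with kind='bar' draws the bar chart using matplotlib.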
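The question also mentions a separate plot for every case as an alternative. A minimal sketch of that variant, assuming values rounded from the question and one subplot per top-level column header, could look like this. The figure size and the Agg backend are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Values rounded to three decimals from the question's table
cols = pd.MultiIndex.from_product([["Case1", "Case2", "Case3"], ["A", "B", "C"]])
values = [
    [0.964, 0.257, 0.513, 0.737, 0.921, 0.146, 0.352, 0.967, 0.037],
    [0.832, 0.984, 0.623, 0.461, 0.275, 0.893, 0.041, 0.658, 0.279],
    [0.637, 0.019, 0.732, 0.079, 0.313, 0.536, 0.924, 0.473, 0.408],
    [0.766, 0.440, 0.031, 0.566, 0.159, 0.776, 0.639, 0.546, 0.101],
]
df = pd.DataFrame(values, index=["One", "Two", "Three", "Four"], columns=cols)

# One subplot per top-level header, sharing the y-axis for comparability
cases = df.columns.get_level_values(0).unique()
fig, axes = plt.subplots(1, len(cases), figsize=(12, 3.5), sharey=True)

for i, (ax, case) in enumerate(zip(axes, cases)):
    # df[case] selects the columns A, B, C under one case
    # and drops the top header level
    df[case].plot(kind="bar", ax=ax, title=case, rot=0, legend=(i == 0))

fig.tight_layout()
# Call fig.savefig(...) or plt.show() to view the result
```

No reshaping is needed here, because selecting a top-level column label already yields a plain DataFrame with the row labels on the x-axis.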