Plotting two data frames in one bar graph in python pandas

Time:09-15

I have two data frames, for example:


import pandas as pd

df1 = pd.DataFrame({'URL': ['A','B','C'],
                    'X': [1,0,1],
                    'Y': [0,1,0],
                    'Z':[1,0,0]})


df2 = pd.DataFrame({'URL': ['D','E','F'],
                    'X': [0,1,0],
                    'Y': [1,0,0],
                    'Z':[1,0,0]})

I plotted a bar graph for the first data frame using the following code:

df1.melt("URL").\
    groupby("variable").\
    agg(Non_Tracking_websites=("value", lambda x: sum(x != 0))).\
plot(kind="bar")


For the second data frame I used the same approach:

df2.melt("URL").\
    groupby("variable").\
    agg(Tracking_websites=("value", lambda x: sum(x != 0))).\
plot(kind="bar")


Now I want to merge the two bar graphs into a single graph, keeping the same aggregation as above, with both Non_Tracking_websites and Tracking_websites shown for each variable.

Any help would be appreciated. Thank you.

CodePudding user response:

You'll need to merge the two aggregated data frames first, then plot the merged result:

Python 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.3.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import pandas as pd
   ...: 
   ...: 
   ...: df1 = pd.DataFrame({'URL': ['A','B','C'],
   ...:                     'X': [1,0,1],
   ...:                     'Y': [0,1,0],
   ...:                     'Z':[1,0,0]})
   ...: 
   ...: 
   ...: df2 = pd.DataFrame({'URL': ['D','E','F'],
   ...:                     'X': [0,1,0],
   ...:                     'Y': [1,0,0],
   ...:                     'Z':[1,0,0]})

In [2]: agg_df1 = df1.melt("URL").\
   ...:     groupby("variable").\
   ...:     agg(Non_Tracking_websites=("value", lambda x: sum(x != 0)))

In [3]: agg_df2 = df2.melt("URL").\
   ...:     groupby("variable").\
   ...:     agg(Tracking_websites=("value", lambda x: sum(x != 0)))

In [4]: agg_df1.merge(agg_df2, left_index=True, right_index=True).plot(kind='bar')
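
For reference, the same merge-then-plot idea can be written as a short standalone script. This is only a sketch: `count_nonzero` is a hypothetical helper (not part of pandas), and it assumes `URL` is the only non-numeric column. It builds the combined table with `pd.concat(..., axis=1)` instead of `merge`, which aligns on the shared X/Y/Z index in the same way.

```python
import pandas as pd

df1 = pd.DataFrame({'URL': ['A', 'B', 'C'],
                    'X': [1, 0, 1],
                    'Y': [0, 1, 0],
                    'Z': [1, 0, 0]})
df2 = pd.DataFrame({'URL': ['D', 'E', 'F'],
                    'X': [0, 1, 0],
                    'Y': [1, 0, 0],
                    'Z': [1, 0, 0]})


def count_nonzero(df, name):
    """Hypothetical helper: count non-zero entries per column, ignoring URL."""
    counts = df.drop(columns="URL").ne(0).sum()
    return counts.rename(name)


# One column per source data frame, indexed by X, Y, Z
combined = pd.concat(
    [count_nonzero(df1, "Non_Tracking_websites"),
     count_nonzero(df2, "Tracking_websites")],
    axis=1,
)
print(combined)

# Uncomment to draw the grouped bar chart (requires matplotlib):
# combined.plot(kind="bar")
```

Because both series share the same index, each variable gets one group of two bars when the table is plotted.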


CodePudding user response:

IIUC, use:

(pd.concat({'non tracking': df1, 'tracking': df2}, names=['type'])
   .reset_index('type')
   .melt(['type', 'URL'])
   .groupby(['type', 'variable'])['value'].sum()
   .unstack('type')
   .plot.bar()
)
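
One thing to keep in mind with this version: it sums the values, whereas the code in the question counts non-zero entries. For 0/1 data the two agree, but they diverge once a cell holds a value above 1. The sketch below uses made-up data (not the data from the question) to show the difference, and one possible way to recover the non-zero count by converting to booleans before summing.

```python
import pandas as pd

# Hypothetical data: same layout as the question, but with values above 1
df1 = pd.DataFrame({'URL': ['A', 'B', 'C'],
                    'X': [3, 0, 1], 'Y': [0, 2, 0], 'Z': [1, 0, 0]})
df2 = pd.DataFrame({'URL': ['D', 'E', 'F'],
                    'X': [0, 5, 0], 'Y': [1, 0, 0], 'Z': [1, 0, 0]})

# Long format: one row per (type, URL, variable)
tidy = (pd.concat({'non tracking': df1, 'tracking': df2}, names=['type'])
          .reset_index('type')
          .melt(['type', 'URL']))

# Plain sum, as in the answer above
summed = tidy.groupby(['type', 'variable'])['value'].sum().unstack('type')

# Non-zero count, matching the aggregation used in the question
nonzero = (tidy.assign(value=tidy['value'].ne(0))
               .groupby(['type', 'variable'])['value'].sum()
               .unstack('type'))

print(summed)
print(nonzero)

# Either table can be plotted with .plot.bar() (requires matplotlib)
```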

Output: a grouped bar plot with one group per variable (X, Y, Z) and one bar per type (non tracking, tracking).
