Seaborn multiple boxplots based on different conditions

Time:10-07

I have a dataframe with two columns. The power column holds the power consumption of the system. The component_status column splits the data in two, depending on whether the component is ON or OFF: a value of 153 means the component is ON, and a value of 150 means it is OFF.

I want a single figure with three boxplots, drawn with sns.boxplot. The first shows the power consumption for all the data, labeled "TOTAL". The other two show the power consumption when the component was ON and when it was OFF, labeled "COMPONENT = ON" and "COMPONENT = OFF".

The data frame example is as follows:

power | component_status |
 0.5  |       150        |
 1.5  |       150        |
 2.5  |       150        |
 0.3  |       153        |
 0.5  |       153        |
 1.5  |       153        |
 2.5  |       150        |
 0.3  |       153        |

Thanks for the help.

CodePudding user response:

Your first step is to build your data frame with the conditions. There are a few ways to go about this.

  1. Let's start with an initial df1 (dataframe #1) as you have given. Then, let's add a condition column set to "Total" on every row. You can use print(df1) to see what this looks like.

  2. Then let's copy that dataframe into df2, and overwrite its condition column with "On" or "Off", depending on the value of component_status.

  3. Our final dataframe df is just a concatenation of df1 and df2.

  4. Now we have a dataframe df that is ready to go in Seaborn.

    ### Set up
    import pandas as pd
    import numpy as np
    import seaborn as sns
    
    power = [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3]
    component_status = [150, 150, 150, 153, 153, 153, 150, 153]
    df1 = pd.DataFrame(
        data=zip(power, component_status), columns=["power", "component_status"]
    )
    
    ### Step 1
    df1["condition"] = "Total"
    # print(df1)
    
    ### Step 2
    df2 = df1.copy()
    
    df2["condition"] = np.where(df2["component_status"] == 153, "On", "Off")
    
    ### If you have several criteria, it can be easier to use np.select
    ### ... or just use Pandas directly:
    # df2.loc[(df2['component_status'] == 153), 'condition'] = 'On'
    # df2.loc[(df2['component_status'] == 150), 'condition'] = 'Off'
    
    ### Step 3
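    # ignore_index=True gives the stacked frame a fresh 0..n-1 index (no duplicate labels)
    df = pd.concat([df1, df2], ignore_index=True)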
    print(df)
    
    ### Step 4
    sns.boxplot(data=df, x='condition', y='power')
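
The comment in Step 2 mentions np.select for cases with several criteria. A minimal sketch of that variant follows, assuming the same two status codes as in the question; the "Unknown" fallback label is an assumption added here for rows that match neither code.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df2 = pd.DataFrame({
    "power": [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3],
    "component_status": [150, 150, 150, 153, 153, 153, 150, 153],
})

# One boolean mask per criterion, paired position by position with a label.
conditions = [
    df2["component_status"] == 153,
    df2["component_status"] == 150,
]
labels = ["On", "Off"]

# Rows matching none of the masks get the default label
# ("Unknown" is an illustrative choice, not from the original answer).
df2["condition"] = np.select(conditions, labels, default="Unknown")
print(df2)
```

Adding another status code then only means appending one mask and one label to the two lists.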
    

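The question asks for the boxes to be called "TOTAL", "COMPONENT = ON" and "COMPONENT = OFF". One possible way to get those names and a fixed left-to-right order is to write the labels directly into the condition column and pass them to the order parameter of sns.boxplot. This is a sketch under those assumptions; the names base, long_df and box_order are illustrative only.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Same sample data as in the question.
base = pd.DataFrame({
    "power": [0.5, 1.5, 2.5, 0.3, 0.5, 1.5, 2.5, 0.3],
    "component_status": [150, 150, 150, 153, 153, 153, 150, 153],
})

# Every row once under "TOTAL" ...
total = base.assign(condition="TOTAL")
# ... and once more under its ON/OFF label.
split = base.assign(
    condition=np.where(
        base["component_status"] == 153, "COMPONENT = ON", "COMPONENT = OFF"
    )
)
long_df = pd.concat([total, split], ignore_index=True)

# order fixes which box appears first, second and third on the x axis.
box_order = ["TOTAL", "COMPONENT = ON", "COMPONENT = OFF"]
ax = sns.boxplot(data=long_df, x="condition", y="power", order=box_order)
ax.set_xlabel("")  # the tick labels already describe each box
```

In a script, the figure can then be shown with matplotlib's plt.show() or saved with ax.figure.savefig(...).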