Overlaying probability density functions on one plot

Time:04-22

I would like to create a probability density function for the isotopic measurements of N from three NOx sources. The number of measurements varies between sources, so I've created three dataframes. Here is the code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
#import matplotlib.ticker as plticker
#from matplotlib.ticker import (MultipleLocator, AutoMinorLocator)


df = pd.DataFrame({
    'Mobile':[15.6, 14.2, 14.4, 10.2, 13.1, 12.8, 13.3, 16.9, 15.8, 15.3, 16.9, 15.6, 15.6, 17, 16, 15.1, 15, 14.4,
              14.6, 16.2, 15.3, 16.4, -0.4, -2.9, 1.6, 9.8, 1.6, -8.1, -4.4, -0.4, 8.6]})
    
df1 = pd.DataFrame({
    'Soil':[-47, -37, -29, -26, -25, -24, -31, -23, -22, -19, -49, -42, -44, -37, -29, -29, -32, -31, -29, -28,
            -26.5, -30.8]})
df2 = pd.DataFrame({
    'Biomass Burning':[-2.7, -5, -5.9, -7.2, 3.2, 2.6, 3.8, 8.1, 12, 0.9, 1.3, 1.6, -1.5, -1.3, -0.1, 0.5, 4.4, 2,
                       2.9, 1.7, 3.2, 1.6, -0.3, -0.9]})

fig = plt.figure()
ax = fig.add_subplot()
ax.hist([df, df1, df2], label = ("Mobile", "Soil", "Biomass Burning"), bins=25, stacked=True, range=[0,25])

The problem is that I get an error message that says: ValueError: x must have 2 or fewer dimensions. I've tried a "flatten" method but get an error message that says AttributeError: 'DataFrame' object has no attribute 'flatten'. I am unsure what to try next to get the code to run and could use some help. I am also thinking that hist might be the wrong function to use, since I want a probability density distribution. I've also tried:

sns.displot(data=[df,df1,df2], x=['Mobile','Soil','Biomass Burning'], hue='target', kind='kde', 
            fill=True, palette=sns.color_palette('bright')[:3], height=5, aspect=1.5)

But again, I run into the issue of the dataframes being different lengths. Thanks!

Answer:

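One option is to melt each dataframe into long form, concat them, and then use hue with displot. melt gives every dataframe the same two columns, variable (the source name) and value (the measurement), so the differing lengths no longer matter: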

data = pd.concat([df.melt(), df1.melt(), df2.melt()], ignore_index=True)
sns.displot(data=data, x='value', hue='variable', kind='kde')


Use the var_name and value_name parameters of melt to give the variable and value columns more descriptive names.
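As a minimal sketch of how var_name and value_name could be used with melt here: the column names 'Source' and 'Measurement' and the helper plot_kde are illustrative assumptions, not part of the original answer, and the dataframes are shortened to a few values from the question.

```python
import pandas as pd

# Shortened stand-ins for the question's dataframes (a few values copied from it).
df = pd.DataFrame({'Mobile': [15.6, 14.2, 14.4, 10.2]})
df1 = pd.DataFrame({'Soil': [-47, -37, -29]})
df2 = pd.DataFrame({'Biomass Burning': [-2.7, -5, -5.9, -7.2, 3.2]})

# 'Source' and 'Measurement' are hypothetical names chosen for illustration;
# they replace melt's default column names 'variable' and 'value'.
frames = [d.melt(var_name='Source', value_name='Measurement') for d in (df, df1, df2)]
data = pd.concat(frames, ignore_index=True)


def plot_kde(long_df):
    """Hypothetical helper: draw one KDE curve per source from long-form data."""
    import seaborn as sns  # imported here so the reshaping above runs without seaborn
    return sns.displot(data=long_df, x='Measurement', hue='Source', kind='kde')
```

Calling plot_kde(data) should then label the x-axis and the legend with the new names. If each source should integrate to 1 on its own, passing common_norm=False to displot may be worth trying, since by default the KDE curves are normalized jointly across the hue groups.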

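On the original ValueError: ax.hist was given a list of DataFrames, and each DataFrame is two-dimensional. A sketch of a matplotlib-only alternative, assuming overlaid (not stacked) step histograms are acceptable, is to pass the individual columns as 1-D arrays and set density=True. The values below are shortened from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Shortened stand-ins for the question's dataframes (a few values copied from it).
df = pd.DataFrame({'Mobile': [15.6, 14.2, 14.4, 10.2]})
df1 = pd.DataFrame({'Soil': [-47, -37, -29]})
df2 = pd.DataFrame({'Biomass Burning': [-2.7, -5, -5.9, -7.2, 3.2]})

# Select each column so hist receives 1-D arrays; their lengths may differ.
samples = [
    df['Mobile'].to_numpy(),
    df1['Soil'].to_numpy(),
    df2['Biomass Burning'].to_numpy(),
]

fig, ax = plt.subplots()
# density=True scales each dataset so its histogram has unit area;
# histtype='step' draws outlines so the three distributions overlay.
counts, edges, _ = ax.hist(
    samples,
    bins=25,
    density=True,
    histtype='step',
    label=['Mobile', 'Soil', 'Biomass Burning'],
)
ax.legend()
```

Call plt.show() to display the figure. Note that the range=[0,25] in the question's call would exclude every Soil value, since they are all negative, so it is left out here and the bins span the full data.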