Right now I have a data-frame which stores multiple y-values measured over a range of wavelengths (x-values), like below:
x y0 y1 ... y12 y13 y14
659 335.072723 -0.003132 -0.002903 ... 0.002981 0.007160 0.003514
660 335.285309 -0.001403 -0.002011 ... 0.003565 0.003635 0.002760
661 335.497925 -0.002544 0.002261 ... 0.007555 0.004373 0.011329
662 335.710510 -0.000546 -0.004526 ... 0.003469 0.004768 0.006191
663 335.923157 -0.011268 -0.001985 ... 0.001623 0.003106 0.006442
.. ... ... ... ... ... ... ...
771 358.970215 -0.000100 -0.000876 ... 0.001332 -0.000311 0.002851
772 359.184387 -0.000897 0.001286 ... -0.000037 0.000300 0.001410
773 359.398560 -0.001188 0.001162 ... -0.000901 0.000200 0.001739
774 359.612762 0.001268 -0.001201 ... -0.000545 0.001514 0.000689
775 359.826965 0.001439 -0.000671 ... 0.000295 0.002545 0.002769
I am able to create a histogram from the first column, using
plt.hist(megDiff['y0'], bins = 100, label = "y0 Data Set")
However, I would like to create a single histogram from all of the y-value data sets combined. I am not quite sure how to do this. If anyone knows, I would greatly appreciate some help.
CodePudding user response:
You can combine (concatenate) all the y* columns into a single array and then plot that.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'x': [1, 2, 3],
                   'y1': [11, 23, 31],
                   'y2': [12, 22, 32],
                   'y3': [13, 21, 33],})
# Select every column except 'x' and flatten the values into one 1-D array
all_y = df.loc[:, df.columns != 'x'].values.ravel()
plt.hist(all_y, bins = 100, label = "All y Data Sets")
plt.legend()
plt.show()
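Applied to a frame shaped like the one in the question, the same idea might look like the sketch below. The real megDiff data is not available here, so the frame is filled with random numbers as a stand-in; the seed, the 117 rows, and the noise scale are assumptions chosen only to resemble the printed excerpt.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for megDiff: one x column plus y0..y14 (random values).
rng = np.random.default_rng(0)
megDiff = pd.DataFrame({"x": np.linspace(335.0, 360.0, 117)})
for i in range(15):
    megDiff[f"y{i}"] = rng.normal(0.0, 0.003, size=len(megDiff))

# Drop the wavelength column and flatten all remaining values into one 1-D array.
all_y = megDiff.drop(columns="x").to_numpy().ravel()

# A single histogram over every y value in the frame.
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(all_y, bins=100, label="All y data sets")
ax.legend()
```

Call plt.show() afterwards to display the figure. Dropping the x column by name avoids relying on column order.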
CodePudding user response:
If you want a different label for each column, you can pass all the columns that interest you directly to the hist function.
Here is an example which plots every column except the first (x):
plt.hist(df[df.columns[1:]], bins = 100)
plt.legend(df.columns[1:])
plt.show()
This matplotlib gallery example may also help: https://matplotlib.org/stable/gallery/statistics/histogram_multihist.html
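As a variation on the per-column approach, here is a minimal sketch that stacks the columns in one set of bins and labels them through the label argument of hist. It reuses the small example frame from the first answer; the choice of 10 bins and stacked=True is an assumption for illustration, not something the question asked for.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"x": [1, 2, 3],
                   "y1": [11, 23, 31],
                   "y2": [12, 22, 32],
                   "y3": [13, 21, 33]})

# Every column except x is treated as its own data set.
y_cols = [c for c in df.columns if c != "x"]

# A 2-D input makes hist treat each column as a separate data set;
# stacked=True piles them on top of each other in shared bins.
fig, ax = plt.subplots()
counts, edges, _ = ax.hist(df[y_cols].to_numpy(), bins=10,
                           stacked=True, label=y_cols)
ax.legend()
```

With stacked=True the top of the stack has the same outline as the combined histogram of all y values, while the colors still show which column contributed each count. Call plt.show() afterwards to display the figure.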