I have two data sets, NA and HG:
len(NA) = 267
NA = [73, 49, 53, ...]
len(HG) = 176
(HG is a list similar to NA.)
I want to draw the two data sets in one plot, like this. (I got this one by plotting them separately and then editing the result in Photoshop, which I cannot repeat for another data set because the axes differ.)
seaborn accepts data as NumPy arrays or pandas DataFrames, both of which require the columns to be of equal length. That does not hold in my case, because HG has 176 data points and NA has 267.
Currently, I convert each list to a pandas DataFrame and then plot it via
HG = sns.boxplot(data=HG, width = 0.2)
I tried HG = sns.boxplot(data=HG NA, width = 0.2)
and it returned an empty plot, so... help please.
Thank you so much!
CodePudding user response:
I assume NA and HG are DataFrames with one column each, since you're plotting a box plot. You can concatenate them into one DataFrame and then plot it. The shorter column will be padded with NaN, but seaborn ignores those.
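import pandas as pd
import seaborn as sns
df = pd.concat([NA, HG], axis=1)  # side by side; the shorter column is padded with NaN
sns.boxplot(data=df)  # one box per column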
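Since the question starts from plain lists rather than DataFrames, the concat idea above could be sketched as follows. This is only a minimal sketch: the sample values are made up (the real lists have 267 and 176 entries), and draw_boxes is a hypothetical helper name, not part of pandas or seaborn.

```python
import pandas as pd

# Made-up sample values standing in for the real, longer lists.
NA = [73, 49, 53, 20, 20, 20]
HG = [73, 30, 60]

# Wrap each list in a named Series; the names become the column labels.
# axis=1 places them side by side and pads the shorter one with NaN.
df = pd.concat([pd.Series(NA, name="NA"), pd.Series(HG, name="HG")], axis=1)

print(df.shape)               # rows follow the longer list
print(df["HG"].isna().sum())  # number of NaN padding cells in HG


def draw_boxes(frame):
    """Hypothetical helper: draw one box per column on a shared axis."""
    import seaborn as sns  # imported here so the data step runs without seaborn
    return sns.boxplot(data=frame, width=0.2)
```

Calling draw_boxes(df) should then put both boxes on one set of axes.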
CodePudding user response:
The following creates a DataFrame in which the missing HG values are filled with NaN. The boxplot ignores these. There are many ways to build a DataFrame with the length of the longest list; alternatives to the join shown below include itertools.zip_longest or pd.concat, as suggested by @Kenan.
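import pandas as pd
import seaborn as sns
NA = [73, 49, 53, 20, 20, 20, 20, 20, 20, 20, 20, 20]
HG = [73, 30, 60]
# join aligns on the index of the left (longer) frame, so HG is padded with NaN
df = pd.Series(NA, name="NA").to_frame().join(pd.Series(HG, name="HG"))
sns.boxplot(data=df, width=0.2)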
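The itertools.zip_longest alternative mentioned above might look roughly like this. It is a sketch under the same sample data; draw_boxes is again a hypothetical helper name, and float("nan") is used as the fill value so that the padding is treated as missing data.

```python
from itertools import zip_longest

NA = [73, 49, 53, 20, 20, 20, 20, 20, 20, 20, 20, 20]
HG = [73, 30, 60]

# Pair the lists row by row; once HG runs out, its slot is filled with NaN.
rows = list(zip_longest(NA, HG, fillvalue=float("nan")))


def draw_boxes(rows, names=("NA", "HG")):
    """Hypothetical helper: build the padded DataFrame and draw both boxes."""
    import pandas as pd
    import seaborn as sns
    df = pd.DataFrame(rows, columns=list(names))
    return sns.boxplot(data=df, width=0.2)
```

Compared with the join, this pads the data before pandas is involved, which can be convenient when more than two lists of different lengths need to be combined.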
Or maybe you are interested in using HoloViews, which gives you fully interactive plots with very little code (when used in Jupyter with the bokeh backend). For your case, that would look like the following:
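import holoviews as hv
hv.extension("bokeh")  # load the bokeh backend (once per notebook)
NA = [73, 49, 53, 20, 20, 20, 20, 20, 20, 20, 20, 20]
HG = [73, 30, 60]
# the * operator overlays both elements in one plot
hv.BoxWhisker(NA, label="NA") * hv.BoxWhisker(HG, label="HG")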