I am analysing data which is organised as follows:
- There is a separate pandas DataFrame for each group (A, B and C).
- Each DataFrame representing a group has 4 subgroups (columns), with rows holding their corresponding observations.
For example, a single group of data looks like:
subgroup-1 | subgroup-2 | subgroup-3 | subgroup-4 |
---|---|---|---|
12 | 4 | NaN | 9 |
15 | 3 | 4 | NaN |
16 | 8 | 3 | 11 |
17 | 12 | 8 | 13 |
11 | 17 | 12 | 14 |
I want to visualise the distribution of each subgroup across the different groups. Can anyone tell me which options (chart types) are available in Python for this? Thanks.
I tried histograms and density plots, but they all seem to work only for 2 variables.
CodePudding user response:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Seed before generating any data so the example is reproducible
np.random.seed(19680801)

# pandas DataFrames: one per group, one column per subgroup (random placeholder data)
columns = ['subgroup-1', 'subgroup-2', 'subgroup-3', 'subgroup-4']
group_A = pd.DataFrame(np.random.rand(50, 4), columns=columns)
group_B = pd.DataFrame(np.random.rand(50, 4), columns=columns)
group_C = pd.DataFrame(np.random.rand(50, 4), columns=columns)


def plot_hist(subgroup):
    n_bins = 10
    # Shape (50, 3): one column per group (A, B, C) for the chosen subgroup
    x = np.column_stack([group_A[subgroup], group_B[subgroup], group_C[subgroup]])

    fig, axes = plt.subplots(nrows=2, ncols=2)
    ax0, ax1, ax2, ax3 = axes.flatten()

    # Side-by-side bars, one colour per group
    ax0.hist(x, n_bins, density=True, histtype='bar', label=['A', 'B', 'C'])
    ax0.legend(prop={'size': 10})
    ax0.set_title('bars with legend')

    # Same data with the groups stacked on top of each other
    ax1.hist(x, n_bins, density=True, histtype='bar', stacked=True)
    ax1.set_title('stacked bar')

    # Stacked outlines only
    ax2.hist(x, n_bins, histtype='step', stacked=True, fill=False)
    ax2.set_title('stack step (unfilled)')

    # Make a multiple-histogram of data-sets with different length.
    # Note: this panel uses separate synthetic data, not the group DataFrames.
    x_multi = [np.random.randn(n) for n in [10000, 5000, 2000]]
    ax3.hist(x_multi, n_bins, histtype='bar')
    ax3.set_title('different sample sizes')

    fig.tight_layout()
    plt.show()


plot_hist('subgroup-1')
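
Another option for comparing all subgroups across all groups at once is a grid of box plots: one panel per subgroup, one box per group. Below is a minimal sketch, not a definitive implementation. The function name plot_group_boxplots and the random data are made up for illustration, and it assumes every DataFrame has the same column names. NaN values are dropped per column, so subgroups with missing observations still plot.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_group_boxplots(groups):
    """Draw one panel per subgroup, with one box per group.

    `groups` maps a group label (e.g. 'A') to its DataFrame. All frames
    are assumed to share the same column names (hypothetical helper).
    """
    labels = list(groups)
    subgroups = list(next(iter(groups.values())).columns)

    fig, axes = plt.subplots(
        1, len(subgroups),
        figsize=(3 * len(subgroups), 3),
        sharey=True,
        squeeze=False,
    )
    for ax, sub in zip(axes[0], subgroups):
        # Drop missing observations; the resulting arrays may differ in length
        data = [groups[g][sub].dropna().to_numpy() for g in labels]
        ax.boxplot(data)
        # Boxes are drawn at positions 1..n by default
        ax.set_xticks(range(1, len(labels) + 1))
        ax.set_xticklabels(labels)
        ax.set_title(sub)
    fig.tight_layout()
    return fig, axes


# Placeholder data with roughly 10% missing values (an assumption for the demo)
rng = np.random.default_rng(0)
columns = [f"subgroup-{i}" for i in range(1, 5)]
groups = {}
for name in ["A", "B", "C"]:
    values = rng.normal(loc=10, scale=3, size=(50, 4))
    values[rng.random(values.shape) < 0.1] = np.nan
    groups[name] = pd.DataFrame(values, columns=columns)

fig, axes = plot_group_boxplots(groups)
```

Call plt.show() afterwards to display the figure. Replacing ax.boxplot(data) with ax.violinplot(data) should give violin plots in the same layout, which show the shape of each distribution rather than only its quartiles.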