Home > Software engineering >  How to create a table using a crosstab and then group by another column value
How to create a table using a crosstab and then group by another column value

Time:11-14

I have the following dataset.

col1    col2    col3
a       1       yes
a       1       no
b       1       no
a       3       yes
c       1       yes
b       2       yes

I have used the crosstab to create a table between col1 and col2 and count the observation.

pd.crosstab(df.col1, df.col2)

output:

col2    1   2   3

col1 
a       2   0   1
b       1   1   0
c       1   0   0

If i want the same table for groupby col3, how will i do that?

Desired output:

col3: Yes                           col3: No                
col2    1   2   3                   col2    1   2   3

col1                                col1
a       1   0   1                   a       1   0   0
b       0   1   0                   b       1   0   0
c       1   0   0                   c       0   0   0 

Moreover, Is there any way to visualize the table more presentable?

CodePudding user response:

You can pass a list of columns to pd.crosstab:

x = pd.crosstab(df.col1, [df.col3, df.col2])

idx = pd.MultiIndex.from_product(
    [
        x.columns.get_level_values(0).unique(),
        x.columns.get_level_values(1).unique(),
    ]
)

x = x.reindex(idx, axis=1, fill_value=0)
print(x)

Prints:

col3 no       yes      
col2  1  2  3   1  2  3
col1                   
a     1  0  0   1  0  1
b     1  0  0   0  1  0
c     0  0  0   1  0  0

CodePudding user response:

If convert col2 to Categorical and then DataFrame.pivot_table add all categories to both levels:

df['col2'] = pd.Categorical(df['col2'])

df = df.pivot_table(index='col1',columns=['col3','col2'], aggfunc='size')
print (df)
col3 no       yes      
col2  1  2  3   1  2  3
col1                   
a     1  0  0   1  0  1
b     1  0  0   0  1  0
c     0  0  0   1  0  0
  • Related