I have the following dataset.
col1 col2 col3
a 1 yes
a 1 no
b 1 no
a 3 yes
c 1 yes
b 2 yes
I have used the crosstab to create a table between col1 and col2 and count the observation.
pd.crosstab(df.col1, df.col2)
output:
col2 1 2 3
col1
a 2 0 1
b 1 1 0
c 1 0 0
If i want the same table for groupby col3, how will i do that?
Desired output:
col3: Yes col3: No
col2 1 2 3 col2 1 2 3
col1 col1
a 1 0 1 a 1 0 0
b 0 1 0 b 1 0 0
c 1 0 0 c 0 0 0
Moreover, Is there any way to visualize the table more presentable?
CodePudding user response:
You can pass a list of columns to pd.crosstab
:
x = pd.crosstab(df.col1, [df.col3, df.col2])
idx = pd.MultiIndex.from_product(
[
x.columns.get_level_values(0).unique(),
x.columns.get_level_values(1).unique(),
]
)
x = x.reindex(idx, axis=1, fill_value=0)
print(x)
Prints:
col3 no yes
col2 1 2 3 1 2 3
col1
a 1 0 0 1 0 1
b 1 0 0 0 1 0
c 0 0 0 1 0 0
CodePudding user response:
If convert col2
to Categorical
and then DataFrame.pivot_table
add all categories to both levels:
df['col2'] = pd.Categorical(df['col2'])
df = df.pivot_table(index='col1',columns=['col3','col2'], aggfunc='size')
print (df)
col3 no yes
col2 1 2 3 1 2 3
col1
a 1 0 0 1 0 1
b 1 0 0 0 1 0
c 0 0 0 1 0 0