I am working on a survey and the data looks like this:
ID Q1 Q2 Q3 Gender Age Dept
001 Y N Y F 22 IT
002 N Y Y M 35 HR
003 Y N N F 20 IT
004 Y N Y M 54 OPRE
005 Y N Y M 42 OPRE
So I created a pivot table like this:
Q1 #Respondents %Res
Y 4 80
N 1 20
If I would like to slice it by Gender, then it should be like:
Q1 #Res %Rep
M F M F
Y 2 2 50 50
N 1 0 100 0
And if I want this to be applied to all the questions, I'd like to use Altiar which enables me to choose the question so that I don't need to execute the codes all the time. So far, I only know how to create simple table by:
Q1 = pd.pivot_table(df,values = ['ID'],
index = ["Q1"],
aggfunc ={'ID': 'count', })
Q1['%Respondents'] = (Q1['ID']/Q1['ID'].sum())*100
Q1
I don't know how to break it by gender and apply Altair. Please let me know if you could help! Thanks!
CodePudding user response:
To break by gender. Just utilized it as index, and then unstack it
Q1 = pd.pivot_table(df,values = ['ID'],
index = ["Q1","Gender"],
aggfunc ={'ID': 'count', }).unstack(level = 0)
Unfurtunataly cant help you with you graph
CodePudding user response:
IIUC, you can pivot_table
then add the result of division as a new foo column %Respondents
out = df.pivot_table(index='Q1', columns=['Gender'], values=['ID'], aggfunc='count', fill_value=0)
out = (out.join(out[['ID']].div(out['ID'].sum(axis=1).values, axis=0)
.mul(100)
.rename(columns={'ID':'%Respondents'})))
print(out)
ID %Respondents
Gender F M F M
Q1
N 0 1 0.0 100.0
Y 2 2 50.0 50.0