python: groupby 3 columns according to grouping of first two

Given this dataframe,

import pandas as pd
d = {'a': ['john', 'mary','john','john','mary','john'], 'b': [1,2,3,1,1,2],
     'c': [0.7, 0.3,0.9,0.4,1.0,0.2],'d': [1,0,0,1,0,1]}
df = pd.DataFrame(data=d)

The following line prints how many times each value of df['b'] (1, 2, 3) occurs for df['a']=john and for df['a']=mary:

print(df.groupby('a')['b'].value_counts())

What I want to do now is print how many times df['d']=1 or df['d']=0 occurs for each combination of df['a'] (john, mary) and df['b'] (1, 2, 3). For instance, when df['a']=john and df['b']=1, df['d'] is always equal to 1, and when df['a']=john and df['b']=3, df['d']=0, etc.

The following line prints out all zeroes and I am not sure why:

print((df['d'])[(df.groupby('a')['b'].value_counts())])

CodePudding user response：

You can pass a list of columns to groupby and then count the values of d within each (a, b) group:

print(df.groupby(['a', 'b'])['d'].value_counts())
# a     b  d
# john  1  1    2
#       2  1    1
#       3  0    1
# mary  1  0    1
#       2  0    1
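
As for why the original line printed all zeroes, here is a minimal sketch of what I believe is happening: indexing df['d'] with the value_counts result uses the counts themselves (2, 1, 1, 1, 1) as row labels, and rows 1 and 2 of df['d'] both happen to hold 0. The names counts, labels and looked_up are only for illustration, and the explicit .loc is an assumption about how pandas resolves the lookup in the question.

```python
import pandas as pd

d = {'a': ['john', 'mary', 'john', 'john', 'mary', 'john'], 'b': [1, 2, 3, 1, 1, 2],
     'c': [0.7, 0.3, 0.9, 0.4, 1.0, 0.2], 'd': [1, 0, 0, 1, 0, 1]}
df = pd.DataFrame(data=d)

# Counts of each b value within each a group
counts = df.groupby('a')['b'].value_counts()

# The counts are plain integers; used as an indexer they act as row labels
labels = counts.to_numpy()

# Rows 1 and 2 of df['d'] are both 0, so every lookup returns 0
looked_up = df['d'].loc[labels]
print(looked_up.tolist())
```

So the zeroes are not counts at all, just the values of d at rows 1 and 2 repeated.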

CodePudding user response：

Just call value_counts on the DataFrame with the three columns:

out = df.value_counts(['a','b','d'])
print(out)
# a     b  d
# john  1  1    2
#       2  1    1
#       3  0    1
# mary  1  0    1
#       2  0    1
# dtype: int64
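
A quick way to check that this gives the same counts as grouping by ['a', 'b'] first is sketched below. It assumes pandas 1.1 or later, where DataFrame.value_counts is available; the names by_group, by_frame and same are only for illustration. Both results are sorted by index before comparing, because the two methods may order the rows differently.

```python
import pandas as pd

d = {'a': ['john', 'mary', 'john', 'john', 'mary', 'john'], 'b': [1, 2, 3, 1, 1, 2],
     'c': [0.7, 0.3, 0.9, 0.4, 1.0, 0.2], 'd': [1, 0, 0, 1, 0, 1]}
df = pd.DataFrame(data=d)

# Counts of d within each (a, b) group
by_group = df.groupby(['a', 'b'])['d'].value_counts().sort_index()

# Counts of each distinct (a, b, d) row
by_frame = df.value_counts(['a', 'b', 'd']).sort_index()

# Compare both the index entries and the counts
same = (list(by_group.index) == list(by_frame.index)
        and by_group.tolist() == by_frame.tolist())
print(same)
```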