How to count values in Python when two of the columns are greater than 1?-CodePudding

I have a data frame that looks something like this:

ID	Month	P1	P2	...	Sales
1	Jan	2	0	...	2
1	Feb	3	1	...	0
1	Mar	3	0	...	0
2	Jan	19	1	...	20
2	Feb	8	0	...	0
2	Mar	12	0	...	1

I want to create another dataframe that will count monthly IDs where Sales and other column (P1,P2, ...Pn) both are greater than 0

For eg. In the above dataframe in Jan there were 2 Ids 1 and 2 which had P1>0 and Sales >0 In Feb there are 0 Ids which had P1>0 & Sales>0 (P1 is greater than 0 for both IDs but Sales is 0) In Mar there is 1 Id 2 which had P1>0 & Sales>0

I want to see result something like this:

Month	P1	P2	...
Jan	2	1	...
Feb	0	0	...
March	1	0	...

I was thinking of creating a dataframe for P1,P2,... Pn using a for loop that counts a number of instances. But this will be a time-consuming process.

Can you please suggest me an alternative solution?

CodePudding user response：

Use a custom aggregation with groupby.agg:

out = (df.filter(like='P').where(df['Sales'].gt(0)).gt(0)
         .groupby(df['Month'], sort=False).sum()
         .reset_index()
       )

Or from an explicit list of columns:

cols = ['P1', 'P2']
out = (df[cols].where(df['Sales'].gt(0)).gt(0)
         .groupby(df['Month'], sort=False).sum()
         .reset_index()
       )

Output:

  Month  P1  P2
0   Jan   2   1
1   Feb   0   0
2   Mar   1   0

CodePudding user response：

def fct(d):
    return d.groupby("ID")[["P1", "P2"]].agg(lambda x: ((x > 0) & (d["Sales"] > 0)).any()).apply(sum)

result = d.groupby("Month").apply(fct)

CodePudding user response：

Another option is to specify/denote the conditions explicitly:

pd.DataFrame({'Month': df.Month, 'P1': df.P1 * df.Sales > 0, 'P2': df.P2 * df.Sales > 0})\
 .groupby('Month', as_index=False, sort=False).sum()

  Month  P1  P2
0  Jan    2   1
1  Feb    0   0
2  Mar    1   0

ID	Month	P1	P2	...	Sales
1	Jan	2	0	...	2
1	Feb	3	1	...	0
1	Mar	3	0	...	0
2	Jan	19	1	...	20
2	Feb	8	0	...	0
2	Mar	12	0	...	1

ID	Month	P1	P2	...	Sales
1	Jan	2	0	...	2
1	Feb	3	1	...	0
1	Mar	3	0	...	0
2	Jan	19	1	...	20
2	Feb	8	0	...	0
2	Mar	12	0	...	1

ID	Month	P1	P2	...	Sales
1	Jan	2	0	...	2
1	Feb	3	1	...	0
1	Mar	3	0	...	0
2	Jan	19	1	...	20
2	Feb	8	0	...	0
2	Mar	12	0	...	1