Home > other >  Pandas groupby finding sign pattern in multiple columns
Pandas groupby finding sign pattern in multiple columns

Time:05-19

I have a large dataframe df:

Col1    Col2    Col3    Val1    Val2
A1      B1      c1     -0.2     0
A1      B1      c2     -0.3     0.3
A1      B1      c3     -0.5     0.2
A2      B2      c1     -0.3     0.1
A2      B2      c2      0.7    -0.3
A3      B3      c1     -0.3     0.3

Desired outcome:

Col1    Col2    Col3    Val1    Val2  Pattern
A1      B1      c1     -0.2     0     Y
A1      B1      c2     -0.3     0.3   Y
A1      B1      c3     -0.5     0.2   Y
A2      B2      c1     -0.3     0.1   N
A2      B2      c2      0.7    -0.3   N
A3      B3      c1     -0.3     0.3   Y

Where for each group of Col1 and Col2, I want to flag pattern as Y if each (Val1, Val2) has sign pattern of (negative, greater than equal 0)

I am trying something of the following sort:

signs = np.sign(df[['Val1', 'Val2']])
m1 = signs.eq([-1, 1]).all(axis=1)
df['Pattern'] = m1.groupby([df['Col1'], df['Col2']])\
                  .transform('all').all(axis=1)\
                  .map({True:'Y', False: 'N'})

However, above code clearly will not get to what I need. Essentially m1 above is not properly framed.

CodePudding user response:

Use groupby np.where

df['Pattern'] = np.where(df.groupby('Col1').apply(lambda x:(x['Val1'].lt(0))&(x['Val2'].ge(0)).all()),'Y','N')

CodePudding user response:

Problem comes from numpy interpretation of sign.

With your own function but replacing numpy...

df["Pattern"]=((df.Val1<0) & (df.Val2>=0))\
    .groupby([df['Col1'], df['Col2']])\
    .transform('all')\
    .map({True:"Y",False:"N"})

>>> df
  Col1 Col2 Col3  Val1  Val2 Pattern
0   A1   B1   c1  -0.2   0.0       Y
1   A1   B1   c2  -0.3   0.3       Y
2   A1   B1   c3  -0.5   0.2       Y
3   A2   B2   c1  -0.3   0.1       N
4   A2   B2   c2   0.7  -0.3       N
5   A3   B3   c1  -0.3   0.3       Y
  • Related