I have a large dataframe df:
Col1 Col2 Col3 Val1 Val2
A1 B1 c1 -0.2 0
A1 B1 c2 -0.3 0.3
A1 B1 c3 -0.5 0.2
A2 B2 c1 -0.3 0.1
A2 B2 c2 0.7 -0.3
A3 B3 c1 -0.3 0.3
Desired outcome:
Col1 Col2 Col3 Val1 Val2 Pattern
A1 B1 c1 -0.2 0 Y
A1 B1 c2 -0.3 0.3 Y
A1 B1 c3 -0.5 0.2 Y
A2 B2 c1 -0.3 0.1 N
A2 B2 c2 0.7 -0.3 N
A3 B3 c1 -0.3 0.3 Y
Where for each group of Col1 and Col2, I want to flag pattern as Y
if each (Val1, Val2)
has sign pattern of (negative, greater than equal 0)
I am trying something of the following sort:
signs = np.sign(df[['Val1', 'Val2']])
m1 = signs.eq([-1, 1]).all(axis=1)
df['Pattern'] = m1.groupby([df['Col1'], df['Col2']])\
.transform('all').all(axis=1)\
.map({True:'Y', False: 'N'})
However, above code clearly will not get to what I need. Essentially m1
above is not properly framed.
CodePudding user response:
Use groupby np.where
df['Pattern'] = np.where(df.groupby('Col1').apply(lambda x:(x['Val1'].lt(0))&(x['Val2'].ge(0)).all()),'Y','N')
CodePudding user response:
Problem comes from numpy interpretation of sign.
With your own function but replacing numpy...
df["Pattern"]=((df.Val1<0) & (df.Val2>=0))\
.groupby([df['Col1'], df['Col2']])\
.transform('all')\
.map({True:"Y",False:"N"})
>>> df
Col1 Col2 Col3 Val1 Val2 Pattern
0 A1 B1 c1 -0.2 0.0 Y
1 A1 B1 c2 -0.3 0.3 Y
2 A1 B1 c3 -0.5 0.2 Y
3 A2 B2 c1 -0.3 0.1 N
4 A2 B2 c2 0.7 -0.3 N
5 A3 B3 c1 -0.3 0.3 Y