Home > database >  How to create a dummy only if a column has non-zero values for certain dates but zero for other date
How to create a dummy only if a column has non-zero values for certain dates but zero for other date

Time:05-07

Let's say, I want to identify traders who only traded during bull runs but did not trade (zero values) during downturns or stable periods. Let's say we have two bull runs, 2018Q4, 2021Q4. Below, D starts trading only from 2021Q4 (the second bull run period) but I want to include this as =1 well.

This is my df.

id  date    value  other variables..  
A   2019Q4     2 
A   2020Q4     2   
A   2021Q4     3
B   2018Q4     2
B   2019Q4     0
B   2020Q4     0
B   2021Q4     4
C   2020Q4     3
C   2021Q4     4
D   2021Q4     4
E   2018Q4     3
E   2019Q4     0
E   2020Q4     0
E   2021Q4     2
.       .  

desired output would be

id  dummy
A     0
B     1
C     0
D     1
E     1
.       .  

CodePudding user response:

You can test if both values not equal 0 and test both quarters, compare (thanks mozway for improvement) and last aggregate GroupBy.all for test if all Trues per groups:

m1 = df['value'].ne(0)
m2 = df['date'].isin(['2018Q4','2021Q4'])

df1 = (m1 == m2).groupby(df['id']).all().astype(int).reset_index(name='dummy')
print (df1)
  id  dummy
0  A      0
1  B      1
2  C      0
3  D      1
4  E      1
  • Related