Home > Software engineering >  Casting a value based on a trigger in pandas
Casting a value based on a trigger in pandas

Time:09-15

I would like to create a new column every time I get 1 in the 'Signal' column that will cast the corresponding value from the 'Value' column (please see the expected output below).

Initial data:

Index Value Signal
0 3 0
1 8 0
2 8 0
3 7 1
4 9 0
5 10 0
6 14 1
7 10 0
8 10 0
9 4 1
10 10 0
11 10 0

Expected Output:

Index Value Signal New_Col_1 New_Col_2 New_Col_3
0 3 0 0 0 0
1 8 0 0 0 0
2 8 0 0 0 0
3 7 1 7 0 0
4 9 0 7 0 0
5 10 0 7 0 0
6 14 1 7 14 0
7 10 0 7 14 0
8 10 0 7 14 0
9 4 1 7 14 4
10 10 0 7 14 4
11 10 0 7 14 4

What would be a way to do it?

CodePudding user response:

You can use a pivot:

out = df.join(df
   # keep only the values where signal is 1
   # and get Signal's cumsum
   .assign(val=df['Value'].where(df['Signal'].eq(1)),
           col=df['Signal'].cumsum()
           )
   # pivot cumsumed Signal to columns
   .pivot(index='Index', columns='col', values='val')
   # ensure column 0 is absent (using loc to avoid KeyError)
   .loc[:, 1:]
   # forward fill the values
   .ffill()
   # rename columns
   .add_prefix('New_Col_')
)

output:

    Index  Value  Signal  New_Col_1  New_Col_2  New_Col_3
0       0      3       0        NaN        NaN        NaN
1       1      8       0        NaN        NaN        NaN
2       2      8       0        NaN        NaN        NaN
3       3      7       1        7.0        NaN        NaN
4       4      9       0        7.0        NaN        NaN
5       5     10       0        7.0        NaN        NaN
6       6     14       1        7.0       14.0        NaN
7       7     10       0        7.0       14.0        NaN
8       8     10       0        7.0       14.0        NaN
9       9      4       1        7.0       14.0        4.0
10     10     10       0        7.0       14.0        4.0
11     11     10       0        7.0       14.0        4.0

CodePudding user response:

#create new column by incrementing the rows that has signal
df['new_col']='new_col_' df['Signal'].cumsum().astype(str)

#rows having no signal, make them null
df['new_col'] = df['new_col'].mask(df['Signal']==0, '0')

#pivot table
df2=(df.pivot(index=['Index','Signal', 'Value'], columns='new_col', values='Value')
 .reset_index()
 .ffill().fillna(0)  #forward fill and fillna with 0
 .drop(columns=['0','Index'] ) #drop the extra columns
 .rename_axis(columns={'new_col':'Index'}) # rename the axis
 .astype(int)) # changes values to int, removing decimals

df2


Index   Signal  Value   new_col_1   new_col_2   new_col_3
0         0        3       0       0              0
1         0        8       0       0              0
2         0        8       0       0              0
3         1        7       7       0              0
4         0        9       7       0              0
5         0        10      7       0              0
6         1        14      7       14             0
7         0        10      7       14             0
8         0        10      7       14             0
9         1        4       7       14             4  
10        0        10      7       14             4
11        0        10      7       14             4
  • Related