If a condition is met then add to the previous row elseif subtract from the previous row in python-CodePudding

I'm working on a problem where if certain conditions are met then I need to add or subtract from the previous row. So I have the following df:

data = {'sample_val':[5.5, 6.2 , 4.0, 7.8, 3.6], 'sample_lab':['nor','high','nor','high','low']}
df=pd.DataFrame(data)

I want to add an additional column 'adjustment', that is filled by the following rules and has a starting value of 10:

if sample_val < 4 then - 6 from the previous adjustment entry
if 4<= sample_val <= 5 then 0 to the previous adjustment entry
if sample_val>5 then add 4 to the previous adjustment entry

I've tried iterrows( ), but I can't get it to add or subtract. I also tried np.where, but can't get it to iterate.

CodePudding user response：

If I understood correctly, you can use np.select to know which coefficients you must add or subtract, and then accumulate the result to the initial value of 10 using panda's cumsum:

import numpy as np

x = df.sample_val.to_numpy()
adj = np.select([x<4, (x>=4)&(x<=5), x>5], [-6,0,4], x)
df['adjustment'] = 0
df.loc[0, 'adjustment'] = 10
df['adjustment'] = df.adjustment.add(adj).cumsum()

print(df)
   sample_val sample_lab  adjustment
0         5.5        nor        14.0
1         6.2       high        18.0
2         4.0        nor        18.0
3         7.8       high        22.0
4         3.6        low        16.0

CodePudding user response：

Use pandas.cut to assign numbers to bins and cumsum 10 for the cumulated sum starting at 10:

df['adjustment'] = (pd.cut(df['sample_val'],
                           bins=[-np.inf, 4, 5, np.inf],
                           labels=[-6, 0, 4])
                      .astype(int)
                      .cumsum()
                   )   10

output:

   sample_val sample_lab  adjustment
0         5.5        nor          14
1         6.2       high          18
2         4.0        nor          12
3         7.8       high          16
4         3.6        low          10

CodePudding user response：

There are other approaches that provide concise solutions however as you stated that you could not get this to work with itterows, this solution shows you how it can be done. This approach is also useful if you need to add additional logic or perform other calculations.

You can use Pandas iterrows and loc functions. See the Pandas documentation for iterrows and loc to learn more.

The following assumes you are starting with a value of 10. You are then applying the rules before setting the value of the adjustment column.

# add the additional column
data_df['adjustment'] = 0

# set the initial adjustment
adjustment = 10

# adjust the values of the column
for index, row in data_df.iterrows():
    sample_val = float(row['sample_val'])
    
    if sample_val < 4:
      adjustment -= 6
    elif sample_val > 5:
      adjustment  = 4
    
    # set the column value
    data_df.loc[index, 'adjustment'] = adjustment
  
    
print(data_df)

This would produce the following output for your example:

   sample_val sample_lab  adjustment
0         5.5        nor          14
1         6.2       high          18
2         4.0        nor          18
3         7.8       high          22
4         3.6        low          16

Alternatively, if you require the first row to have a value of 10 in the adjustment column then you will need to adjust the code as follows:

# add the additional column
data_df['adjustment'] = 0

# set the initial adjustment
adjustment = 10

# adjust the values of the column
for index, row in data_df.iterrows():
    sample_val = float(row['sample_val'])
    
    if index == 0:
      pass
    elif sample_val < 4:
      adjustment -= 6
    elif sample_val > 5:
      adjustment  = 4
    
    # set the column value
    data_df.loc[index, 'adjustment'] = adjustment
  
    
print(data_df)

This would produce the following result:

   sample_val sample_lab  adjustment
0         5.5        nor          10
1         6.2       high          14
2         4.0        nor          14
3         7.8       high          18
4         3.6        low          12