Assign row value as new column based on condition - python

Time:07-28

I have a DataFrame holding a time series and want to add two new columns. Where Color equals Red, I look at a 3-row window in Val (that row and the next two) and assign the lowest value to a new column, Low_Val.

This works fine, but I also want to include the timestamp of that lowest value as Low_Time.

import numpy as np
import pandas as pd

d = ({
    'Date' : ['2022-07-25 01:46:00+01:00','2022-07-25 01:47:00+01:00','2022-07-25 01:48:00+01:00','2022-07-25 01:49:00+01:00','2022-07-25 01:50:00+01:00','2022-07-25 01:51:00+01:00','2022-07-25 01:52:00+01:00','2022-07-25 01:53:00+01:00'],
    'Val' : [5,1,4,-2,8,4,3,1],   
    'Color' : ['Red', np.nan, np.nan, np.nan, 'Red', np.nan, 'Red', np.nan],
    })

df = pd.DataFrame(data = d)

mask = (df['Color'] == 'Red')

df['Low_Val'] = np.where(mask, (df['Val'][::-1].rolling(3, min_periods = 1).min()[::-1]), np.nan)

df['Low_Time'] = df[df['Low_Val'] == df['Val']]['Date']

Intended output:

                        Date  Val Color  Low_Val                  Low_Time
0  2022-07-25 01:46:00+01:00    5   Red      1.0 2022-07-25 01:47:00+01:00
1  2022-07-25 01:47:00+01:00    1   NaN      NaN                       NaN
2  2022-07-25 01:48:00+01:00    4   NaN      NaN                       NaN
3  2022-07-25 01:49:00+01:00   -2   NaN      NaN                       NaN
4  2022-07-25 01:50:00+01:00    8   Red      3.0 2022-07-25 01:52:00+01:00
5  2022-07-25 01:51:00+01:00    4   NaN      NaN                       NaN
6  2022-07-25 01:52:00+01:00    3   Red      1.0 2022-07-25 01:53:00+01:00
7  2022-07-25 01:53:00+01:00    1   NaN      NaN                       NaN

CodePudding user response:

Convert Date to datetime

# Keep only the 'YYYY-MM-DD HH:MM:SS' part (first 19 characters), dropping the UTC offset
df["Date"] = pd.to_datetime(df["Date"].str[:19])
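
As an alternative to trimming the string, pandas can parse the offset and drop it afterwards. A minimal sketch, assuming the Date strings carry the offset written as `+01:00`; the names `s`, `aware`, and `naive` are illustrative only:

```python
import pandas as pd

# Two sample strings in the assumed "+01:00" form
s = pd.Series(["2022-07-25 01:46:00+01:00", "2022-07-25 01:47:00+01:00"])

aware = pd.to_datetime(s)            # timezone-aware: the +01:00 offset is kept
naive = aware.dt.tz_localize(None)   # same wall-clock times with the offset removed

print(aware.dtype)
print(naive.dtype)
```

Keeping the column timezone-aware is usually the safer choice if the data may later be compared with timestamps from other zones.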

Find the index labels where Color is Red

index_list = df[df.Color == "Red"].index

Derive Low_Val and Low_Time

for index in index_list:
    start = index
    end = index + 3  # window: the Red row plus the next two rows
    sliced_df = df[start:end]
    low_val = sliced_df["Val"].min()
    # May need to refine the logic if there are duplicates in the sub dataframe;
    # as written, the first matching timestamp in the window is used
    low_time = sliced_df.loc[sliced_df.Val == low_val, "Date"].tolist()[0]
    df.loc[index, "Low_Val"] = low_val
    df.loc[index, "Low_Time"] = low_time
    

Result:

                 Date  Val Color  Low_Val            Low_Time
0 2022-07-25 01:46:00    5   Red      1.0 2022-07-25 01:47:00
1 2022-07-25 01:47:00    1   NaN      NaN                 NaT
2 2022-07-25 01:48:00    4   NaN      NaN                 NaT
3 2022-07-25 01:49:00   -2   NaN      NaN                 NaT
4 2022-07-25 01:50:00    8   Red      3.0 2022-07-25 01:52:00
5 2022-07-25 01:51:00    4   NaN      NaN                 NaT
6 2022-07-25 01:52:00    3   Red      1.0 2022-07-25 01:53:00
7 2022-07-25 01:53:00    1   NaN      NaN                 NaT
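
The duplicate question raised in the loop's comment can be settled by working with positions instead of values. The following is a minimal sketch, not a definitive implementation: it assumes a 3-row window (the current row plus the next two) and rebuilds the sample frame with `pd.date_range` so it runs on its own. `np.argmin` returns the first position on ties, so the earliest timestamp wins. The names `window` and `low_pos` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Date": pd.date_range("2022-07-25 01:46", periods=8, freq="min"),
    "Val": [5, 1, 4, -2, 8, 4, 3, 1],
    "Color": ["Red", np.nan, np.nan, np.nan, "Red", np.nan, "Red", np.nan],
})

window = 3  # assumed: the current row plus the next two rows

vals = df["Val"].to_numpy()
# Position of the lowest Val in each forward-looking window (first one on ties);
# slices near the end of the frame are simply shorter
low_pos = np.array([i + np.argmin(vals[i:i + window]) for i in range(len(vals))])

mask = df["Color"].eq("Red")
# Keep the result only on Red rows; other rows become NaN / NaT
df["Low_Val"] = pd.Series(vals[low_pos], index=df.index).where(mask)
df["Low_Time"] = pd.Series(df["Date"].to_numpy()[low_pos], index=df.index).where(mask)

print(df)
```

Because the value and the timestamp are both looked up from the same position, they cannot disagree even when the lowest value is repeated inside a window.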

CodePudding user response:

IIUC, this should work for you:

d = ({
    'Date' : ['2022-07-25 01:46:00+01:00','2022-07-25 01:47:00+01:00','2022-07-25 01:48:00+01:00','2022-07-25 01:49:00+01:00','2022-07-25 01:50:00+01:00','2022-07-25 01:51:00+01:00','2022-07-25 01:52:00+01:00','2022-07-25 01:53:00+01:00'],
    'Val' : [5,1,4,-2,8,4,3,1],
    'Color' : ['Red', np.nan, np.nan, np.nan, 'Red', np.nan, 'Red', np.nan],
    })
df = pd.DataFrame(data = d)
mask = (df['Color'] == 'Red')
df['Low_Val'] = np.where(mask, (df['Val'][::-1].rolling(3, min_periods = 1).min()[::-1]), np.nan)

df.Date = pd.to_datetime(df.Date)
min_dates = []
# Helper columns D1..D3 hold the timestamps of the next one, two and three rows
shifted = df.assign(D1=df.Date.shift(-1), D2=df.Date.shift(-2), D3=df.Date.shift(-3))
for _, row in shifted.iterrows():
    if row.Color == 'Red':
        min_dates.append(min([row.D1, row.D2, row.D3]))
    else:
        min_dates.append(np.nan)
df['min_dates'] = min_dates
print(df)

Output:

                       Date  Val Color  Low_Val                 min_dates
0 2022-07-25 01:46:00+01:00    5   Red      1.0 2022-07-25 01:47:00+01:00
1 2022-07-25 01:47:00+01:00    1   NaN      NaN                       NaT
2 2022-07-25 01:48:00+01:00    4   NaN      NaN                       NaT
3 2022-07-25 01:49:00+01:00   -2   NaN      NaN                       NaT
4 2022-07-25 01:50:00+01:00    8   Red      3.0 2022-07-25 01:51:00+01:00
5 2022-07-25 01:51:00+01:00    4   NaN      NaN                       NaT
6 2022-07-25 01:52:00+01:00    3   Red      1.0 2022-07-25 01:53:00+01:00
7 2022-07-25 01:53:00+01:00    1   NaN      NaN                       NaT
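
Note that `min([row.D1, row.D2, row.D3])` picks the earliest of the next three timestamps, which is why row 4 shows 01:51 where the question's intended output has 01:52. Below is a minimal sketch of one way to return the timestamp of the lowest Val instead. It assumes the window is the Red row plus the next two rows (matching the `rolling(3)` call) and that the offset is written as `+01:00`; `red_pos` and `low_idx` are illustrative names.

```python
import numpy as np
import pandas as pd

# Assumed input format: timezone offset written as "+01:00"
dates = [f"2022-07-25 01:{m}:00+01:00" for m in range(46, 54)]
df = pd.DataFrame({
    "Date": pd.to_datetime(dates),
    "Val": [5, 1, 4, -2, 8, 4, 3, 1],
    "Color": ["Red", np.nan, np.nan, np.nan, "Red", np.nan, "Red", np.nan],
})

# Integer positions of the Red rows
red_pos = np.flatnonzero(df["Color"].eq("Red").to_numpy())

# Index label of the lowest Val in each window (Red row plus the next two rows)
low_idx = [df["Val"].iloc[p:p + 3].idxmin() for p in red_pos]

# Look up the matching timestamps, then relabel them with the Red rows' index
low_time = df["Date"].loc[low_idx]
low_time.index = df.index[red_pos]

# Assignment aligns on the index, so non-Red rows become NaT
df["Low_Time"] = low_time

print(df)
```

Relabeling the looked-up Series and letting pandas align on the index avoids writing into the frame row by row.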