Considering a Saturday in an if condition-CodePudding

I have a problem. I am calculating when the next day will arrive. I want to take into account that if today's date is a Saturday, it should subtract -1 from lastindays, as no parcels arrive on Sundays. I have tried this, but I don't know how to specify two values in my Apply function: lastindays which should be manipulated and returned and week on which the if statement happens.

Dataframe

    customerId    fromDate otherInformation
0            1  2021-02-22              Cat
1            1  2021-02-20              Dog
2            1  2021-03-18          Elefant
3            1  2021-03-18              Cat
4            1  2021-03-18              Cat
5            1  2021-03-22              Cat
6            1  2021-02-10              Cat
7            1  2021-09-07              Cat
8            1        None          Elefant
9            1  2022-01-18             Fish
10           2  2021-05-17             Fish

Code

import pandas as pd


d = {'customerId': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2],
     'fromDate': ['2021-02-22','2021-02-20', '2021-03-18','2021-03-18', '2021-03-18', '2021-03-22', 
'2021-02-10', '2021-09-07', None, '2022-01-18', '2021-05-17'],
     'otherInformation': ['Cat', 'Dog', 'Elefant', 'Cat', 'Cat','Cat', 'Cat', 'Cat', 'Elefant', 'Fish', 'Fish']
    }
df = pd.DataFrame(data=d)
print(df)
df['fromDate'] = pd.to_datetime(df['fromDate'], errors='coerce')
#df['lastindays'] = df['fromDate'] - df.sort_values('fromDate').groupby('customerId')['fromDate'].shift()
#print(df)

df['fromDate'] = pd.to_datetime(df['fromDate'], errors='coerce')
df['lastindays'] = df['fromDate'] - df.sort_values('fromDate').drop_duplicates(['customerId','fromDate']).groupby('customerId')['fromDate'].shift()
df['lastindays'] = df.groupby(['customerId','fromDate'])['lastindays'].ffill()

df = df[df['lastindays'].notna()]
df['lastindays_int'] = df['lastindays'].dt.days.astype('int')
df['week'] = df['fromDate'].dt.day_of_week
df['lastindays'] = df['lastindays'].apply(lambda x: x-1 if x == 5 else x)
df

What I have


   customerId   fromDate otherInformation lastindays  lastindays_int  week
0           1 2021-02-22              Cat     2 days               2     0
1           1 2021-02-20              Dog    10 days              10     5
2           1 2021-03-18          Elefant    24 days              24     3
3           1 2021-03-18              Cat    24 days              24     3
4           1 2021-03-18              Cat    24 days              24     3
5           1 2021-03-22              Cat     4 days               4     0
7           1 2021-09-07              Cat   169 days             169     1
9           1 2022-01-18             Fish   133 days             133     1

What I want

   customerId   fromDate otherInformation lastindays  lastindays_int  week
0           1 2021-02-22              Cat     2 days               2     0
1           1 2021-02-20              Dog    10 days              10     4 # from 5 -> 4
2           1 2021-03-18          Elefant    24 days              24     3
3           1 2021-03-18              Cat    24 days              24     3
4           1 2021-03-18              Cat    24 days              24     3
5           1 2021-03-22              Cat     4 days               4     0
7           1 2021-09-07              Cat   169 days             169     1
9           1 2022-01-18             Fish   133 days             133     1

CodePudding user response：

You can use axis =1 to acess full row at once.

def date_check(cell):
    return pd.to_datetime(cell, format='%Y-%m-%d').strftime('%a') == 'Sat'


def claculate(row):
    check = date_check(row['fromDate'])
    if check:
        row['week']-1
    return row['week']


df['week'] = df['week'].astype(int)
df['week'] = df.apply(lambda row: claculate(row), axis=1)

CodePudding user response：

df['week'] = df['fromDate'].dt.dayofweek

def func_data(x):
    if x == 5:
        x -=1
    return x

df['week'] = df['week'].apply(func_data)

Output

   customerId   fromDate otherInformation lastindays  lastindays_int  week
0           1 2021-02-22              Cat     2 days               2     0
1           1 2021-02-20              Dog    10 days              10     4
2           1 2021-03-18          Elefant    24 days              24     3
3           1 2021-03-18              Cat    24 days              24     3
4           1 2021-03-18              Cat    24 days              24     3
5           1 2021-03-22              Cat     4 days               4     0
7           1 2021-09-07              Cat   169 days             169     1
9           1 2022-01-18             Fish   133 days             133     1

Or use your variant, but instead of 'dt.day_of_week' use 'dt.dayofweek '. And also in apply you need to take the column ' df['week']'.

df['week'] = df['fromDate'].dt.dayofweek
df['week'] = df['week'].apply(lambda x: x-1 if x == 5 else x)