If then statements with dates in Python-CodePudding

I would like to create an if statement where, if the date equals the start of any month (2021-01-01, 2021-02-01, 2021-03-01 etc.) then I would like to replace the value in column X to 0. Is this possible or is there a better way to go about this.

This is the dataset that I am working with:

df = [('2021-04-25', 4, 4),
           ('2021-04-26', 0, 0),
           ('2021-04-27', 0, 0),
           ('2021-04-28', 56, 53),
           ('2021-04-29', 3, 0),
           ('2021-04-30', 4, 0),
           ('2021-05-01', 5, 0),
           ('2021-05-02', 5, 0),
           ('2021-05-03', 10, 5),
           ('2021-05-04', 5, 0)]
df = pd.DataFrame(stack_out, columns=['Date', 'X', 'Y'])

desired output:

desired = [('2021-04-25', 4, 4),
           ('2021-04-26', 0, 0),
           ('2021-04-27', 0, 0),
           ('2021-04-28', 56, 53),
           ('2021-04-29', 3, 0),
           ('2021-04-30', 4, 0),
           ('2021-05-01', 0, 0),
           ('2021-05-02', 5, 0),
           ('2021-05-03', 10, 5),
           ('2021-05-04', 5, 0)]
desired = pd.DataFrame(stack_out, columns=['Date', 'X', 'Y'])

CodePudding user response：

Here's a simple one-liner which will do the job:

df.loc[pd.to_datetime(df['Date']).dt.day == 1, 'X'] = 0

Essentially, a filter is applied to the X column to subset to values where the day is equal to 1. And change these values to zero.

Output:

         Date   X   Y
0  2021-04-25   4   4
1  2021-04-26   0   0
2  2021-04-27   0   0
3  2021-04-28  56  53
4  2021-04-29   3   0
5  2021-04-30   4   0
6  2021-05-01   0   0  <-- Change applied here.
7  2021-05-02   5   0
8  2021-05-03  10   5
9  2021-05-04   5   0

CodePudding user response：

I recommend you to use the iterrows method, which allows you to iterate through the rows of a given dataframe. Let's call your dataframe by df. Assuming the type of your "Date" column is Timestamp. The following works

new_Column_X = list()
for index,row in df.iterrows():
    ts = row['Date']
    if ts.day==1:
        new_Column_X.append(0)
    else:
        new_Column_X.append(row['X'])

# If wanted you can just replace the old column with the new one
df['New_Column_X'] = new_Column_X

Second Solution: As @S3DEV mentioned you can solve this problem using the apply method (which is probably faster than the first solution).

from pandas import Timestamp
def column_X_parser(ts, x):
    if ts.day==1:
        return 0
    else:
        return x

df['New_Column_X'] = df.apply(lambda row: column_X_parser(Timestamp(row['Date']), row['X']),axis=1)