Home > Software engineering >  How to add positive and negative increments to every row based on a specific date?
How to add positive and negative increments to every row based on a specific date?

Time:10-10

I have a pandas df which has 2 columns such as Date, First_Date (constant).

I am trying to add a new column in which the value will be 0 where First_Date=Date. Then, all rows below that instance should increment in a negative way such as -1, -2, -3 etc.. and same should be true for rows above should increment in a positive way such as 1,2,3,4 etc. Please see attachment for concept visualization.

I am not sure if there is a pandas function to do this or if a function is better in this case. Any guidance would be great.

enter image description here

CodePudding user response:

>>> df = pd.DataFrame({'Date':pd.date_range('2020-01-01', '2020-01-18')})
>>> df
         Date
0  2020-01-01
1  2020-01-02
2  2020-01-03
3  2020-01-04
4  2020-01-05
5  2020-01-06
6  2020-01-07
7  2020-01-08
8  2020-01-09
9  2020-01-10
10 2020-01-11
11 2020-01-12
12 2020-01-13
13 2020-01-14
14 2020-01-15
15 2020-01-16
16 2020-01-17
17 2020-01-18

Checkout pandas Timestamps strings to DateTime etc. No reason to work with strings and ints.

>>> df['index'] = df - pd.Timestamp('2020-01-11')
>>> df
         Date    index
0  2020-01-01 -10 days
1  2020-01-02  -9 days
2  2020-01-03  -8 days
3  2020-01-04  -7 days
4  2020-01-05  -6 days
5  2020-01-06  -5 days
6  2020-01-07  -4 days
7  2020-01-08  -3 days
8  2020-01-09  -2 days
9  2020-01-10  -1 days
10 2020-01-11   0 days
11 2020-01-12   1 days
12 2020-01-13   2 days
13 2020-01-14   3 days
14 2020-01-15   4 days
15 2020-01-16   5 days
16 2020-01-17   6 days
17 2020-01-18   7 days

You can get your desired ints afterwards with:

>>> df['index'].transform(lambda x: x.days)
0    -10
1     -9
2     -8
3     -7
4     -6
5     -5
6     -4
7     -3
8     -2
9     -1
10     0
11     1
12     2
13     3
14     4
15     5
16     6
17     7

EDIT

To answer more specifically since you have string dates you have to do the following first

df[['Date', 'First_Date']] = df[['Date', 'First_Date']].astype('datetime64[ns]')

then you can subtract the columns and get your result:

df['index'] = df['Date'] - df['First_Date']

CodePudding user response:

import datetime

dates=['2020-01-01','2020-01-02',...] #list of strings
constant='2020-01-11'
const=datetime.datetime.strptime(constant,'%Y-%m-%d')
lista=[(x,constant,(datetime.datetime.strptime(x,'%Y-%m-%d')-const).days) 
for x in dates] #lista is a list of tuples [(date,constant,days)..]

hope it works!

  • Related