I have a pandas df which has 2 columns such as Date, First_Date (constant)
.
I am trying to add a new column in which the value will be 0 where First_Date=Date. Then, all rows below that instance should increment in a negative way such as -1, -2, -3 etc.. and same should be true for rows above should increment in a positive way such as 1,2,3,4 etc. Please see attachment for concept visualization.
I am not sure if there is a pandas function to do this or if a function is better in this case. Any guidance would be great.
CodePudding user response:
>>> df = pd.DataFrame({'Date':pd.date_range('2020-01-01', '2020-01-18')})
>>> df
Date
0 2020-01-01
1 2020-01-02
2 2020-01-03
3 2020-01-04
4 2020-01-05
5 2020-01-06
6 2020-01-07
7 2020-01-08
8 2020-01-09
9 2020-01-10
10 2020-01-11
11 2020-01-12
12 2020-01-13
13 2020-01-14
14 2020-01-15
15 2020-01-16
16 2020-01-17
17 2020-01-18
Checkout pandas Timestamps strings to DateTime etc. No reason to work with strings and ints.
>>> df['index'] = df - pd.Timestamp('2020-01-11')
>>> df
Date index
0 2020-01-01 -10 days
1 2020-01-02 -9 days
2 2020-01-03 -8 days
3 2020-01-04 -7 days
4 2020-01-05 -6 days
5 2020-01-06 -5 days
6 2020-01-07 -4 days
7 2020-01-08 -3 days
8 2020-01-09 -2 days
9 2020-01-10 -1 days
10 2020-01-11 0 days
11 2020-01-12 1 days
12 2020-01-13 2 days
13 2020-01-14 3 days
14 2020-01-15 4 days
15 2020-01-16 5 days
16 2020-01-17 6 days
17 2020-01-18 7 days
You can get your desired ints afterwards with:
>>> df['index'].transform(lambda x: x.days)
0 -10
1 -9
2 -8
3 -7
4 -6
5 -5
6 -4
7 -3
8 -2
9 -1
10 0
11 1
12 2
13 3
14 4
15 5
16 6
17 7
EDIT
To answer more specifically since you have string dates you have to do the following first
df[['Date', 'First_Date']] = df[['Date', 'First_Date']].astype('datetime64[ns]')
then you can subtract the columns and get your result:
df['index'] = df['Date'] - df['First_Date']
CodePudding user response:
import datetime
dates=['2020-01-01','2020-01-02',...] #list of strings
constant='2020-01-11'
const=datetime.datetime.strptime(constant,'%Y-%m-%d')
lista=[(x,constant,(datetime.datetime.strptime(x,'%Y-%m-%d')-const).days)
for x in dates] #lista is a list of tuples [(date,constant,days)..]
hope it works!