Home > Back-end >  Getting column with days from first observed - trading data
Getting column with days from first observed - trading data

Time:08-03

So i have some Trading data, and ultimately i would like to find 1 week return 1 month, 3 month and so on. To do this i think it is easiest if i can get a column with days existed for each asset.

Data is simplified to this:

Id Date Price
101 2014-09-14 0.7
101 2014-09-15 0.8
101 2014-09-16 0.9
101 2014-09-17 0.8
103 2014-10-03 1,6
103 2014-10-04 1.7
103 2014-10-05 1.7
103 2014-10-07 1.9
107 2015-10-17 24
107 2015-10-18 21
107 2015-10-23 22

I would then like an new column named "Days listed"

Id Date Price Days listed
101 2014-09-14 0.7 1
101 2014-09-15 0.8 2
101 2014-09-16 0.9 3
101 2014-09-17 0.8 4
103 2014-10-03 1,6 1
103 2014-10-04 1.7 2
103 2014-10-05 1.7 3
103 2014-10-07 1.9 5
107 2015-10-17 24 1
107 2015-10-18 21 2
107 2015-10-23 22 7

i think it needs to be grouped by id, but other than that i am out of ideas. Hope someone can help me Best regards the confused student

CodePudding user response:

use pd.groupby to get a earliest date for each group, which is then subtracted by the dates to get the days-listed

df['Date']=pd.to_datetime(df['Date'])
df['DaysListed']=(df['Date'] - df.groupby('Id')['Date'].transform('min')).dt.days 1
df
    Id  Date       Price    DaysListed
0   101 2014-09-14  0.7     1
1   101 2014-09-15  0.8     2
2   101 2014-09-16  0.9     3
3   101 2014-09-17  0.8     4
4   103 2014-10-03  1.6     1
5   103 2014-10-04  1.7     2
6   103 2014-10-05  1.7     3
7   103 2014-10-07  1.9     5
8   107 2015-10-17  24      1
9   107 2015-10-18  21      2
10  107 2015-10-23  22      7

CodePudding user response:

import pandas as pd
from io import StringIO

data = StringIO("""Id;Date;Price
101;2014-09-14;0.7
101;2014-09-15;0.8
101;2014-09-16;0.9
101;2014-09-17;0.8
103;2014-10-03;1.6
103;2014-10-04;1.7
103;2014-10-05;1.7
103;2014-10-07;1.9
107;2015-10-17;24
107;2015-10-18;21
107;2015-10-23;22
""")

df = pd.read_csv(data, sep=';')
df['Date'] = pd.to_datetime(df['Date'])
df['Price'] = df['Price'].astype(float)

df = df.assign(Days_list = df.groupby('Id').cumcount()   1)

Output:

     Id        Date  Price  Days_list
0   101  2014-09-14    0.7          1
1   101  2014-09-15    0.8          2
2   101  2014-09-16    0.9          3
3   101  2014-09-17    0.8          4
4   103  2014-10-03    1.6          1
5   103  2014-10-04    1.7          2
6   103  2014-10-05    1.7          3
7   103  2014-10-07    1.9          4
8   107  2015-10-17   24.0          1
9   107  2015-10-18   21.0          2
10  107  2015-10-23   22.0          3
  • Related