Home > Back-end >  How to duplicate rows in a Pandas dataframe so that each day between 2 timestamps (from 2 columns) h
How to duplicate rows in a Pandas dataframe so that each day between 2 timestamps (from 2 columns) h

Time:10-07

I have a 80k-rows dataframeand here are 2 rows as example

 Starttimeofvalidity Endtimeofvalidity    PointName  tariff
0          2017-10-01        2017-11-01        SUDAL   4.830
1          2017-11-01        2017-12-01  Baumgarten    2.768

Is there an easy way to make each date between the 2 timestamps having its own row while replicate the numbers? (the number divided by the number of the days between the 2 timestamps works too)

I wish to get a dataframe like this

         Date    PointName  tariff
0  2017-10-01        SUDAL   4.830
1  2017-10-02        SUDAL   4.830
2  2017-10-03        SUDAL   4.830
3  2017-10-04        SUDAL   4.830
4  2017-10-05        SUDAL   4.830
5  2017-10-06        SUDAL   4.830
6  2017-10-07        SUDAL   4.830
7  2017-10-08        SUDAL   4.830
8  2017-10-09        SUDAL   4.830
9  2017-10-10        SUDAL   4.830
10 2017-10-11        SUDAL   4.830
11 2017-10-12        SUDAL   4.830
12 2017-10-13        SUDAL   4.830
13 2017-10-14        SUDAL   4.830
14 2017-10-15        SUDAL   4.830
15 2017-10-16        SUDAL   4.830
16 2017-10-17        SUDAL   4.830
17 2017-10-18        SUDAL   4.830
18 2017-10-19        SUDAL   4.830
19 2017-10-20        SUDAL   4.830
20 2017-10-21        SUDAL   4.830
21 2017-10-22        SUDAL   4.830
22 2017-10-23        SUDAL   4.830
23 2017-10-24        SUDAL   4.830
24 2017-10-25        SUDAL   4.830
25 2017-10-26        SUDAL   4.830
26 2017-10-27        SUDAL   4.830
27 2017-10-28        SUDAL   4.830
28 2017-10-29        SUDAL   4.830
29 2017-10-30        SUDAL   4.830
30 2017-10-31        SUDAL   4.830
31 2017-11-01        SUDAL   4.830
32 2017-11-01  Baumgarten    2.768
33 2017-11-02  Baumgarten    2.768
34 2017-11-03  Baumgarten    2.768
35 2017-11-04  Baumgarten    2.768
36 2017-11-05  Baumgarten    2.768
37 2017-11-06  Baumgarten    2.768
38 2017-11-07  Baumgarten    2.768
39 2017-11-08  Baumgarten    2.768
40 2017-11-09  Baumgarten    2.768
41 2017-11-10  Baumgarten    2.768
42 2017-11-11  Baumgarten    2.768
43 2017-11-12  Baumgarten    2.768
44 2017-11-13  Baumgarten    2.768
45 2017-11-14  Baumgarten    2.768
46 2017-11-15  Baumgarten    2.768
47 2017-11-16  Baumgarten    2.768
48 2017-11-17  Baumgarten    2.768
49 2017-11-18  Baumgarten    2.768
50 2017-11-19  Baumgarten    2.768
51 2017-11-20  Baumgarten    2.768
52 2017-11-21  Baumgarten    2.768
53 2017-11-22  Baumgarten    2.768
54 2017-11-23  Baumgarten    2.768
55 2017-11-24  Baumgarten    2.768
56 2017-11-25  Baumgarten    2.768
57 2017-11-26  Baumgarten    2.768
58 2017-11-27  Baumgarten    2.768
59 2017-11-28  Baumgarten    2.768
60 2017-11-29  Baumgarten    2.768
61 2017-11-30  Baumgarten    2.768
62 2017-12-01  Baumgarten    2.768

CodePudding user response:

You can create a date range for each row corresponding to the dates between Starttimeofvalidity and Endtimeofvalidity by pd.date_ranage(). Then, explode on the list of dates in date ranges by .explode() to expand the lists into one row for each date, as follows:

Other steps are the cosmetic works to drop unwanted columns, rearrange the column order and reset the row index.

df['Date'] = df[['Starttimeofvalidity', 'Endtimeofvalidity']].apply(lambda x: pd.date_range(x['Starttimeofvalidity'], x['Endtimeofvalidity']), axis=1)

df_out = df.explode('Date').drop(['Starttimeofvalidity', 'Endtimeofvalidity'], axis=1)

df_out = df_out[['Date', 'PointName', 'tariff']].reset_index(drop=True)

Result:

print(df_out)

         Date   PointName  tariff
0  2017-10-01       SUDAL   4.830
1  2017-10-02       SUDAL   4.830
2  2017-10-03       SUDAL   4.830
3  2017-10-04       SUDAL   4.830
4  2017-10-05       SUDAL   4.830
5  2017-10-06       SUDAL   4.830
6  2017-10-07       SUDAL   4.830
7  2017-10-08       SUDAL   4.830
8  2017-10-09       SUDAL   4.830
9  2017-10-10       SUDAL   4.830
10 2017-10-11       SUDAL   4.830
11 2017-10-12       SUDAL   4.830
12 2017-10-13       SUDAL   4.830
13 2017-10-14       SUDAL   4.830
14 2017-10-15       SUDAL   4.830
15 2017-10-16       SUDAL   4.830
16 2017-10-17       SUDAL   4.830
17 2017-10-18       SUDAL   4.830
18 2017-10-19       SUDAL   4.830
19 2017-10-20       SUDAL   4.830
20 2017-10-21       SUDAL   4.830
21 2017-10-22       SUDAL   4.830
22 2017-10-23       SUDAL   4.830
23 2017-10-24       SUDAL   4.830
24 2017-10-25       SUDAL   4.830
25 2017-10-26       SUDAL   4.830
26 2017-10-27       SUDAL   4.830
27 2017-10-28       SUDAL   4.830
28 2017-10-29       SUDAL   4.830
29 2017-10-30       SUDAL   4.830
30 2017-10-31       SUDAL   4.830
31 2017-11-01       SUDAL   4.830
32 2017-11-01  Baumgarten   2.768
33 2017-11-02  Baumgarten   2.768
34 2017-11-03  Baumgarten   2.768
35 2017-11-04  Baumgarten   2.768
36 2017-11-05  Baumgarten   2.768
37 2017-11-06  Baumgarten   2.768
38 2017-11-07  Baumgarten   2.768
39 2017-11-08  Baumgarten   2.768
40 2017-11-09  Baumgarten   2.768
41 2017-11-10  Baumgarten   2.768
42 2017-11-11  Baumgarten   2.768
43 2017-11-12  Baumgarten   2.768
44 2017-11-13  Baumgarten   2.768
45 2017-11-14  Baumgarten   2.768
46 2017-11-15  Baumgarten   2.768
47 2017-11-16  Baumgarten   2.768
48 2017-11-17  Baumgarten   2.768
49 2017-11-18  Baumgarten   2.768
50 2017-11-19  Baumgarten   2.768
51 2017-11-20  Baumgarten   2.768
52 2017-11-21  Baumgarten   2.768
53 2017-11-22  Baumgarten   2.768
54 2017-11-23  Baumgarten   2.768
55 2017-11-24  Baumgarten   2.768
56 2017-11-25  Baumgarten   2.768
57 2017-11-26  Baumgarten   2.768
58 2017-11-27  Baumgarten   2.768
59 2017-11-28  Baumgarten   2.768
60 2017-11-29  Baumgarten   2.768
61 2017-11-30  Baumgarten   2.768
62 2017-12-01  Baumgarten   2.768
  • Related