I have a dataframe looking like this:
Weekday Day_in_Month Starting_hour Ending_hour Power
3 1 1 3 35
3 1 3 7 15
4 2 22 2 5
.
.
.
I want to duplicate every column until the Starting_hour matches the Ending_hour.
-> All values of the row should be the same, but the Starting_hour value should change by Starting_hour 1 for every new row.
The final dataframe should look like the following:
Weekday Day_in_Month Starting_hour Ending_hour Power
3 1 1 3 35
3 1 2 3 35
3 1 3 3 35
3 1 3 7 15
3 1 4 7 15
3 1 5 7 15
3 1 6 7 15
3 1 7 7 15
4 2 22 2 5
4 2 23 2 5
4 2 24 2 5
4 2 1 2 5
4 2 2 2 5
I appreciate any ideas on it, thanks!
CodePudding user response:
Use Index.repeat
with subtracted values and repeat rows by DataFrame.loc
, then add counter to Starting_hour
by GroupBy.cumcount
:
df1 = df.loc[df.index.repeat(df['Ending_hour'].sub(df['Starting_hour']).add(1))]
df1['Starting_hour'] = df1.groupby(level=0).cumcount()
df1 = df1.reset_index(drop=True)
print (df1)
EDIT: If possible greater Starting_hour
add 24
to Ending_hour
, then in last step remove 1
for starting hours by 0
, use modulo by 24
and last add 1
:
m = df['Starting_hour'].gt(df['Ending_hour'])
e = df['Ending_hour'].mask(m, df['Ending_hour'].add(24))
df1 = df.loc[df.index.repeat(e.sub(df['Starting_hour']).add(1))]
df1['Starting_hour'] = (df1['Starting_hour'].add(df1.groupby(level=0).cumcount())
.sub(1).mod(24).add(1))
df1 = df1.reset_index(drop=True)
print (df1)
Weekday Day_in_Month Starting_hour Ending_hour Power
0 3 1 1 3 35
1 3 1 2 3 35
2 3 1 3 3 35
3 3 1 3 7 15
4 3 1 4 7 15
5 3 1 5 7 15
6 3 1 6 7 15
7 3 1 7 7 15
8 4 2 22 2 5
9 4 2 23 2 5
10 4 2 24 2 5
11 4 2 1 2 5
12 4 2 2 2 5