I want to make a table that has a time range for each id in a list, I have the following code that does it correctly but I think it is not the most efficient or pythonic, is there a better way to do it?
My code:
ids = [1, 2, 3, 4, 5]
df = pd.DataFrame()
for id in ids:
df_aux = pd.DataFrame(
{'init': pd.date_range(start='2022-04-06 10:00:00', end='2022-04-06 15:00:00', freq="H"),
"color": "#FF0000",
"id": id}
)
df_aux['end'] = df_aux['init'] pd.Timedelta(seconds=3600)
df = pd.concat([df,df_aux], ignore_index=True)
Correct result:
fecha_inicio color id fecha_fin
0 2022-04-06 10:00:00 #FF0000 1 2022-04-06 11:00:00
1 2022-04-06 11:00:00 #FF0000 1 2022-04-06 12:00:00
2 2022-04-06 12:00:00 #FF0000 1 2022-04-06 13:00:00
3 2022-04-06 13:00:00 #FF0000 1 2022-04-06 14:00:00
4 2022-04-06 14:00:00 #FF0000 1 2022-04-06 15:00:00
5 2022-04-06 15:00:00 #FF0000 1 2022-04-06 16:00:00
6 2022-04-06 10:00:00 #FF0000 2 2022-04-06 11:00:00
7 2022-04-06 11:00:00 #FF0000 2 2022-04-06 12:00:00
8 2022-04-06 12:00:00 #FF0000 2 2022-04-06 13:00:00
9 2022-04-06 13:00:00 #FF0000 2 2022-04-06 14:00:00
10 2022-04-06 14:00:00 #FF0000 2 2022-04-06 15:00:00
11 2022-04-06 15:00:00 #FF0000 2 2022-04-06 16:00:00
12 2022-04-06 10:00:00 #FF0000 3 2022-04-06 11:00:00
13 2022-04-06 11:00:00 #FF0000 3 2022-04-06 12:00:00
14 2022-04-06 12:00:00 #FF0000 3 2022-04-06 13:00:00
15 2022-04-06 13:00:00 #FF0000 3 2022-04-06 14:00:00
16 2022-04-06 14:00:00 #FF0000 3 2022-04-06 15:00:00
17 2022-04-06 15:00:00 #FF0000 3 2022-04-06 16:00:00
18 2022-04-06 10:00:00 #FF0000 4 2022-04-06 11:00:00
19 2022-04-06 11:00:00 #FF0000 4 2022-04-06 12:00:00
20 2022-04-06 12:00:00 #FF0000 4 2022-04-06 13:00:00
21 2022-04-06 13:00:00 #FF0000 4 2022-04-06 14:00:00
22 2022-04-06 14:00:00 #FF0000 4 2022-04-06 15:00:00
23 2022-04-06 15:00:00 #FF0000 4 2022-04-06 16:00:00
24 2022-04-06 10:00:00 #FF0000 5 2022-04-06 11:00:00
25 2022-04-06 11:00:00 #FF0000 5 2022-04-06 12:00:00
26 2022-04-06 12:00:00 #FF0000 5 2022-04-06 13:00:00
27 2022-04-06 13:00:00 #FF0000 5 2022-04-06 14:00:00
28 2022-04-06 14:00:00 #FF0000 5 2022-04-06 15:00:00
29 2022-04-06 15:00:00 #FF0000 5 2022-04-06 16:00:00
CodePudding user response:
Try with concat
:
df = pd.DataFrame({"init": pd.date_range(start='2022-04-06 10:00:00', end='2022-04-06 15:00:00', freq="H"),
"end": pd.date_range(start='2022-04-06 11:00:00', end='2022-04-06 16:00:00', freq="H"),
"color": "#FF0000"})
output = pd.concat([df.assign(id=x) for x in ids], ignore_index=True)
>>> output
init end color id
0 2022-04-06 10:00:00 2022-04-06 11:00:00 #FF0000 1
1 2022-04-06 11:00:00 2022-04-06 12:00:00 #FF0000 1
2 2022-04-06 12:00:00 2022-04-06 13:00:00 #FF0000 1
3 2022-04-06 13:00:00 2022-04-06 14:00:00 #FF0000 1
4 2022-04-06 14:00:00 2022-04-06 15:00:00 #FF0000 1
5 2022-04-06 15:00:00 2022-04-06 16:00:00 #FF0000 1
6 2022-04-06 10:00:00 2022-04-06 11:00:00 #FF0000 2
7 2022-04-06 11:00:00 2022-04-06 12:00:00 #FF0000 2
8 2022-04-06 12:00:00 2022-04-06 13:00:00 #FF0000 2
9 2022-04-06 13:00:00 2022-04-06 14:00:00 #FF0000 2
10 2022-04-06 14:00:00 2022-04-06 15:00:00 #FF0000 2
11 2022-04-06 15:00:00 2022-04-06 16:00:00 #FF0000 2
12 2022-04-06 10:00:00 2022-04-06 11:00:00 #FF0000 3
13 2022-04-06 11:00:00 2022-04-06 12:00:00 #FF0000 3
14 2022-04-06 12:00:00 2022-04-06 13:00:00 #FF0000 3
15 2022-04-06 13:00:00 2022-04-06 14:00:00 #FF0000 3
16 2022-04-06 14:00:00 2022-04-06 15:00:00 #FF0000 3
17 2022-04-06 15:00:00 2022-04-06 16:00:00 #FF0000 3
18 2022-04-06 10:00:00 2022-04-06 11:00:00 #FF0000 4
19 2022-04-06 11:00:00 2022-04-06 12:00:00 #FF0000 4
20 2022-04-06 12:00:00 2022-04-06 13:00:00 #FF0000 4
21 2022-04-06 13:00:00 2022-04-06 14:00:00 #FF0000 4
22 2022-04-06 14:00:00 2022-04-06 15:00:00 #FF0000 4
23 2022-04-06 15:00:00 2022-04-06 16:00:00 #FF0000 4
24 2022-04-06 10:00:00 2022-04-06 11:00:00 #FF0000 5
25 2022-04-06 11:00:00 2022-04-06 12:00:00 #FF0000 5
26 2022-04-06 12:00:00 2022-04-06 13:00:00 #FF0000 5
27 2022-04-06 13:00:00 2022-04-06 14:00:00 #FF0000 5
28 2022-04-06 14:00:00 2022-04-06 15:00:00 #FF0000 5
29 2022-04-06 15:00:00 2022-04-06 16:00:00 #FF0000 5
CodePudding user response:
A quick way to do this would be like this:
import pandas as pd
ids = [1, 2, 3, 4, 5]
time = pd.date_range(start='2022-04-06 10:00:00',
end='2022-04-06 15:00:00', freq="H")
df = pd.DataFrame({'init': list(time)*len(ids),
'color': '#FF0000',
'id': sorted(ids*len(time))})
df['end'] = df['init'] pd.Timedelta(seconds=3600)
No for loops are needed, multiplying a list by x
will repeat it x
times. Sorting the repeated ids
will ensure they are in the correct order (rather than [1, 2, 3, 4, 5, 1, 2, 3, 4, ...]
they will be [1, 1, 1, 1, ..., 2, 2, 2, ...]
).