I have I have data frame of rows with the values of open_time and close_time. I need to create column open_lvl_value where, for each row, will be shown sum of previous rows with intersected range of values. Probably I didn't explain it clear so I attached expectation and Gantt chart.
import pandas as pd
data = [[40, 110],[50, 110],[70, 110],[90, 110],
[100, 110],[40, 70],[50, 70],[50, 70],
[60, 70],[60, 70],[100, 110],[130, 180],
[150, 180],[150, 180]]
df = pd.DataFrame(data, columns=['open_time','close_time'])
print(df)
open_time close_time
0 40 110
1 50 110
2 70 110
3 90 110
4 100 110
5 40 70
6 50 70
7 50 70
8 60 70
9 60 70
10 100 110
11 130 180
12 150 180
13 150 180
expected result:
open_time close_time open_lvl_value
0 40 110 1
1 50 110 2
2 70 110 3
3 90 110 4
4 100 110 5
5 40 70 4
6 50 70 5
7 50 70 6
8 60 70 7
9 60 70 8
10 100 110 5
11 130 180 1
12 150 180 2
13 150 180 3
CodePudding user response:
You can try this:
import pandas as pd
data = [[40, 110],[50, 110],[70, 110],[90, 110],
[100, 110],[40, 70],[50, 70],[50, 70],
[60, 70],[60, 70],[100, 110],[130, 180],
[150, 180],[150, 180]]
df = pd.DataFrame(data, columns=['open_time','close_time'])
def foo(df):
df['open_lvl_value'] = 0
for row in df.itertuples():
start = row.open_time
end = row.close_time
df_copy = df.loc[:row.Index]
s1 = df_copy[(df_copy.close_time >= end) & (df_copy.open_time <= end)].shape[0]
s2 = df_copy[(df_copy.open_time <= start) & df_copy.close_time >= start].shape[0]
df.loc[row.Index, 'open_lvl_value'] = max(s1, s2)
return df
print(foo(df))
Out:
open_time close_time open_lvl_value
0 40 110 1
1 50 110 2
2 70 110 3
3 90 110 4
4 100 110 5
5 40 70 4
6 50 70 5
7 50 70 6
8 60 70 7
9 60 70 8
10 100 110 6
11 130 180 1
12 150 180 2
13 150 180 3