I have a simple DataFrame:
    start  end
0      30   40
1      45   55
2      50   60
3      53   64
4      65   70
5      75   80
6      77   85
7      80   83
8      90  120
9      95  100
10    105  110
Some rows lie entirely inside another row's range, and others partially overlap a neighbouring row. I want to merge all overlapping rows so that the DataFrame becomes:
   start  end
0     30   40
1     45   64
2     65   70
3     75   85
4     90  120
I drew a picture to explain it better (I hope it helps):
CodePudding user response:
Use a custom generator function with the DataFrame constructor:
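import pandas as pd

# Adapted from https://stackoverflow.com/a/5679899/2901002
def merge(times):
    # Order each pair as (start, end), then sort the pairs by start,
    # so the function also works when the rows are not already sorted.
    times = sorted(sorted(t) for t in times)
    saved = list(times[0])
    for st, en in times:
        if st <= saved[1]:
            # Overlaps or touches the current interval: extend its end.
            saved[1] = max(saved[1], en)
        else:
            # Gap found: emit the finished interval and start a new one.
            yield tuple(saved)
            saved[0] = st
            saved[1] = en
    yield tuple(saved)

df1 = pd.DataFrame(merge(df[['start','end']].to_numpy()), columns=['start','end'])
print(df1)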
   start  end
0     30   40
1     45   64
2     65   70
3     75   85
4     90  120
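If you would rather stay inside pandas, the same merge can probably be written without a Python loop. This is only a sketch of a different technique (running maximum plus `groupby`), not the method from the linked answer. The name `merge_intervals` is made up for illustration, and the DataFrame is rebuilt from the values in the question so the snippet runs on its own. Like the generator above, it treats intervals that merely touch (start equal to a previous end) as overlapping.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "start": [30, 45, 50, 53, 65, 75, 77, 80, 90, 95, 105],
    "end":   [40, 55, 60, 64, 70, 80, 85, 83, 120, 100, 110],
})

def merge_intervals(frame):
    # Hypothetical helper, not part of the original answer.
    # Sort by start so every row is compared only with rows that begin earlier.
    s = frame.sort_values("start").reset_index(drop=True)
    # Largest end seen among all earlier rows (NaN for the first row).
    prev_max_end = s["end"].cummax().shift()
    # A new group begins where a row starts after every earlier row has ended.
    group = (s["start"] > prev_max_end).cumsum()
    # Each group collapses to its first (smallest) start and its largest end.
    return (
        s.groupby(group)
        .agg(start=("start", "first"), end=("end", "max"))
        .reset_index(drop=True)
    )

print(merge_intervals(df))
```

On the sample data this should give the same five rows as `df1`. The running maximum matters because of rows like `80 83`, which sits inside `77 85`: comparing each start only with the previous row's end would miss intervals that are nested inside a longer earlier one.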