Union of date ranges in Python

I'm writing Python code that simulates satellite contact durations and date ranges for a given ground segment, and I now have these data frames for multiple (>=2) satellites in a constellation. So far I analyze the data for each individual satellite with pandas. What I'm trying to achieve is to merge the overlapping date ranges of multiple satellites into a single resulting file. Taking two satellites as an example:

file 1:

Duration (s)    Start Time (UTC)    Stop Time (UTC)
450.61717646411466  2022-01-01 13:11:18.686564  2022-01-01 13:18:49.303741
272.9796195817538   2022-01-01 14:45:04.846243  2022-01-01 14:49:37.825862

file 2:

Duration (s)    Start Time (UTC)    Stop Time (UTC)
576.600683837155    2022-01-01 13:06:51.364924  2022-01-01 13:16:27.965608
568.5843137051123   2022-01-01 14:40:38.840363  2022-01-01 14:50:07.424677

The result I aim for is a single file in which overlapping date ranges are merged and their durations recomputed, something like this:

Duration (s)    Start Time (UTC)    Stop Time (UTC)
717.938817    2022-01-01 13:06:51.364924  2022-01-01 13:18:49.303741
568.5843137051123   2022-01-01 14:40:38.840363  2022-01-01 14:50:07.424677

Does pandas (or any other library) have a ready-to-go function to deal with this kind of problem? Otherwise, could anyone help me figure this out?

Many thanks in advance.

CodePudding user response：

Thank you @MrFuppes and @NuLo for the feedback.

I ended up sorting it out myself: first I gathered the contacts from all the input files together and sorted them by initial (start) date. The second step was to merge the overlapping ones. The following is my code as a function that now works; I thought it could help others in the future:

import os
from datetime import datetime

import pandas as pd

# datetime_to_absolutedate and durationFrom come from the Orekit Python wrapper
# and must be available in the calling environment.


def mergeContactsIntoConstellation(originDIR, destinyDIR, station, noSatellites):
    '''
    Reads the individual satellite accesses for a given ground station
    and merges them into a single constellation contact file.
    :originDIR: directory containing the individual files
    :destinyDIR: directory where the merged file is saved
    :station: ground station of interest
    :noSatellites: number of satellites in the constellation
    '''
    iDates = []
    fDates = []
    for i in range(0, noSatellites):
        path = originDIR + station.name + '_sat{}_Contacts.txt'.format(i + 1)
        # Skip satellites that have no contacts (empty file)
        if os.stat(path).st_size == 0:
            continue
        contacts = pd.read_csv(path, delimiter='\t', header=0, engine='python')
        contacts.columns = ['duration', 'itime', 'ftime']
        itime = contacts['itime']
        ftime = contacts['ftime']

        for iDate, fDate in zip(itime, ftime):
            iDates.append(datetime.strptime(iDate, '%Y-%m-%d %H:%M:%S.%f'))
            fDates.append(datetime.strptime(fDate, '%Y-%m-%d %H:%M:%S.%f'))

    # Sort the (start, stop) pairs together by start time, so that each stop
    # stays attached to its own start even when two starts are identical
    pairs = sorted(zip(iDates, fDates))
    newiDates = [idate for idate, fdate in pairs]
    newfDates = [fdate for idate, fdate in pairs]

    resultingiDates = []
    resultingfDates = []
    if newiDates:
        # Running interval, extended while the next contact overlaps it
        currenti, currentf = newiDates[0], newfDates[0]
        for idate, fdate in zip(newiDates[1:], newfDates[1:]):
            if currentf >= idate:
                currentf = max(currentf, fdate)
            else:
                resultingiDates.append(currenti)
                resultingfDates.append(currentf)
                currenti, currentf = idate, fdate
        # Store the last running interval
        resultingiDates.append(currenti)
        resultingfDates.append(currentf)

    # DataFrame.append was removed in pandas 2.0, so collect the rows first
    rows = []
    for idate, fdate in zip(resultingiDates, resultingfDates):
        rows.append({
            "Duration (s)": datetime_to_absolutedate(fdate).durationFrom(datetime_to_absolutedate(idate)),
            "Start Time (UTC)": idate,
            "Stop Time (UTC)": fdate})
    jointContacts = pd.DataFrame(rows, columns=["Duration (s)", "Start Time (UTC)", "Stop Time (UTC)"])
    jointContacts.to_csv(destinyDIR + station.name + '_JointContacts.txt', sep='\t', index=False)

Some of the functions, such as "durationFrom" and "datetime_to_absolutedate", come from the Python wrapper of Orekit.
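For anyone who would rather stay within pandas: as far as I know there is no single built-in call for this, but the same sort-then-merge idea can be sketched with `cummax`, `cumsum` and `groupby`. This is only a minimal sketch, not the function above. The name `merge_contacts` is made up for illustration, the two data frames are typed in by hand from the sample in the question instead of being read from files, and the duration is computed with plain pandas timedeltas rather than Orekit.

```python
import pandas as pd

START = "Start Time (UTC)"
STOP = "Stop Time (UTC)"


def merge_contacts(frames):
    """Merge overlapping [start, stop] windows from several DataFrames (hypothetical helper)."""
    df = pd.concat([f[[START, STOP]] for f in frames], ignore_index=True)
    df[START] = pd.to_datetime(df[START])
    df[STOP] = pd.to_datetime(df[STOP])
    df = df.sort_values([START, STOP]).reset_index(drop=True)

    # Latest stop time seen among all earlier rows (NaT for the first row)
    latest_stop = df[STOP].cummax().shift()
    # A new group begins whenever a window starts after everything seen so far has ended
    group = (df[START] > latest_stop).cumsum()

    merged = df.groupby(group).agg({START: "min", STOP: "max"}).reset_index(drop=True)
    merged["Duration (s)"] = (merged[STOP] - merged[START]).dt.total_seconds()
    return merged[["Duration (s)", START, STOP]]


# Sample contacts from the question, one frame per satellite
sat1 = pd.DataFrame({
    START: ["2022-01-01 13:11:18.686564", "2022-01-01 14:45:04.846243"],
    STOP: ["2022-01-01 13:18:49.303741", "2022-01-01 14:49:37.825862"],
})
sat2 = pd.DataFrame({
    START: ["2022-01-01 13:06:51.364924", "2022-01-01 14:40:38.840363"],
    STOP: ["2022-01-01 13:16:27.965608", "2022-01-01 14:50:07.424677"],
})

merged = merge_contacts([sat1, sat2])
print(merged)
```

With the sample data this gives two merged windows; the first one runs from 13:06:51.364924 to 13:18:49.303741, which is about 717.94 s. Windows that merely touch (one starts exactly when the previous one stops) are merged as well, matching the `>=` comparison in the function above.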