I'm writing Python code that simulates satellite contact durations and date ranges for a given ground segment, and I now have these data frames for multiple (>=2) satellites in a constellation. So far I have analyzed the data for each satellite individually with pandas. What I'm trying to achieve is to merge the overlapping date ranges of multiple satellites into a single resulting file. Taking two satellites as an example:
file 1:
Duration (s) Start Time (UTC) Stop Time (UTC)
450.61717646411466 2022-01-01 13:11:18.686564 2022-01-01 13:18:49.303741
272.9796195817538 2022-01-01 14:45:04.846243 2022-01-01 14:49:37.825862
file 2:
Duration (s) Start Time (UTC) Stop Time (UTC)
576.600683837155 2022-01-01 13:06:51.364924 2022-01-01 13:16:27.965608
568.5843137051123 2022-01-01 14:40:38.840363 2022-01-01 14:50:07.424677
The result I am aiming for is a single file in which overlapping date ranges are merged and their durations recomputed, something like this:
Duration (s) Start Time (UTC) Stop Time (UTC)
717.938817 2022-01-01 13:06:51.364924 2022-01-01 13:18:49.303741
568.5843137051123 2022-01-01 14:40:38.840363 2022-01-01 14:50:07.424677
Does pandas (or any other library) have a ready-made function for this kind of problem? If not, could anyone help me figure this out?
Many thanks in advance.
CodePudding user response:
Thank you @MrFuppes and @NuLo for the feedback.
I ended up sorting it out myself: first I gathered the contacts from all the input files into a single list and sorted them by start date. The second step was to merge the overlapping ones. Below is my code as a function that now works; I thought it could help others in the future:
import os
from datetime import datetime

import pandas as pd
# Helper from the Orekit Python wrapper (import path assumed to be orekit.pyhelpers)
from orekit.pyhelpers import datetime_to_absolutedate


def mergeContactsIntoConstellation(originDIR, destinyDIR, station, noSatellites):
    '''
    Reads the individual satellite accesses for a given ground station
    and merges them into a single set of constellation contacts.
    :originDIR: directory containing the individual files
    :destinyDIR: directory where the merged file is saved
    :station: ground station of interest
    :noSatellites: number of satellites in the constellation
    '''
    iDates = []
    fDates = []
    for i in range(0, noSatellites):
        path = originDIR + station.name + '_sat{}_Contacts.txt'.format(i + 1)
        # Skip satellites that have no contacts with this station
        if os.stat(path).st_size == 0:
            continue
        file = pd.read_csv(path, delimiter='\t', header=0, engine='python')
        file.columns = ['duration', 'itime', 'ftime']
        itime = file['itime']
        ftime = file['ftime']
        for iDate, fDate in zip(itime, ftime):
            iDates.append(datetime.strptime(iDate, '%Y-%m-%d %H:%M:%S.%f'))
            fDates.append(datetime.strptime(fDate, '%Y-%m-%d %H:%M:%S.%f'))

    # Sort the contacts by start date, keeping each stop date paired with its start
    contacts = sorted(zip(iDates, fDates))
    newiDates = [idate for idate, _ in contacts]
    newfDates = [fdate for _, fdate in contacts]
    Dates = [newiDates, newfDates]

    # Dates[0][0] and Dates[1][0] hold the contact currently being merged
    resultingiDates = []
    resultingfDates = []
    for i in range(0, len(Dates[0]) - 1):
        if Dates[1][0] >= Dates[0][i + 1]:
            # Overlap: extend the current contact
            Dates[0][0] = min(Dates[0][0], Dates[0][i + 1])
            Dates[1][0] = max(Dates[1][0], Dates[1][i + 1])
        else:
            # Gap: store the current contact and start a new one
            resultingiDates.append(Dates[0][0])
            resultingfDates.append(Dates[1][0])
            Dates[0][0] = Dates[0][i + 1]
            Dates[1][0] = Dates[1][i + 1]
    # Store the last contact, unless there were no contacts at all
    if Dates[0]:
        resultingiDates.append(Dates[0][0])
        resultingfDates.append(Dates[1][0])

    # DataFrame.append was removed in pandas 2.0, so collect the rows first
    rows = []
    for idate, fdate in zip(resultingiDates, resultingfDates):
        rows.append({
            "Duration (s)": datetime_to_absolutedate(fdate).durationFrom(datetime_to_absolutedate(idate)),
            "Start Time (UTC)": idate,
            "Stop Time (UTC)": fdate})
    jointContacts = pd.DataFrame(rows, columns=["Duration (s)", "Start Time (UTC)", "Stop Time (UTC)"])
    jointContacts.to_csv(destinyDIR + station.name + '_JointContacts.txt', sep='\t', index=False)
Some of the functions, such as "datetime_to_absolutedate" and "durationFrom", come from the Python wrapper of Orekit.
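For anyone who does not have Orekit available, the same sort-then-merge idea can probably be expressed in plain pandas. The following is only a minimal sketch, not the function above: "merge_contacts" and the column-name constants are hypothetical names, it assumes the contact tables are already loaded as DataFrames with the columns shown in the question, and it recomputes the duration by subtracting the timestamps instead of calling "durationFrom". Windows that merely touch are merged, as with the ">=" comparison above.

```python
import pandas as pd

# Column names as they appear in the question's files
START = "Start Time (UTC)"
STOP = "Stop Time (UTC)"
DURATION = "Duration (s)"


def merge_contacts(frames):
    """Merge overlapping contact windows from several DataFrames (hypothetical helper)."""
    contacts = pd.concat(frames, ignore_index=True)
    contacts[START] = pd.to_datetime(contacts[START])
    contacts[STOP] = pd.to_datetime(contacts[STOP])
    contacts = contacts.sort_values(START).reset_index(drop=True)

    # Latest stop time seen among all earlier windows (NaT for the first row)
    latest_stop = contacts[STOP].cummax().shift()
    # A new group starts when a window begins after every earlier window has ended
    group = (contacts[START] > latest_stop).cumsum()

    merged = contacts.groupby(group).agg({START: "min", STOP: "max"})
    merged[DURATION] = (merged[STOP] - merged[START]).dt.total_seconds()
    return merged[[DURATION, START, STOP]].reset_index(drop=True)


# The two sample files from the question, durations omitted because they are recomputed
file1 = pd.DataFrame({
    START: ["2022-01-01 13:11:18.686564", "2022-01-01 14:45:04.846243"],
    STOP: ["2022-01-01 13:18:49.303741", "2022-01-01 14:49:37.825862"],
})
file2 = pd.DataFrame({
    START: ["2022-01-01 13:06:51.364924", "2022-01-01 14:40:38.840363"],
    STOP: ["2022-01-01 13:16:27.965608", "2022-01-01 14:50:07.424677"],
})

merged = merge_contacts([file1, file2])
print(merged)
```

For the two sample files this gives two merged contacts: 13:06:51.364924 to 13:18:49.303741 (717.938817 s) and 14:40:38.840363 to 14:50:07.424677 (568.584314 s). The result can be written out with "to_csv(..., sep='\t', index=False)" in the same way as in the function above.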