I’m trying to evaluate the movements of a fish on the seabed, and I would like to understand whether it moves mainly at night or during the day. To do so, I have to determine whether each recorded displacement occurred during the day or at night.
For this purpose I have created a dataframe that contains the distance covered and the date and time at which each displacement was recorded.
Using the ephem package, I was able to compute the sunrise and sunset times at the study location for every day of the study. Here is what my dataframe looks like:
Date        Hours  Covered distance  sunrise   sunset
12/08/2019  18:22  0.003000          05:17:15  19:53:17
12/08/2019  18:41  0.001895          05:17:15  19:53:17
12/08/2019  21:24  0.000291          05:17:15  19:53:17
12/08/2019  23:59  0.000490          05:17:15  19:53:17
13/08/2019  06:25  0.002394          05:19:33  19:52:16
13/08/2019  18:41  0.001895          05:19:33  19:52:16
13/08/2019  22:19  0.000291          05:19:33  19:52:16
14/08/2019  11:23  0.000490          05:20:19  19:50:14
14/08/2019  19:28  0.002394          05:20:19  19:50:14
......
From this dataframe, I would like to create an additional column that specifies whether each detection occurred during the day or at night:
Day/Night
0 Day
1 Day
2 Night
3 Night
4 Day
5 Day
6 Night
7 Day
8 Day
......
From my understanding, I should create a loop that runs through all the elements of the Hours column and determines whether each recording took place before sunrise (night), between sunrise and sunset (day), or after sunset (night). The tricky point is that the sunrise and sunset times change from day to day, as does the length of the days and nights.
Once this step is done, I could calculate the total distance travelled during the day and night time.
I hope someone can enlighten me! Many thanks.
CodePudding user response:
You can convert the 'Hours', 'sunrise' and 'sunset' columns to datetime objects and then use an if-else statement to check whether each recording falls before sunrise, between sunrise and sunset, or after sunset.
You could use something like this:
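from datetime import datetime

for i, row in df.iterrows():
    # strptime gives every value the same placeholder date (1900-01-01),
    # so the comparisons below compare times of day only
    hour = datetime.strptime(row['Hours'], '%H:%M')
    sunrise = datetime.strptime(row['sunrise'], '%H:%M:%S')
    sunset = datetime.strptime(row['sunset'], '%H:%M:%S')
    if hour < sunrise or hour > sunset:
        df.at[i, 'Day/Night'] = 'Night'
    else:
        df.at[i, 'Day/Night'] = 'Day'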
Once you have created the 'Day/Night' column, you can use the groupby() method to group the dataframe by 'Day/Night' and then use the sum() method to calculate the total distance travelled during the day and night time.
df_day_night = df.groupby(['Day/Night'])['Covered distance'].sum()
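Please keep in mind that times close to sunrise or sunset can fall into either category depending on the date, because the sunrise and sunset times change daily; that is why each row is compared with its own 'sunrise' and 'sunset' values. A recording made exactly at sunrise or sunset is labelled 'Day' by the comparison above, so check that this is what you want.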
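To make the boundary handling explicit, here is a minimal sketch that rebuilds the sample rows from the question and compares full timestamps (date plus time) rather than times alone. The variable names `recorded`, `rise` and `sset` are illustrative, and treating a recording exactly at sunrise or sunset as day is an assumption you may want to change.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (all values as plain strings or floats).
df = pd.DataFrame({
    "Date": ["12/08/2019"] * 4 + ["13/08/2019"] * 3 + ["14/08/2019"] * 2,
    "Hours": ["18:22", "18:41", "21:24", "23:59", "06:25",
              "18:41", "22:19", "11:23", "19:28"],
    "Covered distance": [0.003000, 0.001895, 0.000291, 0.000490, 0.002394,
                         0.001895, 0.000291, 0.000490, 0.002394],
    "sunrise": ["05:17:15"] * 4 + ["05:19:33"] * 3 + ["05:20:19"] * 2,
    "sunset": ["19:53:17"] * 4 + ["19:52:16"] * 3 + ["19:50:14"] * 2,
})

# Combine the date with each time so every recording is compared
# with the sunrise and sunset of its own day.
recorded = pd.to_datetime(df["Date"] + " " + df["Hours"], format="%d/%m/%Y %H:%M")
rise = pd.to_datetime(df["Date"] + " " + df["sunrise"], format="%d/%m/%Y %H:%M:%S")
sset = pd.to_datetime(df["Date"] + " " + df["sunset"], format="%d/%m/%Y %H:%M:%S")

# Assumption: the boundaries are inclusive, i.e. a recording made exactly
# at sunrise or sunset counts as "Day". Use > and < to exclude them.
is_day = (recorded >= rise) & (recorded <= sset)
df["Day/Night"] = np.where(is_day, "Day", "Night")

# Total distance per category.
totals = df.groupby("Day/Night")["Covered distance"].sum()
print(df[["Date", "Hours", "Day/Night"]])
print(totals)
```

The original string columns are left untouched here; only the new 'Day/Night' column is added.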
CodePudding user response:
The pd.to_datetime() function is used to convert the "Hours", "sunrise" and "sunset" columns to datetime objects. After that, boolean indexing with .loc creates the "Day/Night" column, which is more efficient than using a loop.
Note: In the test case you provided, the "sunrise" and "sunset" column names are in lower case.
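import pandas as pd

# Parse the time strings; pandas fills in a placeholder date (1900-01-01)
df['Hours'] = pd.to_datetime(df['Hours'], format='%H:%M')
df['sunrise'] = pd.to_datetime(df['sunrise'], format='%H:%M:%S')
df['sunset'] = pd.to_datetime(df['sunset'], format='%H:%M:%S')
# Default every row to "Night", then mark rows strictly between sunrise and sunset as "Day"
df['Day/Night'] = "Night"
df.loc[(df['Hours'] > df['sunrise']) & (df['Hours'] < df['sunset']), 'Day/Night'] = "Day"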
# Calculate distance
day_distance = df[df['Day/Night'] == 'Day']['Covered distance'].sum()
night_distance = df[df['Day/Night'] == 'Night']['Covered distance'].sum()
print(df)
print(f"Distance covered during day time: {day_distance}")
print(f"Distance covered during night time: {night_distance}")
Date Hours Covered distance sunrise sunset Day/Night
0 12/08/2019 1900-01-01 18:22:00 0.003000 1900-01-01 05:17:15 1900-01-01 19:53:17 Day
1 12/08/2019 1900-01-01 18:41:00 0.001895 1900-01-01 05:17:15 1900-01-01 19:53:17 Day
2 12/08/2019 1900-01-01 21:24:00 0.000291 1900-01-01 05:17:15 1900-01-01 19:53:17 Night
3 12/08/2019 1900-01-01 23:59:00 0.000490 1900-01-01 05:17:15 1900-01-01 19:53:17 Night
4 13/08/2019 1900-01-01 06:25:00 0.002394 1900-01-01 05:19:33 1900-01-01 19:52:16 Day
5 13/08/2019 1900-01-01 18:41:00 0.001895 1900-01-01 05:19:33 1900-01-01 19:52:16 Day
6 13/08/2019 1900-01-01 22:19:00 0.000291 1900-01-01 05:19:33 1900-01-01 19:52:16 Night
7 14/08/2019 1900-01-01 11:23:00 0.000490 1900-01-01 05:20:19 1900-01-01 19:50:14 Day
8 14/08/2019 1900-01-01 19:28:00 0.002394 1900-01-01 05:20:19 1900-01-01 19:50:14 Day
Distance covered during day time: 0.012068000000000002
Distance covered during night time: 0.001072
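If the 1900-01-01 placeholder dates in the printed columns are unwanted, one possible variation is to compare the times as timedeltas and leave the original string columns as they are. This is only a sketch: the helper name `label_day_night` is hypothetical, and the per-date breakdown at the end is an optional extra that the question did not ask for.

```python
import pandas as pd

def label_day_night(frame):
    """Return a 'Day'/'Night' Series without modifying the time columns."""
    # 'Hours' has no seconds, so append ':00' to get the HH:MM:SS form.
    recorded = pd.to_timedelta(frame["Hours"] + ":00")
    rise = pd.to_timedelta(frame["sunrise"])
    sset = pd.to_timedelta(frame["sunset"])
    # Same strict comparison as above: strictly between sunrise and sunset.
    is_day = (recorded > rise) & (recorded < sset)
    return is_day.map({True: "Day", False: "Night"})

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date": ["12/08/2019"] * 4 + ["13/08/2019"] * 3 + ["14/08/2019"] * 2,
    "Hours": ["18:22", "18:41", "21:24", "23:59", "06:25",
              "18:41", "22:19", "11:23", "19:28"],
    "Covered distance": [0.003000, 0.001895, 0.000291, 0.000490, 0.002394,
                         0.001895, 0.000291, 0.000490, 0.002394],
    "sunrise": ["05:17:15"] * 4 + ["05:19:33"] * 3 + ["05:20:19"] * 2,
    "sunset": ["19:53:17"] * 4 + ["19:52:16"] * 3 + ["19:50:14"] * 2,
})

df["Day/Night"] = label_day_night(df)

# Distance per date and category; dates with no night (or day) recordings get 0.
per_date = (
    df.groupby(["Date", "Day/Night"])["Covered distance"]
    .sum()
    .unstack(fill_value=0.0)
)
print(per_date)
```

Because the 'Hours', 'sunrise' and 'sunset' columns stay as strings, the dataframe prints exactly as in the question, with only the new column added.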