I want to keep only the rows whose time falls between sowing_date and harvesting_date for each ID, because every ID has a different sowing date and harvest date.
ID time NDVI sowing_date harvesting_date
106 2020-03-01 0.307967 2020-04-21 2020-11-01
106 2020-03-02 0.299089 2020-04-21 2020-11-01
106 2020-03-03 0.290211 2020-04-21 2020-11-01
I tried groupby, but it doesn't work properly, and I think this can only be done with a function or a for loop. Any thoughts, please?
The expected outcome should look like this:
ID time NDVI sowing_date harvesting_date
106 2020-04-21 0.307967 2020-04-21 2020-11-01
106 2020-04-22 0.299089 2020-04-21 2020-11-01
...
106 2020-11-01 0.290211 2020-04-21 2020-11-01
CodePudding user response:
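This is basically just a filter then. One common way of filtering a dataframe is to create a boolean mask (a Series of True/False values, one per row) and then index the dataframe with it. Because the comparison is done row by row, each row is checked against its own sowing and harvesting dates, so no groupby or loop is needed. This looks something like: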
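mask = (df.time >= df.sowing_date) & (df.time <= df.harvesting_date)
filtered_df = df[mask]
You could also do that in one line, but this makes it easier to see what "mask" is doing, if you are interested. Note that the mask must not be wrapped in a list, and naming it "filter" would shadow Python's built-in of the same name.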
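To make the idea concrete, here is a minimal, self-contained sketch of the row-wise filter. The rows are illustrative only: ID 106 uses the sowing and harvesting dates from the question, while ID 107 and all NDVI values are made up to show that each row is compared against its own dates.

```python
import pandas as pd

# Illustrative rows only: ID 106 uses the dates from the question,
# ID 107 and all NDVI values here are made up for the demonstration.
df = pd.DataFrame({
    "ID": [106, 106, 106, 107, 107, 107],
    "time": pd.to_datetime(["2020-03-01", "2020-04-21", "2020-11-02",
                            "2020-05-09", "2020-05-10", "2020-09-30"]),
    "NDVI": [0.31, 0.30, 0.29, 0.40, 0.41, 0.42],
    "sowing_date": pd.to_datetime(["2020-04-21"] * 3 + ["2020-05-10"] * 3),
    "harvesting_date": pd.to_datetime(["2020-11-01"] * 3 + ["2020-09-30"] * 3),
})

# Row-wise comparison: each row is checked against its own sowing and
# harvesting dates, so no groupby or loop is needed.
mask = (df["time"] >= df["sowing_date"]) & (df["time"] <= df["harvesting_date"])
filtered_df = df[mask]

print(filtered_df)
```

Both bounds are inclusive here, so rows dated exactly on the sowing or harvesting day are kept, which matches the expected outcome in the question.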
A word of caution, though! Be sure those dates are datetime objects; dates often show up as strings, so you'd need to use something like strptime(), or pd.to_datetime() in pandas, to convert them.
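In pandas, one way to do that conversion is pd.to_datetime. The sketch below assumes the three date columns were read in as strings in YYYY-MM-DD format; the small frame is hypothetical and only mirrors the column names from the question.

```python
import pandas as pd

# Hypothetical frame where the date columns were read in as plain strings.
df = pd.DataFrame({
    "ID": [106, 106],
    "time": ["2020-03-01", "2020-04-22"],
    "sowing_date": ["2020-04-21", "2020-04-21"],
    "harvesting_date": ["2020-11-01", "2020-11-01"],
})

date_cols = ["time", "sowing_date", "harvesting_date"]

# Convert each column to datetime64; the explicit format is an assumption
# that the strings look like YYYY-MM-DD.
for col in date_cols:
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d")

print(df.dtypes)
```

Checking df.dtypes before filtering is a quick way to confirm the columns are datetimes rather than generic object columns.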
Hope this helps!