Let's imagine I have a time series dataframe of temperature sensor data recorded at 30-minute intervals. How do I split each 30-minute interval into smaller 5-minute intervals while accounting for the temperature drop between consecutive readings?
I imagine something like this could work:
30 min intervals:
interval 1: temp = 30, interval 2: temp = 25
5 min intervals:
interval 1: temp = 30, interval 2: temp = 29, interval 3: temp = 28, interval 4: temp = 27, interval 5: temp = 26, interval 6: temp = 25
CodePudding user response:
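I would resample the data frame to a higher time resolution ("6T" in this case, where T means minutes; recent pandas versions spell it "6min"). Resampling creates new rows, filled with NaN, for the missing time steps, and you can then fill those NaN values however you like. For what you describe, I think linear interpolation is enough. Note that 6-minute steps reproduce the 1-degree drops in your example (30 to 25 in five steps); for true 5-minute intervals, use "5T" instead.
Here is a simple example that I think matches the data you describe.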
import pandas as pd
# Four readings, 30 minutes apart ("30min" is the same as the older "30T" alias)
df = pd.DataFrame({"temp": [30, 25, 20, 18]}, index=pd.date_range("2021-12-01 12:00:00", "2021-12-01 13:59:00", freq="30min"))
# This resample preserves your values at their original timestamps and creates new rows,
# full of NaN, for the intermediate timestamps.
# .last() just selects the value for each time step; mean, max or min would give the same result, because there is at most one value per time step.
df = df.resample("6min").last()
# How to fill the gaps depends on how you want the data to change over time, but since you described a linear
# variation, a simple linear interpolation between values with interpolate() is enough.
df = df.interpolate()
print(df)
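If you want the 5-minute spacing from the question rather than 6 minutes, the same idea could be sketched as below. This is only a minimal sketch: the sample data and the helper name `upsample_linear` are made up for illustration, and it assumes the readings are evenly spaced so that the default linear interpolation is appropriate.

```python
import pandas as pd

# Hypothetical sample data: one reading every 30 minutes.
readings = pd.DataFrame(
    {"temp": [30.0, 25.0, 20.0, 18.0]},
    index=pd.date_range("2021-12-01 12:00:00", periods=4, freq="30min"),
)


def upsample_linear(frame, freq="5min"):
    """Upsample `frame` to `freq` and fill the new rows linearly.

    Illustrative helper, not part of pandas.
    """
    # asfreq() inserts the missing timestamps as NaN rows without aggregating.
    # interpolate() then fills them; because the rows are evenly spaced, the
    # linear method draws a straight line between the known readings.
    return frame.resample(freq).asfreq().interpolate(method="linear")


fine = upsample_linear(readings)
print(fine.head(7))
```

With 5-minute steps there are six steps per 30-minute interval, so a 5-degree drop is spread as 5/6 of a degree per step instead of the whole-degree steps in the question's example.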
CodePudding user response:
It would be helpful if you could include an example of two things:
- What does your data look like currently?
- What would you like it to look like after you are done?
Otherwise it isn't clear what you're asking or what the problem is, so it's harder to help!