I have a pandas dataframe in which the index is the timestamp and I have a column that contains a value per timestamp, like this:
timestamp | Values
---|---
2022-03-17 13:21:00+00:00 | 15.2
2022-03-22 13:24:00+00:00 | 17.8
2022-03-27 13:27:00+00:00 | NaN
2022-03-30 13:30:00+00:00 | NaN
The Values column sometimes contains a number and sometimes NaN.
I am trying to get a new dataframe that contains the values from the last week, for which I am using the following code:
dataW=data.loc[(pd.Timestamp.utcnow()-pd.Timedelta(days=7)):(pd.Timestamp.utcnow())]
This works fine unless the data of the last week happens to be all NaN, in which case I get an error. To solve this, I would like dataW to be a dataframe containing the seven days of data that end at the last timestamp whose Values entry is not NaN. In the example dataframe above, that means that instead of getting the data of
2022-03-30 13:30:00+00:00 - 7 days
I would like to get the data of
2022-03-22 13:24:00+00:00 - 7 days
Does anybody have an idea of how I could do this?
Thank you very much in advance,
CodePudding user response:
You can use last_valid_index:
last = data['Values'].last_valid_index()
# or to consider all columns
# last = data.last_valid_index()
data.loc[last-pd.Timedelta(days=7):last]
output:
                           Values
timestamp
2022-03-17 13:21:00+00:00    15.2
2022-03-22 13:24:00+00:00    17.8
last: Timestamp('2022-03-22 13:24:00+0000', tz='UTC')
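For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the example dataframe from the question and applies the same last_valid_index approach. The helper name last_week_of_valid_data and its parameters are made up for illustration, and it assumes the index is a timezone-aware DatetimeIndex. It also guards one edge case: last_valid_index returns None when the column has no valid values at all, and subtracting a Timedelta from None would fail.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; utc=True makes the index tz-aware (UTC).
idx = pd.to_datetime(
    [
        "2022-03-17 13:21:00+00:00",
        "2022-03-22 13:24:00+00:00",
        "2022-03-27 13:27:00+00:00",
        "2022-03-30 13:30:00+00:00",
    ],
    utc=True,
)
data = pd.DataFrame({"Values": [15.2, 17.8, np.nan, np.nan]}, index=idx)
data.index.name = "timestamp"


def last_week_of_valid_data(df, column="Values", days=7):
    """Hypothetical helper: rows from the `days` days ending at the last non-NaN entry."""
    # Label slicing with timestamps that are not in the index needs a sorted index.
    df = df.sort_index()
    last = df[column].last_valid_index()
    if last is None:
        # The column is empty or all NaN: return an empty frame with the same columns.
        return df.iloc[0:0]
    # .loc label slices include both endpoints.
    return df.loc[last - pd.Timedelta(days=days):last]


dataW = last_week_of_valid_data(data)
print(dataW)
```

With the sample data, dataW holds the rows for 2022-03-17 and 2022-03-22. Returning an empty dataframe in the all-NaN case is just one possible choice; raising an error or falling back to the plain "last seven days" slice may suit your use case better.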