Pandas: Use .resample() and .mean() with skipna=False

Time:08-05

I'd like to bucket my dataframe by second and take the mean of each bucket. If a bucket contains any NaN values, the result for that bucket should be NaN.

Example:

import datetime as dt
import pandas as pd
idx = [dt.datetime(2009,9,9,13,59,1,1), dt.datetime(2009,9,9,13,59,1,2), dt.datetime(2009,9,9,13,59,2,0)]
df = pd.DataFrame(index=idx, data=[1,None,3])

>>> df
                                0
2009-09-09 13:59:01.000001 1.0000
2009-09-09 13:59:01.000002    nan
2009-09-09 13:59:02.000000 3.0000

And then

>>> df.resample("1S", label='right', closed='left').mean()
                         0
2009-09-09 13:59:02 1.0000
2009-09-09 13:59:03 3.0000
>>> 

However, what I want is

                         0
2009-09-09 13:59:02 nan
2009-09-09 13:59:03 3.0000

I assumed that passing skipna=False, i.e. .resample(...).mean(skipna=False), would give me the desired result, but it raises an error:

pandas.errors.UnsupportedFunctionCall: numpy operations are not valid with resample. Use .resample(...).mean() instead

What's the solution here? Elsewhere in pandas, .mean() accepts the skipna argument.

CodePudding user response:

Try the code below:

def custom_mean(val):
    return val.mean(skipna=False)

df.resample("1S", label='right', closed='left').agg({0:custom_mean})
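 
For anyone who wants to check this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question and passes the helper to .agg() directly instead of through a dict. The name nan_aware_mean is just an illustrative choice, and the lowercase "1s" alias is my assumption for newer pandas releases, where the uppercase "S" alias is deprecated.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Rebuild the frame from the question: column label 0, one NaN in the first second.
idx = pd.DatetimeIndex([
    dt.datetime(2009, 9, 9, 13, 59, 1, 1),
    dt.datetime(2009, 9, 9, 13, 59, 1, 2),
    dt.datetime(2009, 9, 9, 13, 59, 2, 0),
])
df = pd.DataFrame({0: [1.0, np.nan, 3.0]}, index=idx)


def nan_aware_mean(values):
    # Hypothetical helper name. skipna=False makes the mean NaN
    # as soon as the bucket contains at least one NaN.
    return values.mean(skipna=False)


result = df.resample("1s", label="right", closed="left").agg(nan_aware_mean)

# The 13:59:02 bucket holds 1.0 and NaN, so its mean is NaN;
# the 13:59:03 bucket holds only 3.0.
print(result)
```

Because the function runs in Python once per bucket, this is likely slower than the built-in .mean() on large frames, but it keeps the label and closed arguments from the question unchanged.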

CodePudding user response:

What you need is:

df.resample("1S").agg(pd.Series.mean, skipna=False)
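
Note that the one-liner above leaves out label='right' and closed='left', so its buckets are labelled by their left edge (13:59:01 and 13:59:02) rather than as in the question. A sketch that forwards skipna the same way but keeps the original arguments might look like this; it assumes .agg() passes extra keyword arguments through to the callable, and uses the lowercase "1s" alias for compatibility with newer pandas.

```python
import datetime as dt

import pandas as pd

# Same data as in the question.
idx = pd.DatetimeIndex([
    dt.datetime(2009, 9, 9, 13, 59, 1, 1),
    dt.datetime(2009, 9, 9, 13, 59, 1, 2),
    dt.datetime(2009, 9, 9, 13, 59, 2, 0),
])
df = pd.DataFrame({0: [1.0, None, 3.0]}, index=idx)

# skipna=False is forwarded by .agg() to pd.Series.mean for each bucket.
# label/closed are kept so the bucket labels match the desired output.
result = df.resample("1s", label="right", closed="left").agg(
    pd.Series.mean, skipna=False
)

print(result)
```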