asfreq in pandas returns an empty dataframe

Time:09-08

I'm trying to use infer_freq and asfreq, but the returned DataFrame is empty.

Original data set:

    month   interest
0   2004-01 13
1   2004-02 15
2   2004-03 17
3   2004-04 19
4   2004-05 22

Trying to convert the data to a different frequency:

ice_cream_interest = pd.read_csv('ice_cream_interest.csv')
ice_cream_interest.set_index('month', inplace=True)
ice_cream_interest = ice_cream_interest.asfreq(pd.infer_freq(ice_cream_interest.index))
    interest
month   
2004-01-01  NaN
2004-02-01  NaN
2004-03-01  NaN
2004-04-01  NaN
2004-05-01  NaN
... ...
2020-04-01  NaN
2020-05-01  NaN
2020-06-01  NaN
2020-07-01  NaN
2020-08-01  NaN

CodePudding user response:

To quote the documentation of asfreq():

The values corresponding to any timesteps in the new index which were not present in the original index will be null (NaN).
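As a small illustration of the quoted behaviour, here is a sketch with made-up data in which March is deliberately missing from a proper datetime index. The values and variable names are assumptions for the example only.

```python
import pandas as pd

# Hypothetical data: a datetime index with 2004-03-01 left out on purpose
idx = pd.to_datetime(['2004-01-01', '2004-02-01', '2004-04-01'])
gappy = pd.DataFrame({'interest': [13, 15, 19]}, index=idx)

# asfreq builds a regular month-start index from the first to the last date;
# the timestep that was not in the original index is filled with NaN
filled = gappy.asfreq('MS')
print(filled)
```

Rows that already existed keep their values; only the newly created timestep is null.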

The problem is that in your example the dates in the index are still strings, so the datetime entries that asfreq() generates for the new index do not exist in the original index. You can solve this by converting your "month" column to datetime objects first.

Please see the example below:

import pandas as pd
df = pd.DataFrame({
    'month': ['2004-01', '2004-02', '2004-03', '2004-04', '2004-05'],
    'interest': [13, 15, 17, 19, 22]
})

If we convert the months to datetime objects first, we get:

df['month'] = pd.to_datetime(df['month'])
df = df.set_index('month')
print(df.index.values)  # this is an array of datetime64 values
>>> ['2004-01-01T00:00:00.000000000' '2004-02-01T00:00:00.000000000'
 '2004-03-01T00:00:00.000000000' '2004-04-01T00:00:00.000000000'
 '2004-05-01T00:00:00.000000000']
df = df.asfreq(pd.infer_freq(df.index))
df
>>>
            interest
month               
2004-01-01        13
2004-02-01        15
2004-03-01        17
2004-04-01        19
2004-05-01        22

Doing the same without this conversion gives:

# df['month'] = pd.to_datetime(df['month'])
df = df.set_index('month')
print(df.index.values)  # this is still an array of strings
>>> ['2004-01' '2004-02' '2004-03' '2004-04' '2004-05']

df = df.asfreq(pd.infer_freq(df.index))
df
>>>
            interest
month               
2004-01-01       NaN
2004-02-01       NaN
2004-03-01       NaN
2004-04-01       NaN
2004-05-01       NaN
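Applied to the read_csv workflow from the question, the fix might look like the sketch below. It assumes the CSV has exactly the two columns shown in the question; an in-memory string stands in for 'ice_cream_interest.csv' so the example is self-contained, and the explicit format '%Y-%m' is an assumption based on the sample rows.

```python
import io
import pandas as pd

# Stand-in for 'ice_cream_interest.csv' (contents assumed from the question)
csv_text = "month,interest\n2004-01,13\n2004-02,15\n2004-03,17\n2004-04,19\n2004-05,22\n"
ice_cream_interest = pd.read_csv(io.StringIO(csv_text))

# Convert the strings to datetimes BEFORE the column becomes the index
ice_cream_interest['month'] = pd.to_datetime(ice_cream_interest['month'], format='%Y-%m')
ice_cream_interest = ice_cream_interest.set_index('month')

# The index is now a DatetimeIndex, so the existing rows line up with the new one
ice_cream_interest = ice_cream_interest.asfreq(pd.infer_freq(ice_cream_interest.index))
print(ice_cream_interest)
```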

CodePudding user response:

Given:

     month  interest
0  2004-01        13
1  2004-02        15
2  2004-03        17
3  2004-04        19
4  2004-05        22

Doing:

# Convert to datetime
df.month = pd.to_datetime(df.month)

# Set Index
df = df.set_index('month')

# Convert to freq
df = df.asfreq(pd.infer_freq(df.index))

Output:

>>> df
            interest
month
2004-01-01        13
2004-02-01        15
2004-03-01        17
2004-04-01        19
2004-05-01        22

>>> df.index
DatetimeIndex(['2004-01-01', '2004-02-01', '2004-03-01', '2004-04-01',
               '2004-05-01'],
              dtype='datetime64[ns]', name='month', freq='MS')

We can see that the index now carries freq='MS' (month start), so the conversion to a frequency index succeeded.
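One caveat: pd.infer_freq returns None when it cannot find a regular frequency. A defensive wrapper could check for that before calling asfreq. The helper name to_inferred_freq and the sample dates below are hypothetical, chosen only for this sketch.

```python
import pandas as pd

def to_inferred_freq(frame):
    """Return frame.asfreq(...) using the inferred frequency of its index.

    Hypothetical helper: raises ValueError if no frequency can be inferred.
    """
    freq = pd.infer_freq(frame.index)
    if freq is None:
        raise ValueError("no regular frequency could be inferred from the index")
    return frame.asfreq(freq)

# Regular monthly data: the frequency is inferred and applied
regular = pd.DataFrame(
    {'interest': [13, 15, 17, 19, 22]},
    index=pd.to_datetime(['2004-01', '2004-02', '2004-03', '2004-04', '2004-05'],
                         format='%Y-%m'),
)
converted = to_inferred_freq(regular)

# Irregularly spaced dates (made up): infer_freq finds nothing, so the helper raises
irregular = pd.DataFrame(
    {'interest': [1, 2, 3, 4]},
    index=pd.to_datetime(['2004-01-01', '2004-01-05', '2004-02-17', '2004-07-30']),
)
try:
    to_inferred_freq(irregular)
    raised = False
except ValueError:
    raised = True
```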
