Home > Software engineering >  Pandas: Working with missing values in monthly data
Pandas: Working with missing values in monthly data

Time:10-06

I want to work with a df with monthly data. Important: The data aggregates over the months. Looking like this:

Month   | Sum
01-2021    1
02-2021    4
03-2021    9
04-2021    11
05-2021    15
06-2021    16
08-2021    21
09-2021    25

Now I have two task:

  1. Detect if a month is missing and insert the month. In this example July 2021 (07-2021) is missing. In this case I want to create a new row with the month and use the previous sum (as this month does not contribute anything). But The df has to be complete without missing months.

  2. Quickly change between monthly and quarterly display. Transforming in:

Month   | Sum
Q1-2021    9
Q2-2021    16
Q3-2021    25

What I have already done: Converted the month column into datetime type as I think this might be the first step to the best solution here.

Thank you very much!

CodePudding user response:

I would work with the resample function. If Month is a column of the dataframe you should convert it into an index (df.set_index('Month')) and then resample, either on a monthly or quarterly basis:

ts.resample('M').bfill()

or

ts.resample('Q').bfill()

or one after the other. bfill fills the missing values based on the previous available record(s).

  • Related