Home > Mobile >  Calculate a column based on several values in different column
Calculate a column based on several values in different column

Time:05-12

My pandas df looks like something like this:

company ID  quarter     metric
  12       31-12-2019    54.3     
  12       30-09-2019    48.2
  12       30-06-2019    32.3
  12       30-03-2019    54.3
  23       31-12-2018    54.3
  23       30-09-2018    48.2
  23       30-06-2018    32.3
  23       30-03-2018    54.3
  45       31-12-2021    54.3
  45       30-09-2021    48.2
  45       30-06-2021    32.3
  45       30-03-2021    54.3
  45       31-12-2021    54.3
  45       30-09-2020    48.2
  45       30-06-2020    32.3
  45       30-03-2020    54.3
  ..           ..         ..

For each quarter row for each company ID I need to compute an annual value from the following quarters i.e for company ID = 45 and quarter = 30-06-2020 annual value would be equal to:

30-03-2021    54.3
31-12-2020    54.3
30-09-2020    48.2
30-06-2020    32.3
            --------
              189,1

Example result for one of the rows:

    company ID  quarter     metric   annual
      12       31-12-2019    54.3     
      12       30-09-2019    48.2
      12       30-06-2019    32.3
      12       30-03-2019    54.3
      23       31-12-2018    54.3
      23       30-09-2018    48.2
      23       30-06-2018    32.3
      23       30-03-2018    54.3
      45       31-12-2021    54.3
      45       30-09-2021    48.2
      45       30-06-2021    32.3
      45       30-03-2021    54.3
      45       31-12-2021    54.3
      45       30-09-2020    48.2
      45       30-06-2020    32.3   **189,1**
      45       30-03-2020    54.3
      ..           ..         ..

This df is quite big so have been wondering what would be an effective solution.

CodePudding user response:

The function df['annual'] = df[::-1].rolling(4).sum()[1] would be able to solve your problem.

The function creates calculates the sum of the number of columns of each variable in the dataframe from top to bottom, hence the [::-1]. The function outputs a dataframe so the last column is selected and added back to the original dataframe.

Hope this helps.

  • Related