Home > Software design >  how to create a new column where the rows are determined by the previous row (calculation)?
how to create a new column where the rows are determined by the previous row (calculation)?

Time:06-29

I have this table and I want to calculate inflation for every month in every city

I tried the lag and lead functions but the CPI was overlapping onto the next city for ex : Malmo's shifted CPI was Stockholm's 05-01-2005 CPI

CPI Date City
112 2005-01-01 Stockholm
113.5 2005-02-01 Stockholm
115 2005-03-01 Stockholm
115.6 2005-04-01 Stockholm
115.8 2005-05-01 Stockholm
106 2005-01-01 Malmo
107.5 2005--02-01 Malmo
110 2005-03-01 Malmo
113 2005-04-01 Malmo
117 2005-05-01 Malmo

CodePudding user response:

You need to group_by to avoid this issue.

# Read in data
inflation  <- read.table(text = "CPI    Date    City
112 2005-01-01  Stockholm
113.5   2005-02-01  Stockholm
115 2005-03-01  Stockholm
115.6   2005-04-01  Stockholm
115.8   2005-05-01  Stockholm
106 2005-01-01  Malmo
107.5   2005--02-01 Malmo
110 2005-03-01  Malmo
113 2005-04-01  Malmo
117 2005-05-01  Malmo", h = T)

# Perform calculation
library(dplyr)

inflation  |>
    group_by(City)  |>
    mutate(
        cpi_change = lead(CPI) - CPI,
        cpi_change_percent = cpi_change / CPI * 100
    )

Output:

# A tibble: 10 x 5
# # Groups:   City [2]
#      CPI Date        City      cpi_change cpi_change_percent
#    <dbl> <chr>       <chr>          <dbl>              <dbl>
#  1  112  2005-01-01  Stockholm      1.5                1.34
#  2  114. 2005-02-01  Stockholm      1.5                1.32
#  3  115  2005-03-01  Stockholm      0.600              0.522
#  4  116. 2005-04-01  Stockholm      0.200              0.173
#  5  116. 2005-05-01  Stockholm     NA                 NA
#  6  106  2005-01-01  Malmo          1.5                1.42
#  7  108. 2005--02-01 Malmo          2.5                2.33
#  8  110  2005-03-01  Malmo          3                  2.73
#  9  113  2005-04-01  Malmo          4                  3.54
# 10  117  2005-05-01  Malmo         NA                 NA

You will get NAs for the last month as we do not know the rate in the following year. Alternatively you can do it with lag instead of lead if you want to work out change from previous, but then you'll get NAs for the first month.

  • Related