Home > Blockchain >  Replace NAs with the outcome of an equation for each group
Replace NAs with the outcome of an equation for each group

Time:04-05

My data are as follows:

year site value dist 
2019 A    200   10
2019 B    NA    15
2019 C    50    30
2019 D    NA    36
2020 A    300   10
2020 B    NA    15
2020 C    100   30
2020 D    NA    36

I would like to replace the NA values for site B as follows.

Site B NA = the sum of the products of the following two equations:

        1) value of site A * (1/(dist of site B - dist of site A)) 

and

        2) value of site C * (1/(dist of site C - dist of site B))

I would like to do this for each year, ideally using dplyr. Thank you in advance!

CodePudding user response:

library(dplyr)

df %>%
  arrange(year, site) %>% 
  group_by(year) %>%
  mutate(value = ifelse(is.na(value) & site == "B", 
                        (lag(value) * (1/(dist - lag(dist))))   (lead(value)* (1/(lead(dist) - dist))), 
                        value))

Output

   year site  value  dist
  <int> <chr> <dbl> <int>
1  2019 A     200      10
2  2019 B      43.3    15
3  2019 C      50      30
4  2019 D      NA      36
5  2020 A     300      10
6  2020 B      66.7    15
7  2020 C     100      30
8  2020 D      NA      36

Data

df <- structure(list(year = c(2019L, 2019L, 2019L, 2019L, 2020L, 2020L, 
2020L, 2020L), site = c("A", "B", "C", "D", "A", "B", "C", "D"
), value = c(200L, NA, 50L, NA, 300L, NA, 100L, NA), dist = c(10L, 
15L, 30L, 36L, 10L, 15L, 30L, 36L)), class = "data.frame", row.names = c(NA, 
-8L))

CodePudding user response:

Update: Thanks to AndrewGB: wraped around parenthesis:

If I understand it correctly it should be like this?

library(dplyr)

df %>%
  group_by(year) %>% 
  mutate(value = ifelse(is.na(value) & site == "B", 
                        value[site=="A"]*(1/(dist[site=="B"] - dist[site=="A"]))   
                        value[site=="C"]*(1/(dist[site=="C"] - dist[site=="B"])), 
                        value)
  )
   year site  value  dist
  <int> <chr> <dbl> <int>
1  2019 A     200      10
2  2019 B      43.3    15
3  2019 C      50      30
4  2019 D      NA      36
5  2020 A     300      10
6  2020 B      66.7    15
7  2020 C     100      30
8  2020 D      NA      36
  •  Tags:  
  • r
  • Related