na.approx with multiple specific columns

Time:04-28

I have a data frame (DF) with many columns (colA, colB, colC, colD, ...). I would like to apply the na.approx function, with group_by, to several, but not all, columns in the data frame. I succeeded in applying na.approx with group_by to one column with the following:

DFxna <- DF %>% group_by(colA) %>% mutate(colB = na.approx(colB, na.rm = FALSE, maxgap = 4))

However, I was not able to write code that applies this to several specified columns. I thought that lapply would be appropriate, but my attempts with lapply were unsuccessful.

Answer:

Maybe this fits your need. As I mentioned in my comment, one option is to use dplyr::across.

Using some fake data:

library(zoo)
library(dplyr)

DF <- data.frame(
  colA = c(1, 1, 1, 2, 2, 2, 2),
  colB = c(1, NA, 3, 5, NA, NA, 6),
  colC = c(1, NA, 2, 8, NA, 9, 6),
  colD = c(1, NA, 3, 5, NA, NA, 6)
)

DF %>% 
  group_by(colA) %>% 
  mutate(across(c(colB, colC), ~ na.approx(.x, na.rm = FALSE, maxgap = 4)))
#> # A tibble: 7 × 4
#> # Groups:   colA [2]
#>    colA  colB  colC  colD
#>   <dbl> <dbl> <dbl> <dbl>
#> 1     1  1      1       1
#> 2     1  2      1.5    NA
#> 3     1  3      2       3
#> 4     2  5      8       5
#> 5     2  5.33   8.5    NA
#> 6     2  5.67   9      NA
#> 7     2  6      6       6
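
If the columns to interpolate are held in a character vector (for example, because the list is built programmatically), across() can take them through all_of(). The sketch below rebuilds the same fake data so it runs on its own; the names cols_to_fill, fill_gaps and DFbase are made up for illustration. Since the question mentioned lapply, a base R variant using lapply together with ave is included as well. On this data it should give the same values, but treat it as one possible approach rather than the definitive one.

```r
library(zoo)
library(dplyr)

DF <- data.frame(
  colA = c(1, 1, 1, 2, 2, 2, 2),
  colB = c(1, NA, 3, 5, NA, NA, 6),
  colC = c(1, NA, 2, 8, NA, 9, 6),
  colD = c(1, NA, 3, 5, NA, NA, 6)
)

# Hypothetical name: the columns to interpolate, given as strings
cols_to_fill <- c("colB", "colC")

# dplyr variant: all_of() selects the columns named in the character vector
DFxna <- DF %>%
  group_by(colA) %>%
  mutate(across(all_of(cols_to_fill),
                ~ na.approx(.x, na.rm = FALSE, maxgap = 4))) %>%
  ungroup()

# Base R variant: lapply() loops over the chosen columns,
# ave() applies the interpolation within each colA group
fill_gaps <- function(v) na.approx(v, na.rm = FALSE, maxgap = 4)
DFbase <- DF
DFbase[cols_to_fill] <- lapply(
  DF[cols_to_fill],
  function(x) ave(x, DF$colA, FUN = fill_gaps)
)
```

In both variants colD is left untouched because it is not listed in cols_to_fill.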