Apply dplyr::starts_with() with lambda function

Time:12-19

I have the following implementation:

library(dplyr)
library(tidyr)
dat = data.frame('A' = 1:3, 'C_1' = 1:3, 'C_2' = 1:3, 'M' = 1:3)

The following works:

dat %>% rowwise %>% mutate(Anew = list({function(x) c(x[1]^2, x[2] + 5, x[3] + 1)}(c(M, C_1, C_2)))) %>% ungroup %>% unnest_wider(Anew, names_sep = "")

However, the following does not work when I try to select the columns by name using dplyr::starts_with():

dat %>% rowwise %>% mutate(Anew = list({function(x) c(x[1]^2, x[2] + 5, x[3] + 1)}(c(M, starts_with('C_'))))) %>% ungroup %>% unnest_wider(Anew, names_sep = "")

Any pointers on how to correctly apply starts_with() in this context would be very helpful.

PS: This is a continuation of my earlier post, "Apply custom function that returns multiple values after dplyr::rowwise()".

CodePudding user response:

starts_with() must be used within a selecting function such as select(), so:

dat %>%
  rowwise %>%
  mutate(Anew = list({function(x) c(x[1]^2, x[2] + 5, x[3] + 1)}
    (select(cur_data(), M, starts_with('C_'))))) %>%
  ungroup %>%
  unnest_wider(Anew, names_sep = "")
## # A tibble: 3 × 7
##       A   C_1   C_2     M AnewM AnewC_1 AnewC_2
##   <int> <int> <int> <int> <dbl>   <dbl>   <dbl>
## 1     1     1     1     1     1       6       2
## 2     2     2     2     2     4       7       3
## 3     3     3     3     3     9       8       4
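As a variation on the approach above, here is a minimal sketch that assumes dplyr 1.1.0 or later, where cur_data() is deprecated in favour of pick(). pick() is itself a selecting context, so starts_with() can be used inside it directly. The helper name f and the unlist() call (which turns the one-row tibble into a named vector) are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

dat <- data.frame(A = 1:3, C_1 = 1:3, C_2 = 1:3, M = 1:3)

# Hypothetical helper: the same transformation as in the question
f <- function(x) c(x[1]^2, x[2] + 5, x[3] + 1)

res <- dat %>%
  rowwise() %>%
  # pick() returns a one-row tibble for the current row;
  # unlist() flattens it to a named vector in the order M, C_1, C_2
  mutate(Anew = list(f(unlist(pick(M, starts_with("C_")))))) %>%
  ungroup() %>%
  unnest_wider(Anew, names_sep = "")

res
```

Because the vector passed to f keeps the column names, the new columns should be named AnewM, AnewC_1 and AnewC_2, as in the output above.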

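group_modify() would also work here, and it allows formula notation for the anonymous function. Note that .x excludes the grouping column A, so .x[1], .x[2] and .x[3] refer to C_1, C_2 and M, in that order: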

dat %>%
  group_by(A) %>%
  group_modify(~ cbind(.x, Anew = c(.x[1]^2, .x[2] + 5, .x[3] + 1))) %>%
  ungroup
## # A tibble: 3 × 7
##       A   C_1   C_2     M Anew.C_1 Anew.C_2 Anew.M
##   <int> <int> <int> <int>    <dbl>    <dbl>  <dbl>
## 1     1     1     1     1        1        6      2
## 2     2     2     2     2        4        7      3
## 3     3     3     3     3        9        8      4

CodePudding user response:

If we wrap starts_with() in c_across(), and assuming there is a third column whose name starts with C_, the lambda function can be applied on the fly:

library(dplyr)
library(tidyr)
dat %>%
  rowwise %>%
  mutate(Anew = list((function(x) c(x[1]^2, x[2] + 5,
      x[3] + 1))(c_across(starts_with("C_"))))) %>%
  unnest_wider(Anew, names_sep = "")

-output

# A tibble: 3 × 8
      A   C_1   C_2   C_3     M Anew1 Anew2 Anew3
  <int> <int> <int> <int> <int> <dbl> <dbl> <dbl>
1     1     1     1     1     1     1     6     2
2     2     2     2     2     2     4     7     3
3     3     3     3     3     3     9     8     4
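For the data in the question, which has only C_1 and C_2, a sketch along the same lines could select M together with the C_ columns inside c_across(). This assumes that c_across() accepts a tidyselect expression such as c(M, starts_with("C_")) and returns the values in that order, which is how it is documented to behave; the data frame below is the four-column one from the question.

```r
library(dplyr)
library(tidyr)

# Data as in the question: only two columns start with "C_"
dat <- data.frame(A = 1:3, C_1 = 1:3, C_2 = 1:3, M = 1:3)

res <- dat %>%
  rowwise() %>%
  # c_across() is a selecting context, so starts_with() is allowed here;
  # the resulting vector holds M, C_1, C_2 for the current row
  mutate(Anew = list((function(x) c(x[1]^2, x[2] + 5, x[3] + 1))(
    c_across(c(M, starts_with("C_")))))) %>%
  ungroup() %>%
  unnest_wider(Anew, names_sep = "")

res
```

Since c_across() returns an unnamed vector, the new columns are numbered (Anew1, Anew2, Anew3) rather than named after the source columns.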

Or, instead of working rowwise, we may create a named list of functions and apply them column-wise with across(), which would be more efficient:

fns <- list(C_1 = function(x) x^2, C_2 = function(x) x + 5,
      C_3 = function(x) x + 1)
dat %>%
   mutate(across(starts_with("C_"), 
    ~ fns[[cur_column()]](.x), .names = "Anew{seq_along(.fn)}"))

-output

  A C_1 C_2 C_3 M Anew1 Anew2 Anew3
1 1   1   1   1 1     1     6     2
2 2   2   2   2 2     4     7     3
3 3   3   3   3 3     9     8     4

data

dat <- data.frame('A' = 1:3, 'C_1' = 1:3, 'C_2' = 1:3, C_3 = 1:3, 'M' = 1:3)
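If numbered output names are not required, a slightly simpler sketch of the column-wise approach names each new column after its source column via the {.col} placeholder in .names. The "Anew_" prefix is an arbitrary choice for illustration.

```r
library(dplyr)

dat <- data.frame(A = 1:3, C_1 = 1:3, C_2 = 1:3, C_3 = 1:3, M = 1:3)

# One function per column, looked up by the name of the column being processed
fns <- list(C_1 = function(x) x^2,
            C_2 = function(x) x + 5,
            C_3 = function(x) x + 1)

res <- dat %>%
  mutate(across(starts_with("C_"),
                ~ fns[[cur_column()]](.x),   # cur_column() gives "C_1", "C_2", ...
                .names = "Anew_{.col}"))     # yields Anew_C_1, Anew_C_2, Anew_C_3

res
```

The lookup by cur_column() means the list names in fns must match the selected column names exactly; a column starting with C_ that has no entry in fns would cause an error.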