Apply custom function to grouped or selected columns

Time:11-08

I'd like to recode certain items in my data frame: those whose names contain an even number (N2, N4, N6, E2, etc.). To each value in these columns I'd like to apply abs(x - 6) (see my function below). I then need two additional columns holding the row-wise means of the items in each category: mean(N) and mean(E) for each row.

Example code:

library(dplyr)  # provides tibble() and the verbs used below
df1 <- tibble(id = 1:5,
          N1 = c(4,3,2,5,4),
          N2 = c(1,1,3,2,5),
          N3 = c(5,5,2,4,3),
          N4 = c(4,2,2,2,1), 
          N5 = c(1,1,4,2,3),
          N6 = c(5,2,4,3,1),
          E1 = c(1,2,3,1,1),
          E2 = c(5,2,3,1,1), 
          E3 = c(2,2,1,3,1),
          E4 = c(1,1,1,3,2), 
          E5 = c(2,3,1,4,4), 
          E6 = c(3,2,3,3,1))

My function:

recode_items <- function(reverse_items) {
  items <- abs(reverse_items - 6)
  return(items)
}

E.g.

recode_items(c(5,2,3,1,1))
[1] 1 4 3 5 5

My code:

recoded_df1 <- df1 |>
  group_by(ends_with(c("2","4","6"))) |>
  group_modify(~ recode_items(.x)) |>
  ungroup() |>
  mutate(N = mean(N1:N6),
         E = mean(E1:E6))

My code doesn't work: I get error messages for the line group_by(ends_with(c("2","4","6"))). I tried many variants, including filter(), select(), select_at(), etc.

Thanks for your help!

CodePudding user response:

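library(dplyr)
df1 %>%
  # recode every column whose name contains 2, 4 or 6
  mutate(across(matches("(2|4|6)"), recode_items),
         # row-wise means, computed on the already recoded columns
         N = rowMeans(across(N1:N6)),
         E = rowMeans(across(E1:E6)))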

Output:

# A tibble: 5 × 15
     id    N1    N2    N3    N4    N5    N6    E1    E2    E3    E4    E5    E6     N     E
  <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     4     5     5     2     1     1     1     1     2     5     2     3  3     2.33
2     2     3     5     5     4     1     4     2     4     2     5     3     4  3.67  3.33
3     3     2     3     2     4     4     2     3     3     1     5     1     3  2.83  2.67
4     4     5     4     4     4     2     3     1     5     3     3     4     3  3.67  3.17
5     5     4     1     3     5     3     5     1     5     1     4     4     5  3.5   3.33
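
For context on why this works where the original attempt did not: selection helpers such as ends_with() are meant for selecting contexts like select() or across(), not group_by(), and mean(N1:N6) inside mutate() does not average across columns within a row. The sketch below rebuilds the example data and checks the first row by hand. The anchored pattern "[246]$" and the name `recoded` are illustrative choices, not part of the answer above, and the pattern assumes the item number always ends the column name.

```r
library(dplyr)

# Example data and function from the question
df1 <- tibble(id = 1:5,
              N1 = c(4,3,2,5,4), N2 = c(1,1,3,2,5), N3 = c(5,5,2,4,3),
              N4 = c(4,2,2,2,1), N5 = c(1,1,4,2,3), N6 = c(5,2,4,3,1),
              E1 = c(1,2,3,1,1), E2 = c(5,2,3,1,1), E3 = c(2,2,1,3,1),
              E4 = c(1,1,1,3,2), E5 = c(2,3,1,4,4), E6 = c(3,2,3,3,1))

recode_items <- function(reverse_items) abs(reverse_items - 6)

# "[246]$" only matches names that END in 2, 4 or 6 (assumed naming scheme)
recoded <- df1 %>%
  mutate(across(matches("[246]$"), recode_items),
         N = rowMeans(across(N1:N6)),
         E = rowMeans(across(E1:E6)))

# Row 1 by hand: the N items become 4, 5, 5, 2, 1, 1 after recoding
stopifnot(abs(recoded$N[1] - mean(c(4, 5, 5, 2, 1, 1))) < 1e-9)
```

Because mutate() evaluates its arguments in order, the two rowMeans() calls see the recoded values rather than the raw ones.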

CodePudding user response:

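Here is a base R solution. The key is to recycle the logical vector c(FALSE, TRUE) across the columns, which picks every second column. This relies on the column order: once id is dropped, the even-numbered items must sit in the even positions, as they do here. Note that tibbles do not recycle a logical column index this way, so I convert to a traditional data frame first.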

#convert from tibble
df2 <- as.data.frame(df1)

#apply function to all even columns
df2[,-1][,c(FALSE, TRUE)] <- abs(df2[,-1][,c(FALSE, TRUE)]-6)

#calculate row means per group
df2$N <- rowMeans(df2[,grepl("N", colnames(df2))])
df2$E <- rowMeans(df2[,grepl("E", colnames(df2))])

df2
#>   id N1 N2 N3 N4 N5 N6 E1 E2 E3 E4 E5 E6        N        E
#> 1  1  4  5  5  2  1  1  1  1  2  5  2  3 3.000000 2.333333
#> 2  2  3  5  5  4  1  4  2  4  2  5  3  4 3.666667 3.333333
#> 3  3  2  3  2  4  4  2  3  3  1  5  1  3 2.833333 2.666667
#> 4  4  5  4  4  4  2  3  1  5  3  3  4  3 3.666667 3.166667
#> 5  5  4  1  3  5  3  5  1  5  1  4  4  5 3.500000 3.333333
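
The recycling step can be seen in isolation on a smaller example. This is only a sketch: `toy` and `even_cols` are hypothetical names and the data are made up, not taken from the question. It also shows one way to turn the positional selection into column names first, which makes it easier to inspect what will be recoded.

```r
# Small stand-in data frame (made-up values, not the data from the question)
toy <- data.frame(id = 1:2,
                  a1 = c(1, 2), a2 = c(3, 4),
                  b1 = c(5, 1), b2 = c(2, 2))

# Drop id, then let c(FALSE, TRUE) recycle over the remaining four names:
# it expands to FALSE, TRUE, FALSE, TRUE and keeps positions 2 and 4
even_cols <- names(toy[, -1])[c(FALSE, TRUE)]
even_cols

# Same recode as in the answer, applied by name rather than by position
toy[even_cols] <- abs(toy[even_cols] - 6)
toy
```

Selecting by name after the recycling step does not remove the dependence on column order, but it makes a mismatch visible before any values are overwritten.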