For example, I want to calculate percentiles for these columns:
list_of_col <- total[ -c(2:8,16:21) ]
I know how to calculate the percentile for one column (B is the grouping column, which has index 1, and A is one of the columns from list_of_col):
total<- total %>%
group_by(B) %>%
mutate(A = rank(A)/length(A))
I'm looking for something like this, but I don't know what to put in place of X:
total<- total %>%
group_by(B) %>%
mutate_at(list_of_col, X )
CodePudding user response:
Try something like this (I did not run it):
total<- total %>%
group_by(B) %>%
mutate_at(list_of_col, function(x) rank(x)/length(x))
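One thing to watch: mutate_at expects column names or positions, while total[ -c(2:8,16:21) ] returns a data frame. Below is a minimal sketch of how the selection could be turned into names, assuming that is the intent. The small data frame and its skip column are hypothetical stand-ins for the real total, and the grouping column B is dropped from the selection to be safe.

```r
library(dplyr)

# Hypothetical stand-in for `total`: B is the grouping column (index 1),
# `skip` plays the role of the columns excluded by position.
total <- data.frame(
  B    = c("a", "a", "a", "b", "b"),
  X1   = c(3, 1, 2, 10, 20),
  skip = c(0, 0, 0, 0, 0),
  X2   = c(9, 8, 7, 6, 5)
)

# Build a character vector of column names instead of a data frame subset:
# drop the unwanted position (3 here) and the grouping column B.
list_of_col <- setdiff(names(total)[-3], "B")

# Apply the same rank-based percentile to every selected column, per group.
total <- total %>%
  group_by(B) %>%
  mutate_at(list_of_col, function(x) rank(x) / length(x)) %>%
  ungroup()

print(total)
```

With the real data, the -3 would be replaced by -c(2:8, 16:21).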
CodePudding user response:
dplyr::mutate_at is superseded. You can still use it, but here is a more "modern" variant using across():
library("tidyverse")
set.seed(1234)
n <- 100
total <- tibble(
B = sample(letters, n, replace = TRUE),
X1 = rnorm(n),
X2 = rnorm(n),
X3 = rnorm(n)
)
total %>%
group_by(B) %>%
mutate(
across(X1:X3, percent_rank)
)
#> # A tibble: 100 × 4
#> # Groups:   B [26]
#>    B        X1    X2    X3
#>    <chr> <dbl> <dbl> <dbl>
#>  1 p     0     0.25  0.5
#>  2 z     0.25  0.5   0
#>  3 v     0.5   0.833 0.5
#>  4 e     0.667 1     1
#>  5 l     0     0     1
#>  6 o     0.25  0     1
#>  7 i     0.5   0     1
#>  8 e     1     0.333 0.667
#>  9 f     0.8   0.2   0.6
#> 10 p     0.25  1     0
#> # … with 90 more rows
Created on 2022-07-09 by the reprex package (v2.0.1)
Instead of your percentile function, I used dplyr::percent_rank; with it, percentiles start at 0 rather than 1 / length(x).
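To make the difference concrete, here is a small sketch that computes both variants side by side with across(). The toy tibble and the _rank / _pct suffixes are made up for illustration. If you want to keep your original formula, you can pass it to across() as a lambda.

```r
library(dplyr)

# Hypothetical toy data: two groups, no ties.
toy <- tibble(
  B  = c("a", "a", "a", "a", "b", "b"),
  X1 = c(10, 20, 30, 40, 5, 1),
  X2 = c(4, 3, 2, 1, 7, 9)
)

result <- toy %>%
  group_by(B) %>%
  mutate(
    # The formula from the question: values lie in (0, 1], starting at 1 / length(x).
    across(c(X1, X2), ~ rank(.x) / length(.x), .names = "{.col}_rank"),
    # dplyr::percent_rank: values lie in [0, 1], starting at 0.
    across(c(X1, X2), percent_rank, .names = "{.col}_pct")
  ) %>%
  ungroup()

print(result)
```

For group "a", X1_rank comes out as 0.25, 0.5, 0.75, 1, while X1_pct is 0, 1/3, 2/3, 1.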