Calculate the percentile of each column

Time:07-10

For example, I want to calculate the percentile for these columns:

list_of_col <- total[ -c(2:8,16:21) ]

I know how to calculate the percentile for one column (B is the grouping column, which for example has index 1, and A is one of the columns in list_of_col):

total <- total %>%
  group_by(B) %>%
  mutate(A = rank(A)/length(A))

I'm looking for something like this, but I don't know what to put in place of X:

total <- total %>%
  group_by(B) %>%
  mutate_at(list_of_col, X)

CodePudding user response:

Try something like this (I did not run it)

total <- total %>%
  group_by(B) %>%
  mutate_at(list_of_col, function(x) rank(x)/length(x))
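One thing to watch for: mutate_at() and across() pick columns by name or position, so list_of_col probably needs to be a vector of column names rather than a data frame subset. A minimal sketch of that idea, assuming a small made-up tibble (the data and the helper name `cols` are illustrative and not taken from the question):

```r
library(dplyr)

# Made-up example data: B is the grouping column, X1 and X2 are value columns.
total <- tibble(
  B  = c("a", "a", "a", "b", "b"),
  X1 = c(10, 30, 20, 5, 1),
  X2 = c(3, 1, 2, 7, 9)
)

# Hypothetical helper: the names of every column except the grouping column.
cols <- setdiff(names(total), "B")

# Apply the question's formula, rank(x) / length(x), to each selected column
# within each group of B.
result <- total %>%
  group_by(B) %>%
  mutate(across(all_of(cols), ~ rank(.x) / length(.x))) %>%
  ungroup()

print(result)
```

Keeping the grouping column out of `cols` matters here, since the grouping column itself should not be transformed.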

CodePudding user response:

dplyr::mutate_at is superseded by across(). You can still use it, but here is a more "modern" variant.

library("tidyverse")

set.seed(1234)

n <- 100

total <- tibble(
  B = sample(letters, n, replace = TRUE),
  X1 = rnorm(n),
  X2 = rnorm(n),
  X3 = rnorm(n)
)

total %>%
  group_by(B) %>%
  mutate(
    across(X1:X3, percent_rank)
  )
#> # A tibble: 100 × 4
#> # Groups:   B [26]
#>    B        X1    X2    X3
#>    <chr> <dbl> <dbl> <dbl>
#>  1 p     0     0.25  0.5  
#>  2 z     0.25  0.5   0    
#>  3 v     0.5   0.833 0.5  
#>  4 e     0.667 1     1    
#>  5 l     0     0     1    
#>  6 o     0.25  0     1    
#>  7 i     0.5   0     1    
#>  8 e     1     0.333 0.667
#>  9 f     0.8   0.2   0.6  
#> 10 p     0.25  1     0    
#> # … with 90 more rows

Created on 2022-07-09 by the reprex package (v2.0.1)

Instead of your percentile function, I used dplyr::percent_rank; with it, percentiles start at 0 rather than 1 / length(x).
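To make the difference concrete, here is a small side-by-side sketch on a made-up vector (the variable names are just for illustration). As far as I know, percent_rank rescales the minimum rank to the range 0 to 1, i.e. (min_rank(x) - 1) / (n - 1), while rank() averages tied ranks by default, so the two can also differ when there are ties.

```r
library(dplyr)

# Made-up vector with no ties, for illustration only.
x <- c(10, 20, 30, 40, 50)

# The formula from the question: the smallest value gets 1 / length(x).
manual <- rank(x) / length(x)

# dplyr's version: the smallest value gets 0 and the largest gets 1.
scaled <- percent_rank(x)

print(manual)
print(scaled)
```

If you need the percentiles to start at 1 / length(x), you can pass your own function to across() instead of percent_rank.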
