Modifying order of cases in dataframe R

Time:10-09

I have a problem with some R code: sports data spanning a number of years is not sorted in a logical order. The dataset has 42 variables and almost 80,000 cases; a simplified version is below:

dat <- c(2020, 2020, 2020, 2020, 2020, 2020, 2020)
r <- c("QF", "R1", "R15", "R2", "R25", "R3", "SF")
data <- data.frame(dat, r)

Obviously each case has just one of the round values, not all of them, and there are many more than 26 cases.

The problem is that rather than being ordered R1-R25, followed by QF, SF and GF, it is ordered GF, QF, R1, R10-R19, R2, R21-R25, R3-R9, SF. This is obviously because the values are sorted as text, character by character, rather than by round number.

This is how I want it to look, but I can't go through 80,000 cases manually like this:

dat <- c(2020, 2020, 2020, 2020, 2020, 2020, 2020)
r <- c("R1", "R2", "R3", "R15", "R25", "QF", "SF")
data <- data.frame(dat, r)

Thanks :)

CodePudding user response:

Since you want "QF" and "SF" at the end, one option is to extract the number from the r column and order by it. "QF" and "SF" have no numeric part, so they return NA and are ordered last.

result <- data[order(as.numeric(stringr::str_extract(data$r, '\\d+'))), ]

#   dat   r
#2 2020  R1
#4 2020  R2
#6 2020  R3
#3 2020 R15
#5 2020 R25
#1 2020  QF
#7 2020  SF
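
If stringr is not available, the same idea can be sketched in base R. This is only an illustration under a few assumptions: the column names dat and r come from the question, the finals order QF, SF, GF is taken from the question's description, a "GF" row is added to the toy data, and the names round_num and finals are made up here. Passing a second key to order() ranks the finals explicitly instead of leaving them in whatever order they happen to appear.

```r
# Toy data from the question, with a "GF" row added for illustration
data <- data.frame(
  dat = rep(2020, 8),
  r   = c("QF", "R1", "R15", "R2", "R25", "R3", "SF", "GF")
)

# Strip every non-digit; "QF", "SF" and "GF" become "" and convert to NA
round_num <- as.numeric(gsub("\\D", "", data$r))

# Assumed order of the finals (hypothetical helper vector)
finals <- c("QF", "SF", "GF")

# order() puts NA last by default; match() breaks ties among the finals
result <- data[order(round_num, match(data$r, finals)), ]
result$r
```

Because whole rows are indexed, the other columns of the data frame stay aligned with r.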

CodePudding user response:

Here's a tidyverse solution:

library(tidyverse)

# Sort whole rows by the natural order of r, so the other columns stay aligned
data %>%
  arrange(match(r, str_sort(unique(r), numeric = TRUE)))

Edit:

To arrange as "R, Q, S", you can substring your r variable and apply a custom sort using arrange and match:

# Group by first letter (R, then Q, then S), then natural order within each group
data %>%
  arrange(match(str_sub(r, 1, 1), c("R", "Q", "S")),
          match(r, str_sort(unique(r), numeric = TRUE)))

This gives us:

   dat   r
1 2020  R1
2 2020  R2
3 2020  R3
4 2020 R15
5 2020 R25
6 2020  QF
7 2020  SF
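
Another common approach, not used in the answers above, is to turn r into a factor whose levels are listed in the desired order, and then sort on the factor. The sketch below assumes the rounds run from R1 to R25 followed by QF, SF and GF, as described in the question; round_levels is a name chosen for illustration.

```r
# Toy data from the question
data <- data.frame(
  dat = rep(2020, 7),
  r   = c("QF", "R1", "R15", "R2", "R25", "R3", "SF")
)

# Assumed full set of rounds in playing order: R1..R25, then the finals
round_levels <- c(paste0("R", 1:25), "QF", "SF", "GF")

# A factor sorts by level position, not alphabetically
data$r <- factor(data$r, levels = round_levels)
result <- data[order(data$r), ]
as.character(result$r)
```

Any value of r that is missing from round_levels would become NA, so it is worth checking anyNA(data$r) after the conversion.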