I've seen many posts about arrange() issues, but none of them solved my problem, so hopefully this is not a duplicate. I have some columns named "Q1", "Q2", "Q3" and so on. After calculating some basic descriptive stats with rstatix::get_summary_stats(), I need to arrange the new column variable in ascending numeric order (i.e., Q1 before Q2 before Q3, etc.). I'm sure this is a silly problem, but I can't see what I'm doing wrong.
- the raw data looks like this:
ID Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15
1 PART1 4 1 1 5 5 5 1 5 1 1 3 5 5 1 5
2 PART2 5 4 1 5 5 4 1 5 2 1 3 5 4 1 5
3 PART3 2 4 3 5 5 4 1 5 2 1 3 5 4 1 5
so on...
- My attempt:
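library(dplyr)  # provides %>%, mutate_if() and arrange()

descriptive <- data %>%
  rstatix::get_summary_stats(show = c("mean", "sd", "median", "iqr", "min", "max")) %>%
  mutate_if(is.numeric, round, 2) %>%
  # variable is a character column, so this sorts alphabetically
  dplyr::arrange(variable)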
- The first 10 lines:
# A tibble: 15 x 8
variable n mean sd median iqr min max
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Q1 63 3.94 1.03 4 2 2 5
2 Q10 63 1.84 0.88 2 2 1 3
3 Q11 63 2.62 1.31 3 3 1 5
4 Q12 63 3.98 1.01 4 2 2 5
5 Q13 63 4.33 0.8 5 1 2 5
6 Q14 63 1.91 0.88 2 2 1 4
7 Q15 63 4.25 0.95 5 1 2 5
8 Q2 63 2.86 1.58 3 3 1 5
9 Q3 63 1.97 1.06 2 2 1 4
10 Q4 63 3.98 1.04 4 2 2 5
Note: I've already tried ungroup() and across(starts_with("Q*")), but nothing works. Any thoughts would be much appreciated, thanks in advance.
- data:
> dput(descriptive)[1:10, ]
structure(list(variable = c("Q1", "Q10", "Q11", "Q12", "Q13",
"Q14", "Q15", "Q2", "Q3", "Q4", "Q5", "Q6", "Q7", "Q8", "Q9"),
n = c(63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63,
63, 63), mean = c(3.94, 1.84, 2.62, 3.98, 4.33, 1.91, 4.25,
2.86, 1.97, 3.98, 4.21, 4.05, 2.38, 4.03, 2.25), sd = c(1.03,
0.88, 1.31, 1.01, 0.8, 0.88, 0.95, 1.58, 1.06, 1.04, 0.94,
1.04, 1.36, 1.05, 1.12), median = c(4, 2, 3, 4, 5, 2, 5,
3, 2, 4, 4, 4, 2, 4, 2), iqr = c(2, 2, 3, 2, 1, 2, 1, 3,
2, 2, 1, 2, 2.5, 2, 2), min = c(2, 1, 1, 2, 2, 1, 2, 1, 1,
2, 2, 1, 1, 2, 1), max = c(5, 3, 5, 5, 5, 4, 5, 5, 4, 5,
5, 5, 5, 5, 5)), row.names = c(NA, -15L), class = c("tbl_df",
"tbl", "data.frame"))
CodePudding user response:
How about just using arrange() on the integer part of variable?
descriptive %>% arrange(as.integer(gsub("Q","",variable)))
Output:
# A tibble: 15 × 8
variable n mean sd median iqr min max
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Q1 63 3.94 1.03 4 2 2 5
2 Q2 63 2.86 1.58 3 3 1 5
3 Q3 63 1.97 1.06 2 2 1 4
4 Q4 63 3.98 1.04 4 2 2 5
5 Q5 63 4.21 0.94 4 1 2 5
6 Q6 63 4.05 1.04 4 2 1 5
7 Q7 63 2.38 1.36 2 2.5 1 5
8 Q8 63 4.03 1.05 4 2 2 5
9 Q9 63 2.25 1.12 2 2 1 5
10 Q10 63 1.84 0.88 2 2 1 3
11 Q11 63 2.62 1.31 3 3 1 5
12 Q12 63 3.98 1.01 4 2 2 5
13 Q13 63 4.33 0.8 5 1 2 5
14 Q14 63 1.91 0.88 2 2 1 4
15 Q15 63 4.25 0.95 5 1 2 5
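The reason the original arrange(variable) puts Q10 before Q2 is that variable is a character column, and strings are compared character by character. A minimal base R sketch of the same idea, using a made-up vector of labels rather than the real tibble:

```r
# Hypothetical labels standing in for the `variable` column
variable <- c("Q1", "Q10", "Q11", "Q2", "Q3", "Q9")

# Sorting the strings compares them character by character,
# so "Q10" lands before "Q2" (the "1" sorts before the "2")
print(sort(variable))

# Strip the leading "Q" and convert to integer to get a numeric sort key
key <- as.integer(sub("^Q", "", variable))

# order() returns the permutation that sorts by that key
sorted <- variable[order(key)]
print(sorted)
```

This assumes every label is a "Q" followed only by digits; anything else would turn into NA in the key.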
CodePudding user response:
We could use mixedorder() from gtools, which would work even if the values have different prefixes:
library(dplyr)
descriptive %>%
  arrange(order(gtools::mixedorder(variable)))
-output
# A tibble: 15 × 8
variable n mean sd median iqr min max
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Q1 63 3.94 1.03 4 2 2 5
2 Q2 63 2.86 1.58 3 3 1 5
3 Q3 63 1.97 1.06 2 2 1 4
4 Q4 63 3.98 1.04 4 2 2 5
5 Q5 63 4.21 0.94 4 1 2 5
6 Q6 63 4.05 1.04 4 2 1 5
7 Q7 63 2.38 1.36 2 2.5 1 5
8 Q8 63 4.03 1.05 4 2 2 5
9 Q9 63 2.25 1.12 2 2 1 5
10 Q10 63 1.84 0.88 2 2 1 3
11 Q11 63 2.62 1.31 3 3 1 5
12 Q12 63 3.98 1.01 4 2 2 5
13 Q13 63 4.33 0.8 5 1 2 5
14 Q14 63 1.91 0.88 2 2 1 4
15 Q15 63 4.25 0.95 5 1 2 5
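The extra order() wrapped around mixedorder() in that call may look odd. As far as I understand it, mixedorder() returns a permutation (which row goes first, second, ...), while arrange() expects one sort key per row; applying order() to a permutation inverts it into such per-row ranks. A small base R sketch of that step, where the permutation is built with a plain numeric key as a stand-in for gtools::mixedorder() and the labels are made up:

```r
# Hypothetical labels, deliberately out of order
variable <- c("Q10", "Q1", "Q3", "Q2")

# Permutation that sorts the labels numerically; this stands in for
# what gtools::mixedorder(variable) would return
perm <- order(as.integer(sub("^Q", "", variable)))

# perm[i] is the index of the row that belongs in position i.
# order(perm) inverts it: rank_key[j] is the position row j belongs in.
rank_key <- order(perm)

# Sorting rows by rank_key gives the same result as indexing by perm
by_rank <- variable[order(rank_key)]
print(rank_key)
print(by_rank)
```

Here rank_key is what arrange() ends up sorting on: one number per row saying where that row should go.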
Or with parse_number() from readr:
descriptive %>%
  arrange(readr::parse_number(variable))
CodePudding user response:
There are already better solutions. Just for fun:
We could split the variable column with the regex (?<=[A-Za-z])(?=[0-9]) and then arrange:
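library(tidyr)
library(dplyr)
descriptive %>%
  # split "Q10" into the letter part and the number part, keeping variable
  separate(variable, c("quarter", "number"), sep = "(?<=[A-Za-z])(?=[0-9])", remove = FALSE) %>%
  # sort by prefix first, then by the number as a numeric value
  arrange(quarter, as.numeric(number)) %>%
  # drop the helper columns again
  select(-c(quarter, number))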
variable n mean sd median iqr min max
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Q1 63 3.94 1.03 4 2 2 5
2 Q2 63 2.86 1.58 3 3 1 5
3 Q3 63 1.97 1.06 2 2 1 4
4 Q4 63 3.98 1.04 4 2 2 5
5 Q5 63 4.21 0.94 4 1 2 5
6 Q6 63 4.05 1.04 4 2 1 5
7 Q7 63 2.38 1.36 2 2.5 1 5
8 Q8 63 4.03 1.05 4 2 2 5
9 Q9 63 2.25 1.12 2 2 1 5
10 Q10 63 1.84 0.88 2 2 1 3
11 Q11 63 2.62 1.31 3 3 1 5
12 Q12 63 3.98 1.01 4 2 2 5
13 Q13 63 4.33 0.8 5 1 2 5
14 Q14 63 1.91 0.88 2 2 1 4
15 Q15 63 4.25 0.95 5 1 2 5
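The same split-then-sort idea can be sketched in base R without tidyr. This is not the lookaround regex from the answer; it uses two sub() calls instead to pull out the letter prefix and the trailing number, and the labels (including the "P" prefix) are made up for illustration:

```r
# Hypothetical labels with two different prefixes
variable <- c("Q10", "P2", "Q2", "P10")

# Letter part: drop the trailing digits
prefix <- sub("[0-9]+$", "", variable)

# Number part: drop the leading letters and convert to numeric
number <- as.numeric(sub("^[A-Za-z]+", "", variable))

# Sort by prefix first, then by number within each prefix
sorted <- variable[order(prefix, number)]
print(sorted)
```

It assumes each label is letters followed by digits and nothing else; labels that do not fit that shape would need their own handling.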