I would like to relabel the levels of factors as follows:
If a level is three or more words long, relabel it in sentence case; otherwise use title case. For example:
home doing nothing becomes Home doing nothing,
yes becomes Yes,
and good practice becomes Good Practice.
Is there any way to do this?
library(tidyverse)
vars <- c("a", "b", "c", "d", "e")
mydata <- tribble(
  ~"a", ~"b", ~"c", ~"d", ~"e", ~"id",
  "yes", "school in Kenya", "r", 10, "good practice", 1,
  "no", "home doing nothing", "python", 12, "doing well", 3,
  "no", "school in Tanzania", "c ", 35, "by walking", 4,
  "yes", "home practising", "c", 65, "practising everyday", 5,
  "no", "home", "java", 78, "sitting alone", 7
) %>%
  # all_of() makes it explicit that vars is an external character vector
  mutate(across(.cols = all_of(vars), ~as_factor(.)))
# mydata %>%
#   mutate(across(where(is.factor), ~fct_relabel(., str_to_sentence(.))))
CodePudding user response:
Here is one possible solution. Note that it treats the columns as character variables rather than factors (the output below shows <chr> columns), so apply it before converting them with as_factor().
library(dplyr)
library(stringr)
mydata %>%
  mutate(across(where(is.character),
                ~ if_else(stringi::stri_count_words(.x) >= 3,
                          sub("^([a-z])", "\\U\\1\\E", .x, perl = TRUE),
                          str_to_title(.x))))
# A tibble: 5 x 6
  a     b                  c          d e                      id
  <chr> <chr>              <chr>  <dbl> <chr>               <dbl>
1 Yes   School in Kenya    R         10 Good Practice           1
2 No    Home doing nothing Python    12 Doing Well              3
3 No    School in Tanzania C         35 By Walking              4
4 Yes   Home Practising    C         65 Practising Everyday     5
5 No    Home               Java      78 Sitting Alone           7
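If the columns have already been converted to factors, the same word-count rule could instead be applied to the levels through fct_relabel(). The following is only a sketch: relabel_by_word_count() and the demo tibble are hypothetical names made up for illustration, and it assumes that sentence case here means capitalising only the first letter while leaving the rest of the level (for example Kenya) untouched.

```r
library(dplyr)
library(forcats)
library(stringr)

# Hypothetical helper: takes a character vector of levels, returns new labels.
relabel_by_word_count <- function(lvls) {
  # Count words as runs of non-whitespace characters
  n_words <- str_count(lvls, "\\S+")
  # Upper-case only the first character, keep the rest as written
  first_upper <- paste0(toupper(substr(lvls, 1, 1)), substring(lvls, 2))
  ifelse(n_words >= 3, first_upper, str_to_title(lvls))
}

# Small made-up data set using values from the question
demo <- tibble(
  b = factor(c("school in Kenya", "home doing nothing", "home")),
  e = factor(c("good practice", "doing well", "by walking"))
)

# fct_relabel() passes the levels to the helper and keeps the factor type
result <- demo %>%
  mutate(across(where(is.factor), ~ fct_relabel(.x, relabel_by_word_count)))

result
```

Because fct_relabel() works on the levels rather than on every row, the rule is evaluated once per distinct value, and the columns stay factors.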
CodePudding user response:
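Here are two options: 1) evaluate the >= 3 and < 3 cases separately, or 2) evaluate both at the same time. Both have their pros and cons for readability. Note that both condition on the number of levels in each factor, and both assume every column has been converted to a factor.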
library(tidyverse)
# option 1
mydata |>
  mutate(across(where(\(x) length(levels(x)) >= 3),
                \(x) fct_relabel(x, str_to_sentence)),
         across(where(\(x) length(levels(x)) < 3),
                \(x) fct_relabel(x, str_to_title)))
#> # A tibble: 5 x 6
#>   a     b                  c      d     e                   id
#>   <fct> <fct>              <fct>  <fct> <fct>               <fct>
#> 1 Yes   School in kenya    R      10    Good practice       1
#> 2 No    Home doing nothing Python 12    Doing well          3
#> 3 No    School in tanzania C      35    By walking          4
#> 4 Yes   Home practising    C      65    Practising everyday 5
#> 5 No    Home               Java   78    Sitting alone       7
# option 2
mydata |>
  mutate(across(everything(), \(x) if_else(
    rep(length(levels(x)) >= 3, length(x)),
    fct_relabel(x, str_to_sentence),
    fct_relabel(x, str_to_title)
  )))
#> # A tibble: 5 x 6
#>   a     b                  c      d     e                   id
#>   <fct> <fct>              <fct>  <fct> <fct>               <fct>
#> 1 Yes   School in kenya    R      10    Good practice       1
#> 2 No    Home doing nothing Python 12    Doing well          3
#> 3 No    School in tanzania C      35    By walking          4
#> 4 Yes   Home practising    C      65    Practising everyday 5
#> 5 No    Home               Java   78    Sitting alone       7
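Since the condition in option 2 is a single TRUE or FALSE per column, a plain if/else inside across() might read more simply than if_else() with rep(). This is a sketch under the same assumption as above (the rule depends on the number of levels in each factor); the demo tibble is made up for illustration, and where(is.factor) is used so that non-factor columns are skipped.

```r
library(dplyr)
library(forcats)
library(stringr)

# Small made-up data set using values from the question
demo <- tibble(
  a = factor(c("yes", "no", "no")),                          # 2 levels
  e = factor(c("good practice", "doing well", "by walking")) # 3 levels
)

result <- demo |>
  mutate(across(where(is.factor), \(x) {
    # nlevels(x) is a single number, so an ordinary if/else is enough
    if (nlevels(x) >= 3) {
      fct_relabel(x, str_to_sentence)
    } else {
      fct_relabel(x, str_to_title)
    }
  }))

result
```

Here column a has two levels and gets title case, while column e has three levels and gets sentence case.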