Using mutate across to enforce class of columns matching a certain name

Time:09-14

I have a list column of tibbles, and in some of them I want to enforce that a column with a certain name is a character vector. I've tried to do this with the tidyverse, but I'm not getting any joy.

Here's a minimal example:

library(tibble)
library(dplyr)

data <- list(
  tibble(A = 1:10, B = 1:10, C = 1:10),
  tibble(A = 1:10, "Direct written by (3)" = 1:10, C = 1:10),
  tibble(A = 1:10, B = 1:10, C = 1:10)
)

newData <- purrr::map(data, function(dataTable){
  mutate(dataTable, across(matches("Direct written by (3)"), ~as.character(.x)))
})

It seems like the above should do it, but it does not:

> str(data)
List of 3
 $ : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ A: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ B: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ C: int [1:10] 1 2 3 4 5 6 7 8 9 10
 $ : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ A                    : int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ Direct written by (3): int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ C                    : int [1:10] 1 2 3 4 5 6 7 8 9 10
 $ : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ A: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ B: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ C: int [1:10] 1 2 3 4 5 6 7 8 9 10
> str(newData)
List of 3
 $ : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ A: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ B: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ C: int [1:10] 1 2 3 4 5 6 7 8 9 10
 $ : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ A                    : int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ Direct written by (3): int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ C                    : int [1:10] 1 2 3 4 5 6 7 8 9 10
 $ : tibble [10 × 3] (S3: tbl_df/tbl/data.frame)
  ..$ A: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ B: int [1:10] 1 2 3 4 5 6 7 8 9 10
  ..$ C: int [1:10] 1 2 3 4 5 6 7 8 9 10

How can I achieve the desired behaviour?

CodePudding user response:

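We can use contains instead of matches. matches treats its argument as a regular expression, in which ( and ) are metacharacters that form a group, so the pattern looks for "Direct written by 3" rather than the literal column name. The alternatives are to escape the parentheses (\\( and \\)) or to wrap the pattern in fixed.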

library(purrr)
library(dplyr)
map(data, ~ .x %>% 
   mutate(across(contains("Direct written by (3)"), 
        as.character)))

-output

[[1]]
# A tibble: 10 × 3
       A     B     C
   <int> <int> <int>
 1     1     1     1
 2     2     2     2
 3     3     3     3
 4     4     4     4
 5     5     5     5
 6     6     6     6
 7     7     7     7
 8     8     8     8
 9     9     9     9
10    10    10    10

[[2]]
# A tibble: 10 × 3
       A `Direct written by (3)`     C
   <int> <chr>                   <int>
 1     1 1                           1
 2     2 2                           2
 3     3 3                           3
 4     4 4                           4
 5     5 5                           5
 6     6 6                           6
 7     7 7                           7
 8     8 8                           8
 9     9 9                           9
10    10 10                         10

[[3]]
# A tibble: 10 × 3
       A     B     C
   <int> <int> <int>
 1     1     1     1
 2     2     2     2
 3     3     3     3
 4     4     4     4
 5     5     5     5
 6     6     6     6
 7     7     7     7
 8     8     8     8
 9     9     9     9
10    10    10    10
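
As a quick check of why the original matches call selected nothing, the pattern can be tested against the column name with base R's grepl. This is only an illustrative sketch: col_name and the *_hit variables are names introduced here, and grepl stands in for the regular-expression matching that matches performs.

```r
# The column name from the question, used both as the text and as the pattern.
col_name <- "Direct written by (3)"

# As a regular expression, "(3)" is a group that matches just "3",
# so the pattern looks for "Direct written by 3" and does not match.
regex_hit <- grepl(col_name, col_name)

# With fixed = TRUE the pattern is compared literally, parentheses included.
fixed_hit <- grepl(col_name, col_name, fixed = TRUE)

# Escaping the parentheses also makes the regex match the literal name.
escaped_hit <- grepl("Direct written by \\(3\\)", col_name)

print(c(regex = regex_hit, fixed = fixed_hit, escaped = escaped_hit))
```

Only the plain regex comparison fails, which is consistent with contains (a literal substring match) working where matches did not.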

matches with fixed (fixed comes from stringr, so that package must be installed)

map(data, ~ .x %>% 
   mutate(across(matches(stringr::fixed("Direct written by (3)")), 
        as.character)))
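
Two further variants may be worth trying when the target column name is known exactly. The first keeps matches but escapes the parentheses. The second uses any_of, which is not part of the answer above: it selects columns by exact name and silently skips tibbles that lack the column. The names escaped and exact are introduced here for illustration only.

```r
library(tibble)
library(dplyr)
library(purrr)

# Same example data as in the question.
data <- list(
  tibble(A = 1:10, B = 1:10, C = 1:10),
  tibble(A = 1:10, "Direct written by (3)" = 1:10, C = 1:10),
  tibble(A = 1:10, B = 1:10, C = 1:10)
)

# Variant 1: keep matches(), but escape the parentheses so they are literal.
escaped <- map(data, ~ mutate(.x, across(matches("Direct written by \\(3\\)"), as.character)))

# Variant 2: any_of() compares whole names exactly and ignores absent names,
# so no pattern matching is involved at all.
exact <- map(data, ~ mutate(.x, across(any_of("Direct written by (3)"), as.character)))

# Inspect the class of the target column in the second tibble.
print(class(escaped[[2]][["Direct written by (3)"]]))
print(class(exact[[2]][["Direct written by (3)"]]))
```

If the name is always the full column name rather than a fragment of it, any_of avoids the regex question entirely, whereas contains would also pick up any longer column name that includes the same text.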