Assuming this is my df:
df <- tibble(`a*`=c("_x__", "*y", "z -"),
b=c("_x__", "*y", "z -"))
> df
# A tibble: 3 x 2
`a*` b
<chr> <chr>
1 _x__ _x__
2 *y *y
3 z - z -
I want to remove the *, _ and space characters from both the column names and the values, wherever they occur, to get:
# A tibble: 3 x 2
a b
<chr> <chr>
1 x x
2 y y
3 z- z-
So I am using gsub(), but it only removes the first character. Ideally, I am looking for a clean way to make both of these changes using dplyr pipes. Any hint or idea is appreciated.
df %>%
mutate_all(funs(gsub(c("_","[*]"," "),"",.)))
names(df) <- str_remove_all("[*]")
CodePudding user response:
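We can match multiple characters by placing them within [] in str_remove_all or gsub. However, we cannot pass a vector of patterns to gsub, because its pattern argument is not vectorized: only the first element is used, which is why only the first character was removed.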
library(dplyr)
library(stringr)
df <- df %>%
  # clean the values with a lambda; .names cleans each column name the same way
  transmute(across(everything(), ~ str_remove_all(.x, "[*_ ]"),
                   .names = "{str_remove_all(.col, '[*_ ]')}"))
-output
df
# A tibble: 3 × 2
a b
<chr> <chr>
1 x x
2 y y
3 z- z-
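To illustrate the point about gsub, here is a minimal base R sketch. The vector vals is just a copy of the values from the question, and first_only and cleaned are names made up for this example.

```r
# Base R only; `vals` mirrors the values from the question.
vals <- c("_x__", "*y", "z -")

# With a vector pattern, gsub() warns and uses only the first element ("_"),
# so the "*" and the space are left in place.
first_only <- suppressWarnings(gsub(c("_", "[*]", " "), "", vals))

# A single character class matches any one of the three characters.
cleaned <- gsub("[*_ ]", "", vals)
```

Here first_only still contains "*y" and "z -", while cleaned holds "x", "y" and "z-".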
CodePudding user response:
This does the names as well but is pretty similar to akrun's answer:
library(dplyr)
pattern = "\\*|\\ |_"
df |>
mutate(across(
  everything(),  # state the columns explicitly; recent dplyr versions deprecate omitting .cols
  .fns = \(col) gsub(pattern, "", col)
)) |>
setNames(gsub(pattern, "", names(df)))
# A tibble: 3 x 2
# a b
# <chr> <chr>
# 1 x x
# 2 y y
# 3 z- z-
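Since the question asks for a single dplyr pipe, another possible sketch uses rename_with for the names and across for the values. It assumes dplyr and stringr are installed. strip_chars is a hypothetical helper name introduced here, not something from either answer.

```r
library(dplyr)
library(stringr)

df <- tibble(`a*` = c("_x__", "*y", "z -"),
             b    = c("_x__", "*y", "z -"))

# Hypothetical helper: drop every "*", "_" and space from a character vector.
strip_chars <- function(x) str_remove_all(x, "[*_ ]")

cleaned <- df %>%
  rename_with(strip_chars) %>%               # clean the column names
  mutate(across(everything(), strip_chars))  # clean the values in every column
```

Keeping the pattern inside one helper means the names and the values are always cleaned with the same rule.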