I have a list of names belonging to the dataset mammalsleep, and I want to replace some of those names with versions that carry additional characters.
For example:
pr_replace <- paste(c('log(brain)','I(body^2)'), collapse="|")
extract_replace <- paste(c('brain','body'), collapse="|")
Each match of extract_replace should be replaced with the corresponding element of pr_replace.
I have tried two ways of doing this:
lapply(per, function(dat)
sapply(dat, function(x)
str_replace(x, extract_replace, pr_replace)) %>% data.frame())
This instead replaces every matched value with the whole collapsed string log(brain)|I(body^2):
X9
1 exposure danger log(brain)|I(body^2)
2 danger log(brain)|I(body^2) log(brain)|I(body^2)
3 log(brain)|I(body^2) log(brain)|I(body^2) nondream
4 log(brain)|I(body^2) nondream dream
5 nondream dream sleep
6 dream sleep gestation
7 sleep gestation predation
8 gestation predation exposure
9 predation exposure danger
I have also tried:
pr_r<-c('log(brain)','I(body^2)')
mapply(function(x, y)
lapply(x, function(dat)
sapply(dat, function(z)
str_replace(z, extract_replace, y)) %>% data.frame()), per, pr_r, SIMPLIFY = FALSE)
However, this does not produce the results I am after.
When brain is found it should be replaced with log(brain), and when body is found it should be replaced with I(body^2).
Expected output:
[[1]]
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 log(brain) nondream dream sleep gestation predation exposure danger I(body^2)
2 nondream dream sleep gestation predation exposure danger I(body^2) log(brain)
3 dream sleep gestation predation exposure danger body log(brain) nondream
4 sleep gestation predation exposure danger I(body^2) log(brain) nondream dream
5 gestation predation exposure danger I(body^2) brain nondream dream sleep
[[2]]
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 log(brain) nondream dream sleep gestation predation exposure danger I(body^2)
2 nondream dream sleep gestation predation exposure danger I(body^2) log(brain)
3 dream sleep gestation predation exposure danger body log(brain) nondream
4 sleep gestation predation exposure danger I(body^2) brain nondream dream
5 gestation predation exposure danger I(body^2) log(brain) nondream dream sleep
6 predation exposure danger I(body^2) brain nondream dream sleep gestation
UPDATE:
The same approach fails on a vector, for example the column names of the datasets. Say I want log(X1) to change to X1; the following does not work:
pr_replace <- c('log(X1)', 'log(X8)')
extract_replace <- c('X1', 'X8')
lapply(per, names) %>% map(., ~ .x %>% str_replace_all(.x, setNames(extract_replace, pr_replace)))
reproducible code (updated):
per <- list(structure(list(`log(X1)` = c("brain", "nondream", "dream",
"sleep", "gestation"), X2 = c("nondream", "dream", "sleep", "gestation",
"predation"), X3 = c("dream", "sleep", "gestation", "predation",
"exposure"), X4 = c("sleep", "gestation", "predation", "exposure",
"danger"), X5 = c("gestation", "predation", "exposure", "danger",
"body"), X6 = c("predation", "exposure", "danger", "body", "brain"
), X7 = c("exposure", "danger", "body", "brain", "nondream"),
`log(X8)` = c("danger", "body", "brain", "nondream", "dream"
), X9 = c("body", "brain", "nondream", "dream", "sleep")), row.names = c(NA,
5L), class = "data.frame"), structure(list(`log(X1)` = c("brain",
"nondream", "dream", "sleep", "gestation", "predation"), X2 = c("nondream",
"dream", "sleep", "gestation", "predation", "exposure"), X3 = c("dream",
"sleep", "gestation", "predation", "exposure", "danger"), X4 = c("sleep",
"gestation", "predation", "exposure", "danger", "body"), X5 = c("gestation",
"predation", "exposure", "danger", "body", "brain"), X6 = c("predation",
"exposure", "danger", "body", "brain", "nondream"), X7 = c("exposure",
"danger", "body", "brain", "nondream", "dream"), `log(X8)` = c("danger",
"body", "brain", "nondream", "dream", "sleep"), X9 = c("body",
"brain", "nondream", "dream", "sleep", "gestation")), row.names = c(NA,
6L), class = "data.frame"))
CodePudding user response:
Instead of pasting the elements of the replacement together (the replacement is used literally, whereas | in the pattern is evaluated as regex alternation), we can create two vectors, or a single named vector whose names match the substrings in the original data and whose values are the replacements:
pr_replace <- c('log(brain)','I(body^2)')
extract_replace <- c('brain','body')
named_vec <- setNames(pr_replace, extract_replace)
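To see the difference on a small scale, here is a minimal sketch on a toy vector (x and lookup are hypothetical names used only for illustration, not part of the original data). It contrasts a collapsed replacement string with a named vector passed to str_replace_all:

```r
library(stringr)

# Toy input, assumed for illustration only
x <- c("brain", "body", "sleep")

# A collapsed replacement is inserted verbatim:
# "|" only means "or" inside a pattern, not inside a replacement
collapsed <- str_replace(x, "brain|body", "log(brain)|I(body^2)")
collapsed
# [1] "log(brain)|I(body^2)" "log(brain)|I(body^2)" "sleep"

# A named vector pairs each pattern (the name) with its own replacement (the value)
lookup <- c(brain = "log(brain)", body = "I(body^2)")
replaced <- str_replace_all(x, lookup)
replaced
# [1] "log(brain)" "I(body^2)"  "sleep"
```

Values that match none of the names, such as "sleep", pass through unchanged.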
Now we loop over the list with map, loop across the columns of each dataset with across, and apply str_replace_all with the named vector:
library(purrr)
library(stringr)
library(dplyr)
per <- map(per, ~ .x %>%
    mutate(across(everything(), ~ str_replace_all(.x, named_vec))))
-output
per
[[1]]
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 log(brain) nondream dream sleep gestation predation exposure danger I(body^2)
2 nondream dream sleep gestation predation exposure danger I(body^2) log(brain)
3 dream sleep gestation predation exposure danger I(body^2) log(brain) nondream
4 sleep gestation predation exposure danger I(body^2) log(brain) nondream dream
5 gestation predation exposure danger I(body^2) log(brain) nondream dream sleep
[[2]]
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 log(brain) nondream dream sleep gestation predation exposure danger I(body^2)
2 nondream dream sleep gestation predation exposure danger I(body^2) log(brain)
3 dream sleep gestation predation exposure danger I(body^2) log(brain) nondream
4 sleep gestation predation exposure danger I(body^2) log(brain) nondream dream
5 gestation predation exposure danger I(body^2) log(brain) nondream dream sleep
6 predation exposure danger I(body^2) log(brain) nondream dream sleep gestation
For the updated case with column names, wrap the named vector in fixed() as well, because the patterns contain regex metacharacters (the parentheses) and would otherwise not be matched literally:
map(per, ~ str_replace_all(names(.x),
fixed(setNames(extract_replace, pr_replace))))
[[1]]
[1] "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8" "X9"
[[2]]
[1] "X1" "X2" "X3" "X4" "X5" "X6" "X7" "X8" "X9"
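If the cleaned names should be written back onto the data frames rather than only printed, one possible sketch uses setNames inside map. Here per_small is a hypothetical one-element stand-in for per, used only to keep the example short:

```r
library(purrr)
library(stringr)

# Hypothetical stand-in for `per`; check.names = FALSE keeps "log(X1)" as is
per_small <- list(
  data.frame(`log(X1)` = c("brain", "body"),
             X2 = c("sleep", "dream"),
             check.names = FALSE)
)

pr_replace <- c("log(X1)", "log(X8)")
extract_replace <- c("X1", "X8")

# Names of the lookup are the literal strings to find, values are the new names
name_lookup <- fixed(setNames(extract_replace, pr_replace))

# Rename the columns of every data frame in the list
per_renamed <- map(per_small, ~ setNames(.x, str_replace_all(names(.x), name_lookup)))

names(per_renamed[[1]])
```

Only the column names change; the cell values are left untouched.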