I have a list of strings in R contaminated with some undesirable characters "X." and ".", like this:
"age", ".name", "X.marks", "X.study.time", "class", "X.number"
And I want to parse the string data to:
"age", "name", "marks", "study time", "class", "number"
That is, I want to remove a leading "X." or "." if it exists and replace every remaining "." with " " (a space). How can I do this in R?
CodePudding user response:
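We may use sub to strip a leading "X." or "." and then gsub with fixed = TRUE to replace the remaining dots with spaces: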
gsub(".", " ", sub("^X?\\.", "", v1), fixed = TRUE)
[1] "age" "name" "marks" "study time" "class" "number"
data
v1 <- c("age", ".name", "X.marks", "X.study.time", "class", "X.number")
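If the same cleanup is needed in several places, the two steps can be wrapped in a small function. This is only a sketch: the name clean_labels is made up for illustration and is not part of base R or any package.

```r
# Hypothetical helper; the name clean_labels is illustrative only.
clean_labels <- function(x) {
  # Step 1: drop a leading "X." or "." (anchored at the start of the string)
  x <- sub("^X?\\.", "", x)
  # Step 2: replace every remaining literal dot with a space
  gsub(".", " ", x, fixed = TRUE)
}

v1 <- c("age", ".name", "X.marks", "X.study.time", "class", "X.number")
cleaned <- clean_labels(v1)
cleaned
# [1] "age"        "name"       "marks"      "study time" "class"      "number"
```

Because the first pattern is anchored with ^, an "X." in the middle of a string is left alone apart from its dot becoming a space.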
CodePudding user response:
You can do the desired substitution with the str_replace_all function from the stringr package. Using the v1 object posted by akrun:
library(stringr)
# Remove a leading "X." or "." first, then replace every remaining "." with a space
str_replace_all(v1, c("^X?\\." = "", "\\." = " "))
# "age" "name" "marks" "study time" "class" "number"
CodePudding user response:
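Here is another stringr solution combining two functions, str_replace_all and str_trim. Note that it replaces every "X", not only a leading "X.", so it only suits names that contain no other capital "X":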
library(stringr)
str_trim(str_replace_all(v1, "\\.|X", " "))
[1] "age" "name" "marks" "study time" "class" "number"
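To see where the unanchored "\\.|X" pattern can go wrong, here is a small comparison. The vector v2 is invented for this example (it is not from the question), and the anchored variant is just one possible alternative, assuming a stringr version that provides str_remove.

```r
library(stringr)

# Hypothetical input: "maX.score" contains an "X" that is not a leading prefix
v2 <- c("X.marks", "maX.score")

# Unanchored pattern: every "X" and every "." becomes a space,
# and str_trim only removes whitespace at the ends
unanchored <- str_trim(str_replace_all(v2, "\\.|X", " "))
unanchored
# [1] "marks"     "ma  score"

# Anchored variant: remove only a leading "X." or ".",
# then replace the remaining literal dots with spaces
anchored <- str_replace_all(str_remove(v2, "^X?\\."), fixed("."), " ")
anchored
# [1] "marks"     "maX score"
```

For the data in the question both approaches give the same result; the difference only shows up when an "X" appears elsewhere in a string.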