I have a dataset that looks like this:
output |
---|
Others. Specify (separate by comma if there is more than one): |
Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one): |
Family upbringing |
Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one): |
Did not say |
How can I remove the sentence "Others. Specify (separate by comma if there is more than one):" from the dataset? I've tried
gsub("Others. Specify (separate by comma if there is more than one):", "", datset$output)
and str_remove_all(), but neither worked.
CodePudding user response:
You can get the desired result by adding fixed = TRUE, which tells gsub() to match the pattern as a literal string instead of interpreting it as a regular expression:
gsub("Others. Specify (separate by comma if there is more than one):",
"",
datset$output,
fixed = TRUE)
#> [1] "" "Everyone cries/has feelings,"
#> [3] "Family upbringing" "Everyone cries/has feelings,"
#> [5] "Did not say"
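To see why the original call left the data unchanged, here is a minimal sketch that checks the same pattern with and without fixed = TRUE using grepl(). The variable names x, pattern, regex_match and fixed_match are only for illustration and are not from the question.

```r
# One value from the dataset and the text we want to remove.
x <- "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):"
pattern <- "Others. Specify (separate by comma if there is more than one):"

# Interpreted as a regex, ( and ) delimit a group and are not matched
# literally, so the pattern expects "Specify separate ..." with no
# parentheses in the text and does not match.
regex_match <- grepl(pattern, x)

# With fixed = TRUE every character, including . ( and ), is literal.
fixed_match <- grepl(pattern, x, fixed = TRUE)

print(c(regex = regex_match, fixed = fixed_match))
```

The regex version reports no match, which is why gsub() had nothing to replace, while the fixed version finds the text.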
A second option is to escape all special characters, which in your case are the . and, in particular, the (). In a regex, () create a capturing group, so to match a literal ( you have to use \\(:
gsub("Others\\. Specify \\(separate by comma if there is more than one\\):", "", datset$output)
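Since the question also mentions str_remove_all(), the same literal-matching idea should carry over to stringr by wrapping the pattern in fixed(). This is only a sketch: it assumes the stringr package is installed, and the vector x and the name cleaned are made up for the example.

```r
library(stringr)  # assumption: stringr is installed

# A few values shaped like the ones in the question.
x <- c(
  "Others. Specify (separate by comma if there is more than one):",
  "Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):",
  "Family upbringing"
)

# fixed() makes stringr compare the pattern literally, not as a regex.
cleaned <- str_remove_all(
  x,
  fixed("Others. Specify (separate by comma if there is more than one):")
)

print(cleaned)
```

As with the gsub() versions, the comma in front of the removed text stays in place.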
DATA
datset <- data.frame(
output = c(
"Others. Specify (separate by comma if there is more than one):",
"Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):", "Family upbringing",
"Everyone cries/has feelings,Others. Specify (separate by comma if there is more than one):", "Did not say"
)
)