How can I drop "-" or double "--" only at the beginning of the value in the text column?
df <- data.frame (x = c(12,14,15,178),
text = c("--Car","-Transport","Big-Truck","--Plane"))
x text
1 12 --Car
2 14 -Transport
3 15 Big-Truck
4 178 --Plane
Expected output:
x text
1 12 Car
2 14 Transport
3 15 Big-Truck
4 178 Plane
CodePudding user response:
You can use gsub with the regex "^\\-+". The ^ anchors the match at the beginning of the string, and the + means one or more of the preceding character, here a hyphen (\\-).
gsub("^\\-+", "", df$text)
# [1] "Car" "Transport" "Big-Truck" "Plane"
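To see why the ^ anchor matters, here is a small sketch comparing the anchored pattern with an unanchored one. The vector v just repeats the values from the question, and the names anchored and unanchored are only for illustration. Since an anchored pattern can match at most once per string, sub is enough here.

```r
v <- c("--Car", "-Transport", "Big-Truck", "--Plane")

# Anchored: only a run of hyphens at the start of each string is removed
anchored <- sub("^-+", "", v)

# Unanchored: every hyphen is removed, including the one inside "Big-Truck"
unanchored <- gsub("-+", "", v)

print(anchored)
print(unanchored)
```

The unanchored version turns "Big-Truck" into "BigTruck", which is not what the question asks for.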
If there may also be whitespace at the beginning of the string and you want to remove it too, you can use the character class [ -] followed by + in your regex. It matches any run of spaces or hyphens at the beginning of the string.
gsub("^[ -]+", "", df$text)
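As a quick check of the character-class version, here is a sketch on some made-up strings with mixed leading spaces and hyphens (the vector w is hypothetical, not from the question):

```r
# Hypothetical inputs with spaces and hyphens mixed at the start
w <- c("  --Car", " - Transport", "Big-Truck", "-- Plane")

# Strip any leading run of spaces and/or hyphens; inner hyphens are untouched
cleaned <- sub("^[ -]+", "", w)

print(cleaned)
```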
To apply this to the data frame, assign the result back to the column. In the tidyverse, you can also use str_remove:
df$text <- gsub("^\\-+", "", df$text)
# or, in dplyr
library(tidyverse)
df %>%
  mutate(text1 = gsub("^\\-+", "", text),
         text2 = str_remove(text, "^\\-+"))
CodePudding user response:
You could use trimws to remove certain leading/trailing characters.
trimws(df$text, whitespace = '[ -]')
# [1] "Car" "Transport" "Big-Truck" "Plane"
# a more complex situation
x <- " -- Car - -"
trimws(x, whitespace = '[ -]')
# [1] "Car"
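Note that trimws strips both ends by default. Since the question only asks about the beginning of the value, a possible variant is to pass which = "left". This is a sketch assuming an R version where trimws accepts the whitespace argument (R 3.6.0 or later, as far as I know); the name left_only is just for illustration.

```r
x <- " -- Car - -"

# Only strip spaces/hyphens on the left; the trailing " - -" is kept
left_only <- trimws(x, which = "left", whitespace = "[ -]")

print(left_only)
```

Applied to the column, that would be trimws(df$text, which = "left", whitespace = "[ -]").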