How to drop specific characters in strings in a column?

Time:08-17

How can I drop "-" or double "--" only at the beginning of the value in the text column?

df <- data.frame (x  = c(12,14,15,178),
                  text = c("--Car","-Transport","Big-Truck","--Plane"))

    x       text
1  12      --Car
2  14 -Transport
3  15  Big-Truck
4 178    --Plane

Expected output:

    x       text
1  12        Car
2  14  Transport
3  15  Big-Truck
4 178      Plane

CodePudding user response:

You can use gsub with the regex "^\\-+". The ^ anchors the match at the beginning of the string, and \\-+ matches one or more (+) hyphens (\\-).

gsub("^\\-+", "", df$text)
# [1] "Car"       "Transport" "Big-Truck" "Plane"
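As a small sketch of why the ^ anchor matters, the snippet below compares an anchored pattern with an unanchored one on the question's data. The variable names are only illustrative. Since an anchored pattern can match at most once per string, sub should behave the same as gsub here, and the hyphen does not strictly need escaping outside a character class.

```r
text <- c("--Car", "-Transport", "Big-Truck", "--Plane")

# Anchored: only a run of hyphens at the very start of each string is removed.
anchored <- sub("^-+", "", text)

# Unanchored: every hyphen is removed, including the one inside "Big-Truck".
unanchored <- gsub("-", "", text)

print(anchored)
print(unanchored)
```

The anchored version leaves "Big-Truck" intact, which is what the expected output in the question asks for.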

If the strings may also start with whitespace that you want to remove, you can use the character class [ -] in your regex. The pattern "^[ -]+" matches one or more spaces or hyphens, in any order, at the beginning of the string.

gsub("^[ -]+", "", df$text)
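The data in the question has no leading spaces, so here is a minimal sketch on a made-up vector (messy is a hypothetical example, not from the question) showing how a character class of spaces and hyphens handles mixed prefixes:

```r
# Hypothetical input with mixed leading spaces and hyphens.
messy <- c("  --Car", "- Transport", "Big-Truck", " Plane")

# Remove any leading run of spaces and/or hyphens; inner hyphens are kept.
cleaned <- gsub("^[ -]+", "", messy)

print(cleaned)
```

Placing the hyphen last inside the brackets keeps it literal, so it is not read as a range.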

To apply this to the data frame, assign the result back to the column. In the tidyverse, you can also use str_remove:

df$text <- gsub("^\\-+", "", df$text)
# or, in dplyr
library(tidyverse)
df %>% 
  mutate(text1 = gsub("^\\-+", "", text),
         text2 = str_remove(text, "^\\-+"))

CodePudding user response:

You could use trimws with its whitespace argument (available since R 3.6.0) to remove specific leading and trailing characters.

trimws(df$text, whitespace = '[ -]')

# [1] "Car"       "Transport" "Big-Truck" "Plane"
# a more complex situation
x <- " -- Car - -"

trimws(x, whitespace = '[ -]')
# [1] "Car"
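Note that trimws strips both ends by default, while the question only asks about the beginning of the value. A minimal sketch, assuming R 3.6.0 or later for the whitespace argument, that restricts trimming to the left side with which = "left":

```r
x <- " -- Car - -"

# Default (which = "both"): spaces and hyphens are stripped from both ends.
both <- trimws(x, whitespace = "[ -]")

# which = "left": only the leading spaces and hyphens are stripped.
left <- trimws(x, which = "left", whitespace = "[ -]")

print(both)
print(left)
```

With which = "left", trailing hyphens such as those in " -- Car - -" are preserved, so the result matches the anchored gsub approach.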