Removing words after "-" except for specified strings

Time:09-28

Is there a straightforward way to remove all the colours but keep the full car model names (e.g. "X-12" and "X - 12")? I have tried using str_remove(), but I can't seem to remove the colours without also cutting into "X-12" and "X - 12".

x <- c("Car X-12 - Red",
       "Car One - Blue",
       "Car X - 12 - Black and Green",
       "Car Two - Yellow",
       "Car Three - Purple and Red",
       "Car Four - Olive",
       "Car Five - Orange",
       "Car X-12 - Black and White")

desired_output <- c("Car X-12",
                    "Car One",
                    "Car X - 12",
                    "Car Two",
                    "Car Three",
                    "Car Four",
                    "Car Five",
                    "Car X-12")

Created on 2021-09-28 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS  10.16                
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_AU.UTF-8                 
#>  ctype    en_AU.UTF-8                 
#>  tz       Australia/Melbourne         
#>  date     2021-09-28                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date       lib source        
#>  backports     1.2.1   2020-12-09 [1] CRAN (R 4.0.2)
#>  cli           3.0.1   2021-07-17 [1] CRAN (R 4.0.2)
#>  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.0.2)
#>  digest        0.6.28  2021-09-23 [1] CRAN (R 4.0.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.0.2)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.1)
#>  fansi         0.5.0   2021-05-25 [1] CRAN (R 4.0.2)
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.0.2)
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#>  knitr         1.34    2021-09-09 [1] CRAN (R 4.0.2)
#>  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.0.2)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.2)
#>  pillar        1.6.2   2021-07-29 [1] CRAN (R 4.0.2)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
#>  reprex        2.0.0   2021-04-02 [1] CRAN (R 4.0.2)
#>  rlang         0.4.11  2021-04-30 [1] CRAN (R 4.0.2)
#>  rmarkdown     2.7     2021-02-19 [1] CRAN (R 4.0.2)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.0.2)
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
#>  stringi       1.7.4   2021-08-25 [1] CRAN (R 4.0.2)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
#>  styler        1.4.1   2021-03-30 [1] CRAN (R 4.0.2)
#>  tibble        3.1.4   2021-08-25 [1] CRAN (R 4.0.2)
#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.0.2)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.0.2)
#>  withr         2.4.2   2021-04-18 [1] CRAN (R 4.0.2)
#>  xfun          0.26    2021-09-14 [1] CRAN (R 4.0.2)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
#> 
#> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

CodePudding user response:

If the colors always appear last, how about trying it this way?

library(stringr)

sapply(x, function(t){
  # split on the " - " separator and drop the last piece (the colour)
  y <- str_split(t, " - ", simplify = TRUE)
  # rejoin with " - " so that "X - 12" keeps its spacing
  paste(y[-length(y)], collapse = " - ")
})

 Car X-12 - Red               Car One - Blue Car X - 12 - Black and Green             Car Two - Yellow 
                  "Car X-12"                    "Car One"                 "Car X - 12"                    "Car Two" 
  Car Three - Purple and Red             Car Four - Olive            Car Five - Orange   Car X-12 - Black and White 
                 "Car Three"                   "Car Four"                   "Car Five"                   "Car X-12" 
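
The same split-and-drop-the-last-piece idea can also be sketched in base R. This is only an illustration: drop_last_segment() is a hypothetical helper name, and it assumes every element ends in a " - colour" suffix (an element without one would come back as an empty string). The pieces are rejoined with " - " so that "X - 12" keeps its spacing, and vapply() over the strsplit() list returns an unnamed character vector.

```r
x <- c("Car X-12 - Red",
       "Car One - Blue",
       "Car X - 12 - Black and Green",
       "Car Two - Yellow",
       "Car Three - Purple and Red",
       "Car Four - Olive",
       "Car Five - Orange",
       "Car X-12 - Black and White")

# Hypothetical helper: split each string on a literal separator,
# drop the final piece, and glue the remaining pieces back together.
drop_last_segment <- function(s, sep = " - ") {
  parts <- strsplit(s, sep, fixed = TRUE)   # fixed = TRUE: no regex involved
  vapply(parts,
         function(p) paste(head(p, -1), collapse = sep),
         character(1))
}

result <- drop_last_segment(x)
result
```

Because the result carries no names, it can be compared directly with identical(result, desired_output).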

CodePudding user response:

You can use sub() to keep only the text before the last " -" (a space followed by a hyphen).

sub('(.*)\\s-.*', '\\1', x)

#[1] "Car X-12"   "Car One"    "Car X - 12" "Car Two"   
#[5] "Car Three"  "Car Four"   "Car Five"   "Car X-12" 
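
Since the question asked about str_remove(), here is a hedged sketch of the same idea with stringr. The pattern below is an assumption of mine, not taken from the answers above: it removes a hyphen surrounded by whitespace plus everything after it, but only when no further hyphen follows, so only the final " - colour" segment is matched. The hyphen inside "X-12" is left alone because it has no whitespace before it. It assumes the colour itself never contains a hyphen.

```r
library(stringr)

x <- c("Car X-12 - Red",
       "Car One - Blue",
       "Car X - 12 - Black and Green",
       "Car Two - Yellow",
       "Car Three - Purple and Red",
       "Car Four - Olive",
       "Car Five - Orange",
       "Car X-12 - Black and White")

# \\s+-\\s+  a hyphen with whitespace on both sides (the separator)
# [^-]*$     everything up to the end of the string, provided it holds no hyphen
cleaned <- str_remove(x, "\\s+-\\s+[^-]*$")
cleaned
```

The result can be checked with identical(cleaned, desired_output). A colour such as "Blue-Green" would not be removed by this pattern, so it would need adjusting for data like that.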