Removing second and subsequent occurrences of decimal point in string

Time:01-19

I want to remove the second and all subsequent occurrences of the decimal point in a string. My attempt is below:

library(stringr)
str_remove(string = "3.99-0.13", pattern = "\\.")
[1] "399-0.13"
sub("\\.", "", "3.99-0.13")
[1] "399-0.13"

However, both calls remove the first decimal point instead. I want the output to be 3.99-013. Any hints, please?

CodePudding user response:

We could locate every occurrence of a dot, exclude the first, and remove the rest:

library(stringi)

x <- c("3.99-0.13.2.2", "3.990.13")

stri_sub_replace_all(x,
                     lapply(stri_locate_all_regex(x, '\\.'), \(x) x[-1,, drop = FALSE]),
                     replacement = "")

Output:

"3.99-01322" "3.99013"

Note: stringr's str_sub doesn't seem to have a replace_all option, so we'll need stringi for this (that said, stringr::str_locate_all could be used instead of stri_locate_all_regex if you prefer).
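
If neither package is at hand, the same locate-then-remove idea can be sketched in base R. This is an alternative technique, not the code from the answer above: it assumes gregexpr to find every dot and the replacement form of regmatches to blank out all matches except the first. The name res is only illustrative.

```r
x <- c("3.99-0.13.2.2", "3.990.13")

# Locate every literal dot in each string
m <- gregexpr(".", x, fixed = TRUE)

# Work on a copy so the input stays untouched
res <- x

# For each string, keep the first matched dot and replace the others with ""
regmatches(res, m) <- lapply(regmatches(res, m), function(dots) {
  replace(dots, seq_along(dots) > 1, "")
})

res
```

Strings without any dot have no matches to replace, so they should come back unchanged.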

Update: Now also works when a string contains two or fewer dots.
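
Presumably this is what drop = FALSE in the call above is for. A small base R sketch of the subsetting behaviour, using a made-up location matrix (loc, collapsed and kept are illustrative names, and the positions are invented for the example):

```r
# A 2-row start/end matrix, shaped like the locations of two dots in one string
loc <- matrix(c(2L, 7L, 2L, 7L), ncol = 2,
              dimnames = list(NULL, c("start", "end")))

# Dropping the first row leaves one row, which R simplifies to a plain vector
collapsed <- loc[-1, ]

# With drop = FALSE the result stays a 1 x 2 matrix
kept <- loc[-1, , drop = FALSE]

is.matrix(collapsed)
is.matrix(kept)
dim(kept)
```

Without drop = FALSE, the single remaining row would no longer be a two-column matrix, so it could not be passed on as a set of start/end pairs.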

CodePudding user response:

Here is an approach using sub and gsub with simple regex patterns; it works on a variety of possible inputs.

Extract the first part (everything up to and including the first dot), remove all dots from the remainder, and finally paste the two together.

Data

stri <- c("3.99-0.13", "393.99.0.13.0.0", ".832.723.723", "3.Ud.2349_3.", 
"D.235.2")

stri
[1] "3.99-0.13"       "393.99.0.13.0.0" ".832.723.723"    "3.Ud.2349_3."   
[5] "D.235.2"

Apply

paste0(sub("\\..*", ".", stri), gsub("\\.", "", sub(".*?\\.", "", stri)))
[1] "3.99-013"    "393.9901300" ".832723723"  "3.Ud2349_3"  "D.2352" 
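
One case the sample data does not cover is a string with no dot at all: both sub calls then return the input unchanged, so the one-liner would paste the string to itself. A minimal sketch of the same three steps with a guard for that case follows; keep_first_dot is a hypothetical helper name, not part of the answer.

```r
# Hypothetical helper: same sub/gsub steps as above, plus a no-dot guard
keep_first_dot <- function(s) {
  has_dot    <- grepl(".", s, fixed = TRUE)
  first_part <- sub("\\..*", ".", s)   # text up to and including the first dot
  rest       <- sub(".*?\\.", "", s)   # text after the first dot
  rest       <- gsub("\\.", "", rest)  # drop every remaining dot
  # Strings without a dot are returned as they are
  ifelse(has_dot, paste0(first_part, rest), s)
}

keep_first_dot(c("3.99-0.13", "D.235.2", "no dots here"))
```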