I want to remove the second and subsequent occurrences of the decimal point in a string. My attempt is below:
library(stringr)
str_remove(string = "3.99-0.13", pattern = "\\.")
[1] "399-0.13"
sub("\\.", "", "3.99-0.13")
[1] "399-0.13"
However, I want the output to be 3.99-013. Any hints, please?
CodePudding user response:
We could locate all occurrences of ".", exclude the first, and remove the rest:
library(stringi)
x <- c("3.99-0.13.2.2", "3.990.13")
stri_sub_replace_all(x,
lapply(stri_locate_all_regex(x, '\\.'), \(x) x[-1,, drop = FALSE]),
replacement = "")
Output:
"3.99-01322" "3.99013"
Note: stringr's str_sub doesn't seem to have a replace_all option, so we'll need stringi for this (that said, stringr::str_locate_all could be used instead of stri_locate_all_regex if you prefer).
Update: Now works with <= 2 occurrences.
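For readers without stringi, the same locate/exclude/remove idea can be sketched in base R. This is only an illustrative sketch, not part of the answer above, and the helper name remove_extra_dots is hypothetical:

```r
# Hypothetical helper (not from stringi or stringr):
# drop every "." after the first one in a single string.
remove_extra_dots <- function(s) {
  chars <- strsplit(s, "", fixed = TRUE)[[1]]  # split into single characters
  dots  <- which(chars == ".")                 # locate all dots
  # Guard: with fewer than two dots there is nothing to drop, and
  # chars[-integer(0)] would return an empty vector instead of chars.
  if (length(dots) > 1) chars <- chars[-dots[-1]]
  paste(chars, collapse = "")
}

x <- c("3.99-0.13.2.2", "3.990.13")
result <- vapply(x, remove_extra_dots, character(1), USE.NAMES = FALSE)
print(result)
# [1] "3.99-01322" "3.99013"
```

Because the helper handles one string at a time, vapply is used to apply it across the vector; strings with zero or one dot come back unchanged.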
CodePudding user response:
An approach with sub and gsub, using simple regex patterns, that works on a variety of possible inputs.
Extract the first part (everything up to and including the first dot), then remove all dots from the remainder, and finally paste the two together.
Data
stri <- c("3.99-0.13", "393.99.0.13.0.0", ".832.723.723", "3.Ud.2349_3.",
"D.235.2")
stri
[1] "3.99-0.13" "393.99.0.13.0.0" ".832.723.723" "3.Ud.2349_3."
[5] "D.235.2"
Apply
paste0(sub("\\..*", ".", stri), gsub("\\.", "", sub(".*?\\.", "", stri)))
[1] "3.99-013" "393.9901300" ".832723723" "3.Ud2349_3" "D.2352"
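One caveat: if a string contains no dot at all, both sub calls above return it unchanged, so paste0 would duplicate it ("abc" would become "abcabc"). A minimal sketch of the same split-and-paste idea that tolerates such inputs, assuming base R only; the function name keep_first_dot is hypothetical:

```r
# Hypothetical helper: keep the first "." and remove all later ones.
keep_first_dot <- function(x) {
  # Position of the first literal dot in each string; -1 when there is none.
  first <- as.integer(regexpr(".", x, fixed = TRUE))
  # Everything up to and including the first dot ("" when first is -1).
  head <- substr(x, 1, first)
  # Everything after the first dot (the whole string when first is -1).
  tail <- substr(x, first + 1, nchar(x))
  paste0(head, gsub(".", "", tail, fixed = TRUE))
}

stri <- c("3.99-0.13", "393.99.0.13.0.0", ".832.723.723", "3.Ud.2349_3.",
          "D.235.2")
print(keep_first_dot(stri))
# [1] "3.99-013"    "393.9901300" ".832723723"  "3.Ud2349_3"  "D.2352"
print(keep_first_dot("abc"))
# [1] "abc"
```

Using fixed = TRUE treats the dot as a literal character, so no escaping is needed.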