I am struggling with gsub and regular expressions in R and need some help. I have a data frame in R whose second column contains alphanumeric codes. I want to insert a dot after the third character in codes that are four or five characters long, and leave three-character codes untouched. My input is:
ID | code |
---|---|
1 | C443 |
2 | B479 |
3 | E53 |
4 | S9200 |
5 | M8199 |
My required output is:
ID | code |
---|---|
1 | C44.3 |
2 | B47.9 |
3 | E53 |
4 | S92.00 |
5 | M81.99 |
This is what I am trying, but it also adds a dot to the code for ID 3:
library(dplyr)
a <- a %>% mutate(code = as.numeric(paste0(substr(code,1,3),".",substr(code,4,nchar(code)))))
Thanks for the help
CodePudding user response:
A regular expression handles this neatly:
library(dplyr)
a %>%
  mutate(code = gsub("(^[A-Z][0-9]{2})([0-9]{1,2})", "\\1.\\2", code))
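To see what the pattern does outside of a data frame, here is a minimal base R sketch on the codes from the question. The vector name `codes` is made up for illustration, and `sub()` is used instead of `gsub()` only because a single replacement per code is enough. The first group captures the leading letter plus two digits, and the second group captures the one or two digits that follow; a three-character code has nothing left for the second group, so it does not match and is returned unchanged.

```r
# Hypothetical vector holding the codes from the question
codes <- c("C443", "B479", "E53", "S9200", "M8199")

# Group 1: a leading capital letter and two digits
# Group 2: the one or two digits that follow
# "E53" has no fourth character, so the pattern does not match
# and that code is left as it is.
dotted <- sub("^([A-Z][0-9]{2})([0-9]{1,2})", "\\1.\\2", codes)
print(dotted)
```

Because the second group requires at least one digit, no length check is needed before the substitution.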
CodePudding user response:
You could add an if_else() to the existing code.
library(dplyr)
df <-
data.frame(id = c(1, 2, 3, 4, 5),
code = c("C443", "B479", "E53", "S9200", "M81999"))
df <-
df %>% mutate(code = if_else(nchar(code) > 3, paste0(
substr(code, 1, 3), ".", substr(code, 4, nchar(code))
), code))
df
#> id code
#> 1 1 C44.3
#> 2 2 B47.9
#> 3 3 E53
#> 4 4 S92.00
#> 5 5 M81.999
Created on 2021-10-01 by the reprex package (v2.0.1)
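The reason the original attempt also changes the three-character code is that paste0() always inserts the dot, even when substr(code, 4, nchar(code)) comes back as an empty string. As a rough sketch of the same length check without dplyr (the names `codes` and `needs_dot` are made up here), logical indexing can restrict the paste to the longer codes:

```r
# Hypothetical vector with the same codes as the data frame above
codes <- c("C443", "B479", "E53", "S9200", "M81999")

# TRUE only for codes longer than three characters
needs_dot <- nchar(codes) > 3

# Rebuild just those codes: first three characters, a dot, then the rest
codes[needs_dot] <- paste0(
  substr(codes[needs_dot], 1, 3), ".",
  substring(codes[needs_dot], 4)
)
print(codes)
```

This is only one way to express the condition; the if_else() version above keeps everything inside mutate(), which may be preferable in a longer pipeline.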
CodePudding user response:
Using str_replace
library(stringr)
library(dplyr)
df %>%
mutate(code = str_replace(code, "(\\d{2})(\\d+)", "\\1.\\2"))
id code
1 1 C44.3
2 2 B47.9
3 3 E53
4 4 S92.00
5 5 M81.999
data
df <- structure(list(id = c(1, 2, 3, 4, 5), code = c("C443", "B479",
"E53", "S9200", "M81999")), class = "data.frame", row.names = c(NA,
-5L))