Placing a dot in an alphanumeric column of a dataframe using R

Time:10-02

I am struggling with gsub and regular expressions in R and need help. I have a data frame in R whose second column contains alphanumeric codes. I want to place a dot after the third character in codes that are four or five characters long, and leave three-character codes untouched. My input is:

ID code
1 C443
2 B479
3 E53
4 S9200
5 M8199

My required output is:

ID code
1 C44.3
2 B47.9
3 E53
4 S92.00
5 M81.99

I am trying the following, but it also puts a dot in the code for the third ID:

library(dplyr)
a <- a %>% mutate(code = paste0(substr(code, 1, 3), ".", substr(code, 4, nchar(code))))

Thanks for the help

CodePudding user response:

This is a nice way of doing so using RegEx:

a %>%
  mutate(code = gsub("(^[A-Z][0-9]{2})([0-9]{1,2})", "\\1.\\2", code))
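
To see why a three-character code is left alone, here is a minimal base R sketch of the same idea, assuming every code is one uppercase letter followed by digits. The names codes and dotted are only illustrative, and the pattern is a slightly stricter variant of the one above (anchored at both ends), not the answerer's exact expression.

```r
# Sample codes from the question
codes <- c("C443", "B479", "E53", "S9200", "M8199")

# Group 1: a letter plus two digits (the first three characters).
# Group 2: one or more further digits.
# "E53" has no digit after the third character, so the pattern does not
# match and sub() returns it unchanged.
dotted <- sub("^([A-Z][0-9]{2})([0-9]+)$", "\\1.\\2", codes)

print(dotted)
```

Because the second group requires at least one digit, no length check is needed before the substitution.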

CodePudding user response:

You could add an if_else to the existing code.

library(dplyr)

df <-
  data.frame(id = c(1, 2, 3, 4, 5),
             code = c("C443", "B479", "E53", "S9200", "M81999"))
df <-
  df %>% mutate(code = if_else(nchar(code) > 3, paste0(
    substr(code, 1, 3), ".", substr(code, 4, nchar(code))
  ), code))
df
#>   id    code
#> 1  1   C44.3
#> 2  2   B47.9
#> 3  3     E53
#> 4  4  S92.00
#> 5  5 M81.999

Created on 2021-10-01 by the reprex package (v2.0.1)
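
The same length check can be done without dplyr. The sketch below is one possible base R version, run on the codes from the question. The helper name add_dot is hypothetical and not part of any package.

```r
# Hypothetical helper: insert a dot after the third character,
# but only for codes longer than three characters.
add_dot <- function(code) {
  code <- as.character(code)
  long <- which(nchar(code) > 3)  # positions of codes that need a dot
  code[long] <- paste0(substr(code[long], 1, 3), ".", substring(code[long], 4))
  code
}

df <- data.frame(id = 1:5,
                 code = c("C443", "B479", "E53", "S9200", "M8199"))
df$code <- add_dot(df$code)
print(df)
```

Using which() keeps the three-character codes out of the paste0() call entirely, so they pass through untouched.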

CodePudding user response:

Using str_replace

library(stringr)
library(dplyr)
df %>% 
    mutate(code = str_replace(code, "(\\d{2})(\\d+)", "\\1.\\2"))
  id    code
1  1   C44.3
2  2   B47.9
3  3     E53
4  4  S92.00
5  5 M81.999

data

df <- structure(list(id = c(1, 2, 3, 4, 5), code = c("C443", "B479", 
"E53", "S9200", "M81999")), class = "data.frame", row.names = c(NA, 
-5L))