Home > Software design >  What regular expression function helps add a number to existing number? [duplicate]
What regular expression function helps add a number to existing number? [duplicate]

Time:09-21

I have a column in my dataset with a bunch of numbers that are either five or six numbers long like below. However, all these need to be six numbers long. The dataset I have is from combining multiple excels together and the lead "0" in the five number long numbers disappeared in some of the excel but not in other, so I need to readd it to the five number long ones. So I need a function that finds any number that is five numbers long and then adds a 0 at the beginning so "49302" turns into 049302", but leaves the six number long observations as is. Any thoughts on how using gsub or another function to do this on a large dataset?

49302 038503 594030 39845 503932 334922 49403 84034

CodePudding user response:

x <- c(49302,038503,594030,39845,503932,334922,49403,84034)

df <-
  tibble(x = x)

library(dplyr)

df %>% 
  mutate(x = if_else(nchar(x)==5,paste0(0,x),as.character(x)))

# A tibble: 8 x 1
  x     
  <chr> 
1 049302
2 038503
3 594030
4 039845
5 503932
6 334922
7 049403
8 084034

CodePudding user response:

We can use sprintf

df1$col1 <- sprintf('d', as.numeric(df1$col1))

-output

> df1
    col1
1 049302
2 038503
3 594030
4 039845
5 503932
6 334922
7 049403
8 084034

Or if we need a regex, capture the ones having 5 digits from the start (^) till the end ($) and replace with 0 and its backreference (\\1)

df1$col1 <-  sub("^(.....)$", "0\\1", df1$col1)
df1$col1
[1] "049302" "038503" "594030" "039845" "503932" "334922" "049403" "084034"

data

df1 <- data.frame(col1 = c(49302, 38503, 594030, 39845, 503932, 334922, 49403, 84034))
  • Related