How to manipulate digits in a character string in R?

Time:11-11

I feel like I have a super easy question, but for the life of me I can't find an answer by googling or searching here (or I don't know the correct terms to search for), so here goes.

I have a large amount of text in R in which I want to identify all numbers/digits, and add a specific number to them, for example 5.

So just as a small example, if this were my text:

text <- c("Hi. It is 6am. I want to leave at 7am")

I want the output to be:

> text
[1] "Hi. It is 11am. I want to leave at 12am"

But I also need the addition applied to each individual digit, so if this is the text:

text <- c("Hi. It is 2017. I am 35 years old.")

...I want the output to be:

> text
[1] "Hi. It is 75612. I am 810 years old."

I have tried 'grabbing' the numbers from the string and adding 5, but I don't know how to then get them back into the original string so I can get the full text back.

How should I go about this? Thanks in advance!

CodePudding user response:

Here is how I would handle the times. I search for a number followed by am or pm and substitute in a math expression that gsubfn then evaluates. This is pretty flexible, but the current implementation only handles whole hours. I added an am_pm argument in case you want to swap those, but I didn't try to detect when the shift should change am to pm. Also note that I didn't code in rolling over from 12 to 1: if the result goes over 12, you will get a number bigger than 12.

text1 <- c("Hi. It is 6am. I want to leave at 7am")
text2 <- c("It is 9am. I want to leave at 10am, but the cab comes at 11am. Can I push my flight to 12am?")

change_time <- function(text, hours, sign, am_pm){
  string_change <- glue::glue("`(\\1{sign}{hours})`{am_pm}")
  
  gsub("(\\d+)(?=am|pm)(am|pm)", string_change, text, perl = TRUE) |>
  gsubfn::fn$c()
}

change_time(text = text1, hours = 5, sign = "+", am_pm = "am")
#> [1] "Hi. It is 11am. I want to leave at 12am"

change_time(text = text2, hours = 3, sign = "-", am_pm = "pm")
#> [1] "It is 6pm. I want to leave at 7pm, but the cab comes at 8pm. Can I push my flight to 9pm?"
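
The answer above leaves rolling over from 12 to 1 and switching between am and pm unhandled. One possible way to cover both, using base R only, is sketched here. `shift_clock` is a hypothetical helper, not part of the answer's code, and it assumes whole-hour times written like `9am` or `11pm`.

```r
# Hypothetical helper (not from the answer): shift whole-hour 12-hour clock
# times such as "11am", wrapping past 12 and flipping am/pm when needed.
shift_clock <- function(text, hours) {
  m <- gregexpr("[0-9]+(am|pm)", text)
  regmatches(text, m) <- lapply(regmatches(text, m), function(times) {
    hour  <- as.integer(sub("(am|pm)$", "", times))
    is_pm <- grepl("pm$", times)
    # Convert to hours since midnight (12am -> 0, 12pm -> 12), then shift.
    h24 <- (hour %% 12 + ifelse(is_pm, 12, 0) + hours) %% 24
    # Convert back to a 12-hour label.
    paste0(ifelse(h24 %% 12 == 0, 12, h24 %% 12), ifelse(h24 >= 12, "pm", "am"))
  })
  text
}

shift_clock("It is 9am. Can I push my flight to 11pm?", 5)
#> [1] "It is 2pm. Can I push my flight to 4am?"
```

Going through hours since midnight keeps the arithmetic in one place, so negative shifts work the same way as positive ones.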

CodePudding user response:

library(magrittr)  # provides the %>% pipe used below

text1 <- c("Hi. It is 2017. I am 35 years old.")
text2 <- c("Hi. It is 6am. I want to leave at 7am")

change_number <- function(text, change, sign){
  # Wrap every digit in a backtick-quoted expression for gsubfn to evaluate
  string_change <- glue::glue("`(\\1{sign}{change})`")
  gsub("(\\d)", string_change, text, perl = TRUE) %>%
    gsubfn::fn$c()
}

change_number(text = text1, change = 5, sign = "+")
#> [1] "Hi. It is 75612. I am 810 years old."

change_number(text = text2, change = 5, sign = "+")
#> [1] "Hi. It is 11am. I want to leave at 12am"

This works perfectly. Many thanks to @AndS.! I tweaked (or rather, simplified) your code to fit my needs better. I was determined to figure out the other text myself haha, so thanks for showing me how!
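
To see what the `gsub()` step hands over for evaluation, it may help to look at the intermediate string on its own. This is a minimal sketch that builds the template with `paste0()` instead of glue so it needs only base R. The names `op`, `template`, and `intermediate` are illustrative, not from the answers.

```r
# Build the same kind of replacement template as the answers, using paste0()
# (an assumption made here so the step runs without glue installed).
op     <- "+"
change <- 5
template <- paste0("`(\\1", op, change, ")`")

# Each matched digit is wrapped in a backtick-quoted arithmetic expression.
intermediate <- gsub("(\\d)", template, "I am 35 years old.", perl = TRUE)
print(intermediate)
#> [1] "I am `(3+5)``(5+5)` years old."
```

As far as I understand the gsubfn documentation, `fn$` then interpolates character arguments by evaluating each backtick-quoted expression, which is how `(3+5)` and `(5+5)` end up as `810` in the final string.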

CodePudding user response:

Something quick and dirty with base R:

add_n = \(x, n, by_digit = FALSE) {
  # Match single digits, or whole runs of digits
  if (by_digit) ptrn = "[0-9]" else ptrn = "[0-9]+"
  tmp       = gregexpr(ptrn, x)
  raw       = regmatches(x, tmp)
  raw_plusn = lapply(raw, \(x) as.integer(x) + n)
  for (i in seq_along(x)) {
    regmatches(x[i], tmp[i]) <- raw_plusn[i]
  }
  x
}

text = c(
  "Hi. It is 6am. I want to leave at 7am", 
  "wow it's 505 dollars and 19 cents",
  "Hi. It is 2017. I am 35 years old."
)

> add_n(text, 5)
# [1] "Hi. It is 11am. I want to leave at 12am"
# [2] "wow it's 510 dollars and 24 cents"      
# [3] "Hi. It is 2022. I am 40 years old."     

> add_n(text, -2)
# [1] "Hi. It is 4am. I want to leave at 5am"
# [2] "wow it's 503 dollars and 17 cents"
# [3] "Hi. It is 2015. I am 33 years old."

> add_n(text, 5, by_digit = TRUE)
# [1] "Hi. It is 11am. I want to leave at 12am"
# [2] "wow it's 10510 dollars and 614 cents"   
# [3] "Hi. It is 75612. I am 810 years old."  
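
The same idea can probably be written without the explicit loop, since `regmatches<-` accepts a list of replacements when the match data comes from `gregexpr()`. A minimal sketch follows. `add_n_vec` is a hypothetical name, and the sketch assumes `n` is a whole number. Keeping the arithmetic in integers avoids results such as `1e+05` when a sum lands on a round value.

```r
# Hypothetical loop-free variant of add_n(); assumes n is a whole number.
add_n_vec <- function(x, n, by_digit = FALSE) {
  pattern <- if (by_digit) "[0-9]" else "[0-9]+"
  m <- gregexpr(pattern, x)
  # One character vector of replacements per input string.
  regmatches(x, m) <- lapply(
    regmatches(x, m),
    function(d) as.character(as.integer(d) + as.integer(n))
  )
  x
}

add_n_vec(c("It is 6am", "no digits", "2017 and 35"), 5)
#> [1] "It is 11am"  "no digits"   "2022 and 40"
```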