How do I reverse characters within nested brackets in R?

I am trying to solve the following problem:

"Write a function that reverses characters in (possibly nested) parentheses in the input string.

Input strings will always be well-formed with matching ()s.

Example

For

 inputString = "(bar)",

the output should be

 solution(inputString) = "rab";

For

 inputString = "foo(bar)baz",

the output should be

 solution(inputString) = "foorabbaz";

For

 inputString = "foo(bar)baz(blim)",

the output should be

 solution(inputString) = "foorabbazmilb";

For

 inputString = "foo(bar(baz))blim",

the output should be

 solution(inputString) = "foobazrabblim".

Because

"foo(bar(baz))blim"

becomes

"foo(barzab)blim"

and then

"foobazrabblim".

Now I have managed to solve the problem for the simple case when there is just one pair of brackets – i.e. unnested and without a second pair. My code:

solution <- function(inputString) {
  a <- unlist(strsplit(x=inputString,split=""))
  bracket.indices <- grep(pattern="\\(|\\)",x=a)
  a[(bracket.indices[1] + 1):(bracket.indices[2] - 1)] <- rev(a[(bracket.indices[1] + 1):(bracket.indices[2] - 1)])
  return(paste(a[-bracket.indices], collapse = ""))
}

So I first split the string so that I can access individual elements by indices. Next, I use grep to identify the indices of the brackets, and then I use those indices to access the characters within the brackets and reverse them, using rev(). Finally, I get rid of the brackets and then use paste() to collapse the split string back down into a normal string. Obviously, if there is a second pair of brackets – e.g. we have

 inputString = "foo(bar)baz(blim)"

my code won't work because I've assumed bracket.indices has just two elements and accessed them accordingly. What's more, my code obviously won't work for nested brackets because the contents of nested brackets need to be reversed altogether with the contents of outer brackets.

Probably in solving the problem for this simple case I have just distorted the proper solution, but since the larger problem is a bit baffling to me, going about it in the simple case is the best place I could think to start. Any help? (Base R would be preferred)

Answer:

1) Assume that the input is a character string x, that every (...) occurrence contains only word characters and other (...) groups, and that the parentheses are balanced. Then, while x still contains a (, match each innermost pair of parentheses and reverse the word characters inside it (\w; see ?regex for the definition) using gsubfn. gsubfn is like gsub except that the replacement can be a function, which receives the capture groups of the match as input and returns the replacement string.

The strrev function defined below reverses a string. See https://www.r-bloggers.com/2019/05/four-ways-to-reverse-a-string-in-r/ and "How to Reverse a string in R" for that and several other ways to reverse a string.
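For comparison, one of those other ways can be sketched in base R by splitting the string into single characters. strrev_split is a hypothetical name used only for this illustration; it is not part of the answer's code.

```r
# Hypothetical alternative to strrev: split into characters, reverse, rejoin.
strrev_split <- function(x) {
  paste(rev(strsplit(x, "", fixed = TRUE)[[1]]), collapse = "")
}

strrev_split("bar")
## [1] "rab"
```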

library(gsubfn)

strrev <- function(x) intToUtf8(rev(utf8ToInt(x)))

rev_paren <- function(x) {
  while(grepl("(", x, fixed = TRUE)) {
    x <- gsubfn("\\((\\w*?)\\)", strrev, x)
  }
  x
}

rev_paren("foo(bar(baz))blim")
## [1] "foobazrabblim"
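To see what a single pass of that loop does without installing gsubfn, here is a minimal base R sketch. one_pass is a hypothetical helper name, not part of the answer, and it relies on the same assumption as the pattern in 1): the innermost groups contain only word characters.

```r
strrev <- function(x) intToUtf8(rev(utf8ToInt(x)))

# Hypothetical helper: reverse every innermost "(...)" group exactly once.
one_pass <- function(x) {
  # Locate all groups that contain no further parentheses
  m <- gregexpr("\\(\\w*\\)", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    # Drop the surrounding parentheses, then reverse what is left
    inner <- substr(hits, 2, nchar(hits) - 1)
    vapply(inner, strrev, character(1), USE.NAMES = FALSE)
  })
  x
}

one_pass("foo(bar(baz))blim")
## [1] "foo(barzab)blim"

one_pass(one_pass("foo(bar(baz))blim"))
## [1] "foobazrabblim"
```

Each call peels off one nesting level, which is why the loop in 1) keeps going until no ( is left.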

2) A variation that uses recursion instead of a loop:

library(gsubfn)

strrev <- function(x) intToUtf8(rev(utf8ToInt(x)))

rev_paren <- function(x) {
  if (grepl("(", x, fixed = TRUE))
    Recall(gsubfn("\\((\\w*?)\\)", strrev, x))
  else x
}

rev_paren("foo(bar(baz))blim")
## [1] "foobazrabblim"

3) Here is a base solution. It is longer than those above but has no dependencies.

strrev <- function(x) intToUtf8(rev(utf8ToInt(x)))

rev_paren <- function(x) {
  while(grepl("(", x, fixed = TRUE)) {
    s <- strcapture("\\((\\w*)\\)", x, list(character(0)))[[1]]
    x <- sub(sprintf("(%s)", s), strrev(s), x, fixed = TRUE)
  }
  x
}

rev_paren("foo(bar(baz))blim")
## [1] "foobazrabblim"

character vector

If v is a character vector, any of these versions can be applied to each of its elements with either of the following.

sapply(v, rev_paren)

Vectorize(rev_paren)(v)
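
As a further alternative that is not taken from the solutions above, the same result can be obtained in a single pass over the characters by keeping a stack of open-bracket positions. This is only a sketch under the question's assumption that the parentheses are balanced; rev_paren_stack is a hypothetical name. Because it uses no \w pattern, it does not depend on which characters sit between the brackets.

```r
# Hypothetical base R sketch: one pass with a stack of "(" positions.
rev_paren_stack <- function(x) {
  chars <- strsplit(x, "", fixed = TRUE)[[1]]
  out   <- character(0)  # characters kept so far (brackets are dropped)
  opens <- integer(0)    # stack: length of `out` when each "(" was seen
  for (ch in chars) {
    if (ch == "(") {
      opens <- c(opens, length(out))
    } else if (ch == ")") {
      start <- opens[length(opens)]    # position just before the group
      opens <- opens[-length(opens)]   # pop the stack
      if (length(out) > start) {       # skip empty groups such as "()"
        idx <- (start + 1):length(out)
        out[idx] <- rev(out[idx])
      }
    } else {
      out <- c(out, ch)
    }
  }
  paste(out, collapse = "")
}

v <- c("(bar)", "foo(bar)baz", "foo(bar)baz(blim)", "foo(bar(baz))blim")
vapply(v, rev_paren_stack, character(1), USE.NAMES = FALSE)
## [1] "rab"           "foorabbaz"     "foorabbazmilb" "foobazrabblim"
```

Using vapply with USE.NAMES = FALSE returns a plain character vector, whereas sapply(v, ...) would name each result after its input string.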