Generate all possible combinations of a text string with two specific letters substituted for each o

Time:06-23

Using R, I have generated several strings of letters, each 6 to 25 characters long. For each string, I'd like to generate all the combinations in which every "I" may be substituted with an "L" and vice versa. The order of the characters should stay the same.

E.g.

Input

"IVGLWEA"

Output

"IVGLWEA" "LVGLWEA" "LVGIWEA" "IVGIWEA"

many thanks

rob

CodePudding user response:

Edit: Thanks to @Skaqqs for the dynamic solution!

string <- "IVGLWEA"

# count the I's and L's in the string
# (regmatches() gives character(0) when there is no match, so n can be 0)
n <- lengths(regmatches(string, gregexpr("I|L", string)))
# make a grid of all possible combinations of that many I's and L's
df <- expand.grid(rep(list(c("I", "L")), n), stringsAsFactors = FALSE)

# replace each I or L with a %s placeholder
string_ <- gsub("I|L", "%s", string)
# fill the placeholders with the letters from each row of the grid
do.call(sprintf, as.list(c(string_, df)))

Result:

[1] "IVGIWEA" "LVGIWEA" "IVGLWEA" "LVGLWEA"
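
The same steps can be wrapped in a small helper so they are easy to reuse on other strings. This is only a sketch: the name swap_variants and its letters2 argument are hypothetical, and it assumes the input contains letters only (a literal % would confuse sprintf). A string with k I's and L's yields 2^k variants, so long strings made up mostly of those letters can get large.

```r
# Hypothetical helper (not from the answer above): returns every variant of
# `string` in which each occurrence of the two letters is set to either one.
swap_variants <- function(string, letters2 = c("I", "L")) {
  pattern <- paste(letters2, collapse = "|")
  # Count the target letters; regmatches() gives character(0) on no match
  n <- lengths(regmatches(string, gregexpr(pattern, string)))
  if (n == 0) return(string)
  # One row per combination of letters, one column per target position
  grid <- expand.grid(rep(list(letters2), n), stringsAsFactors = FALSE)
  # Turn the string into a sprintf template and fill it row by row
  template <- gsub(pattern, "%s", string)
  do.call(sprintf, c(list(template), unname(as.list(grid))))
}

swap_variants("IVGLWEA")
```

Returning the input unchanged when there is nothing to swap is a design choice made for this sketch; returning character(0) would be equally defensible.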

CodePudding user response:

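Here's an extremely inefficient (but concise!) approach:

Create every possible string from your input characters, then use a regex to keep only those that match the desired pattern. With 7 characters in 7 positions that is 7^7 = 823,543 candidate strings, so this is only practical for short inputs.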

pattern <- "(I|L)VG(I|L)WEA"
b <- c("I", "V", "G", "L", "W", "E", "A")

strings <- apply(expand.grid(rep(list(b), 7)), 1, paste0, collapse = "")
grep(pattern, strings, value = TRUE)

Result:

[1] "IVGIWEA" "LVGIWEA" "IVGLWEA" "LVGLWEA"

CodePudding user response:

Here is a solution in R. I'm sure others can provide something more concise and versatile, but I think this code works for your specific scenario of finding every combination of replacement for two letters. The result is returned in a single vector, but you can remove unlist() if you want each word separated.

I hope this helps!

# The function below works only for 2 target letters - 
# it would have to be extended to work with more
target_letters <- c("I", "L")
words <- c("IVGLWEA", "SIINFEKL")

unlist(lapply(words, function(word) {
  all_letters <- strsplit(word, '')[[1]]
  target_letter_positions <- which(all_letters %in% target_letters)
  all_letters[target_letter_positions] <- target_letters[1]
  c(paste(all_letters, collapse = ''),
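    lapply(seq_along(target_letter_positions), function(m) {
      # Choose by index: combn() would expand a single position k into 1:k
      combn(
        x = length(target_letter_positions),
        m = m,
        FUN = function(idx) {
          s <- all_letters
          s[target_letter_positions[idx]] <- target_letters[2]
          paste(s, collapse = '')
        }
      )
    })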
  )
}))

[1] "IVGIWEA" "LVGIWEA" "IVGLWEA" "LVGLWEA" "SIINFEKI" "SLINFEKI" "SILNFEKI" "SIINFEKL" "SLLNFEKI" "SLINFEKL" "SILNFEKL" "SLLNFEKL"

You can strip back the outer lapply to simplify if you only want to use it on one word at a time, and can turn it into a function to be applied to a table column.
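
As a rough illustration of that last suggestion, here is one way the per-word logic might be applied to a table column. Everything named here is an assumption made for the example: the helper il_variants, the data frame peptides and its columns id and seq are all hypothetical. The helper fills the target positions from an expand.grid of choices rather than using combn, so the variants come out in a different order than in the code above.

```r
# Hypothetical helper: all I/L variants of a single word
il_variants <- function(word, target_letters = c("I", "L")) {
  chars <- strsplit(word, "")[[1]]
  pos <- which(chars %in% target_letters)
  if (length(pos) == 0) return(word)
  # One row per way of filling the target positions
  grid <- expand.grid(rep(list(target_letters), length(pos)),
                      stringsAsFactors = FALSE)
  unname(apply(grid, 1, function(choice) {
    chars[pos] <- choice
    paste(chars, collapse = "")
  }))
}

# Hypothetical table with one sequence per row
peptides <- data.frame(id = 1:2,
                       seq = c("IVGLWEA", "SIINFEKL"),
                       stringsAsFactors = FALSE)

# One character vector of variants per row, then stacked into long format
variants <- lapply(peptides$seq, il_variants)
long <- data.frame(id = rep(peptides$id, lengths(variants)),
                   variant = unlist(variants),
                   stringsAsFactors = FALSE)
long
```

Keeping the id next to each variant makes it possible to trace every generated string back to its source row.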

All the best, IW
