concatenating list of strings when they're split

Time:06-09

I have developed a function that reformulates text into a usable expression for a linear model formula. However, when a formula is too long, it gets split across several strings. I am therefore trying to concatenate the strings that were split.

With some example data:

[[4]]$X7
[1] "lifespan ~ exposure + danger + body + brain"

[[4]]$X8
[1] "lifespan ~ danger + body + brain + nondream"

[[4]]$X9
[1] "lifespan ~ body + brain + nondream + dream"


[[8]]$X8
[1] "lifespan ~ danger + body + brain + nondream + dream + sleep + "
[2] "    gestation + predation"

[[8]]$X9
[1] "lifespan ~ body + brain + nondream + dream + sleep + gestation + "
[2] "    predation + exposure"

I tried the following:

lapply(formula_pred, lapply(x)if(length(x)>= 2){paste(x, collapse="")})

However, this would just paste all the strings in a list into one string per list.

For example, I get something like this:

[[1]]
[1] "lifespan ~ brainlifespan ~ nondreamlifespan ~ dreamlifespan ~ sleeplifespan ~ gestationlifespan ~ predationlifespan ~ exposurelifespan ~ dangerlifespan ~ body"

Expected output:

[[4]]$X7
[1] "lifespan ~ exposure + danger + body + brain"

[[4]]$X8
[1] "lifespan ~ danger + body + brain + nondream"

[[4]]$X9
[1] "lifespan ~ body + brain + nondream + dream"


[[8]]$X8
[1] "lifespan ~ danger + body + brain + nondream + dream + sleep + gestation + predation"

[[8]]$X9
[1] "lifespan ~ body + brain + nondream + dream + sleep + gestation + predation + exposure"

Reproducible code:

formula_pred <- list(list(X1 = "lifespan ~ brain", X2 = "lifespan ~ nondream", 
    X3 = "lifespan ~ dream", X4 = "lifespan ~ sleep", X5 = "lifespan ~ gestation", 
    X6 = "lifespan ~ predation", X7 = "lifespan ~ exposure", 
    X8 = "lifespan ~ danger", X9 = "lifespan ~ body"), list(X1 = c("lifespan ~ brain + nondream + dream + sleep + gestation + predation + ", 
"    exposure + danger"), X2 = c("lifespan ~ nondream + dream + sleep + gestation + predation + ", 
"    exposure + danger + body"), X3 = c("lifespan ~ dream + sleep + gestation + predation + exposure + ", 
"    danger + body + brain"), X4 = c("lifespan ~ sleep + gestation + predation + exposure + danger + ", 
"    body + brain + nondream"), X5 = c("lifespan ~ gestation + predation + exposure + danger + body + ", 
"    brain + nondream + dream"), X6 = c("lifespan ~ predation + exposure + danger + body + brain + nondream + ", 
"    dream + sleep"), X7 = c("lifespan ~ exposure + danger + body + brain + nondream + dream + ", 
"    sleep + gestation"), X8 = c("lifespan ~ danger + body + brain + nondream + dream + sleep + ", 
"    gestation + predation"), X9 = c("lifespan ~ body + brain + nondream + dream + sleep + gestation + ", 
"    predation + exposure")))

CodePudding user response:

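Does this help? Here dt is assumed to hold the nested list from the question's reproducible code.

library(dplyr)
# Flatten the nested list into a data frame; the split formulas land in columns X1.1 to X9.1
as.data.frame(dt) %>%
  # Trim each fragment, then join the fragments of each of those columns with a single space
  mutate(across(matches("\\.\\d"), ~paste0(trimws(.), collapse = " ")))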
                X1                  X2               X3               X4                   X5                   X6
1 lifespan ~ brain lifespan ~ nondream lifespan ~ dream lifespan ~ sleep lifespan ~ gestation lifespan ~ predation
2 lifespan ~ brain lifespan ~ nondream lifespan ~ dream lifespan ~ sleep lifespan ~ gestation lifespan ~ predation
                   X7                X8              X9
1 lifespan ~ exposure lifespan ~ danger lifespan ~ body
2 lifespan ~ exposure lifespan ~ danger lifespan ~ body
                                                                                     X1.1
1 lifespan ~ brain + nondream + dream + sleep + gestation + predation + exposure + danger
2 lifespan ~ brain + nondream + dream + sleep + gestation + predation + exposure + danger
                                                                                    X2.1
1 lifespan ~ nondream + dream + sleep + gestation + predation + exposure + danger + body
2 lifespan ~ nondream + dream + sleep + gestation + predation + exposure + danger + body
                                                                                 X3.1
1 lifespan ~ dream + sleep + gestation + predation + exposure + danger + body + brain
2 lifespan ~ dream + sleep + gestation + predation + exposure + danger + body + brain
                                                                                    X4.1
1 lifespan ~ sleep + gestation + predation + exposure + danger + body + brain + nondream
2 lifespan ~ sleep + gestation + predation + exposure + danger + body + brain + nondream
                                                                                    X5.1
1 lifespan ~ gestation + predation + exposure + danger + body + brain + nondream + dream
2 lifespan ~ gestation + predation + exposure + danger + body + brain + nondream + dream
                                                                                X6.1
1 lifespan ~ predation + exposure + danger + body + brain + nondream + dream + sleep
2 lifespan ~ predation + exposure + danger + body + brain + nondream + dream + sleep
                                                                                X7.1
1 lifespan ~ exposure + danger + body + brain + nondream + dream + sleep + gestation
2 lifespan ~ exposure + danger + body + brain + nondream + dream + sleep + gestation
                                                                                 X8.1
1 lifespan ~ danger + body + brain + nondream + dream + sleep + gestation + predation
2 lifespan ~ danger + body + brain + nondream + dream + sleep + gestation + predation
                                                                                   X9.1
1 lifespan ~ body + brain + nondream + dream + sleep + gestation + predation + exposure
2 lifespan ~ body + brain + nondream + dream + sleep + gestation + predation + exposure

CodePudding user response:

The OP's code has a syntax issue: it uses x without wrapping it in an anonymous function (function(x), or \(x) in R 4.1 and later). In addition, the input is a nested list, so we need two anonymous functions, one for each level of the list:

lapply(formula_pred, function(x) 
  lapply(x, function(y) paste(trimws(y), collapse = ' ')))
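
To see why both levels are needed, here is a minimal sketch on made-up data shaped like the question's list. The name toy_pred and its contents are hypothetical, and the terms are assumed to be joined with +, as they would be in a model formula. Applying paste() only at the outer level glues the separate formulas of a sub-list together, while applying it to each character vector joins only the split pieces:

```r
# Hypothetical data: one sub-list with short formulas, one with a split formula
toy_pred <- list(
  list(X1 = "lifespan ~ brain", X2 = "lifespan ~ body"),
  list(X1 = c("lifespan ~ brain + nondream + ", "    dream + sleep"))
)

# One level only: paste() sees the whole sub-list and merges its formulas
one_level <- lapply(toy_pred, function(x) paste(x, collapse = ""))
one_level[[1]]
# [1] "lifespan ~ brainlifespan ~ body"

# Two levels: the inner function sees one character vector at a time
two_level <- lapply(toy_pred, function(x)
  lapply(x, function(y) paste(trimws(y), collapse = " ")))
two_level[[1]]$X2
# [1] "lifespan ~ body"
two_level[[2]]$X1
# [1] "lifespan ~ brain + nondream + dream + sleep"
```

trimws() removes the trailing blank of the first piece and the indentation of the second, so collapsing with a single space restores the original spacing between terms.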

Or, if we want to use if/else (not really needed, as it gives the same result):

lapply(formula_pred, function(x) 
  lapply(x, function(y)
    if (length(y) >= 2) paste(trimws(y), collapse = ' ') else y))
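
As an alternative that is not part of the answers above, base R's rapply() can visit every character vector in a nested list without writing the two lapply() calls by hand. This is only a sketch on made-up data: toy_pred and join_parts are hypothetical names, and the + separators are assumed.

```r
# Hypothetical data shaped like the question's nested list
toy_pred <- list(
  list(X1 = "lifespan ~ brain"),
  list(X1 = c("lifespan ~ brain + nondream + ", "    dream + sleep"))
)

# Trim each piece, then join the pieces with a single space
join_parts <- function(y) paste(trimws(y), collapse = " ")

# rapply() applies the function to every non-list element;
# how = "list" returns a list with the same nesting as the input
via_rapply <- rapply(toy_pred, join_parts, how = "list")

# The explicit two-level version, for comparison
via_lapply <- lapply(toy_pred, function(x) lapply(x, join_parts))

via_rapply[[2]]$X1
# [1] "lifespan ~ brain + nondream + dream + sleep"
identical(via_rapply, via_lapply)
```

This form also handles lists nested more than two levels deep, which the explicit double lapply() does not.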