Batch-create columns with lapply and regex from a column in an R data.table

Time:01-06

I want to extract the values that follow certain strings in a column and store each one in its own new column. Here is a demo:

dt <- data.table(col.1 = c("a1, b2, c3, d4"))
x <- c("a", "b", "c")

dt[, (x) := lapply(FUN = str_match(string = .SD, 
                                   pattern = paste0("(?<=", x, ")([\\d])"))[, 2], 
                   X = x),
   .SDcols = "col.1"]

The desired result looks like this:

desirable <- data.table(col.1 = c("a1, b2, c3, d4"),
                        a = c("1"),
                        b = c("2"),
                        c = c("3"))

I got the error message below:

*Error in match.fun(FUN) :

c("'str_match(string = .SD, pattern = paste0(\"(?<=\", x, \")([\\\\d])\"))[, ' is not a function, character or symbol", "'    2]' is not a function, character or symbol")*

I couldn't figure out how to fix this problem. Can anyone give me some hints?

CodePudding user response:

lapply needs a function as FUN, not the result of a call. Loop over the patterns and extract the value with str_match:

library(data.table)
library(stringr)
dt[, (x) := lapply(paste0("(?<=", x, ")(\\d+)"),
     \(x) str_match(col.1, x)[, 2])]
            col.1 a b c
1: a1, b2, c3, d4 1 2 3

Or with strcapture, which returns the captured groups as integer columns:

pat <- paste0(sprintf("%s(\\d+)", x), collapse = ".*")
cbind(dt, dt[, strcapture(pat, col.1, setNames(rep(list(integer()), 3), x))])
            col.1 a b c
1: a1, b2, c3, d4 1 2 3
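
For comparison, the same loop-over-patterns idea can be sketched in base R only, without data.table or stringr. This is a minimal sketch, not the answerer's method: the names df, keys, patterns and extract_one are hypothetical, and it uses a plain data.frame. The point it illustrates is that lapply receives a function (extract_one), which is then called once per pattern.

```r
# Hypothetical base R sketch: one new column per key, no extra packages.
df <- data.frame(col.1 = "a1, b2, c3, d4", stringsAsFactors = FALSE)
keys <- c("a", "b", "c")

# One pattern per key: digits that directly follow the key.
patterns <- paste0("(?<=", keys, ")\\d+")

# Return the first match of pattern p in each element of s, NA if none.
extract_one <- function(p, s) {
  m <- regexpr(p, s, perl = TRUE)   # perl = TRUE is required for lookbehind
  out <- rep(NA_character_, length(s))
  out[m != -1] <- regmatches(s, m)  # regmatches returns matched elements only
  out
}

# lapply gets the function itself; s is passed through as an extra argument.
df[keys] <- lapply(patterns, extract_one, s = df$col.1)
print(df)
```

The new columns are character vectors here, as with str_match; wrap the result in as.integer() if numeric columns are needed.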