library(dplyr)
library(data.table)
library(stringr)
test = c('a1b1', 'a2b2', 'a3b3')
result = rbind(c(1,1),
c(2,2),
c(3,3))
result
[,1] [,2]
[1,] 1 1
[2,] 2 2
[3,] 3 3
test2<-do.call(rbind,test %>% str_split('a'))
test3<-do.call(rbind,test2 %>% .[,2] %>% str_split('b'))
test3
[,1] [,2]
[1,] "1" "1"
[2,] "2" "2"
[3,] "3" "3"
Is do.call(rbind, data) not equivalent to rbindlist(data)? data.table::rbindlist does not work here. If I want to use rbindlist, what can I do?
rbindlist(test %>% str_split('a'))
Error in rbindlist(test %>% str_split("a")) :
Item 1 of input is not a data.frame, data.table or list
CodePudding user response:
If you use tstrsplit rather than str_split, the pieces come back as columns rather than rows, so you can use as.data.table rather than rbind-ing them together.
test = c('a1b1', 'a2b2', 'a3b3')
library(data.table)
as.data.table(tstrsplit(tstrsplit(test, 'a')[[2]], 'b'))
#> V1 V2
#> <char> <char>
#> 1: 1 1
#> 2: 2 2
#> 3: 3 3
Created on 2022-02-17 by the reprex package (v2.0.1)
This will be much faster, e.g. about 0.14 seconds vs. almost 19 seconds if the vector has 100,000 elements.
test = c('a1b1', 'a2b2', 'a3b3')
library(data.table)
library(stringr)
library(bench)
test <- sample(test, 1e5, TRUE)
mark(
tstrsplit =
as.data.table(tstrsplit(tstrsplit(test, 'a')[[2]], 'b'))
,
str_split = {
test2 <- rbindlist(test %>% str_split("a") %>% lapply(., function(x)
as.data.table(t(x))))
rbindlist(as.matrix(test2) %>% .[,2] %>% str_split("b") %>% lapply(., function(x)
as.data.table(t(x))))
}
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 tstrsplit 134.8ms 138.7ms 7.16 9.54MB 1.79
#> 2 str_split 18.8s 18.8s 0.0532 3.11GB 2.66
Created on 2022-02-17 by the reprex package (v2.0.1)
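The expected result in the question is numeric, while the tstrsplit output above is character. As a rough sketch (not benchmarked here), the two splits could probably be collapsed into one by splitting on either letter and using tstrsplit's keep and type.convert arguments. The regex "[ab]" and the choice of keep = 2:3 are my assumptions about the input format (always a, digits, b, digits); adjust them if the real data differs.

```r
library(data.table)

test <- c('a1b1', 'a2b2', 'a3b3')

# Split on either 'a' or 'b'; the first piece is the empty string before 'a',
# so keep only pieces 2 and 3. type.convert = TRUE turns "1" into 1L, etc.
res <- as.data.table(
  tstrsplit(test, "[ab]", keep = 2:3, type.convert = TRUE)
)
res
```

If you need a matrix like the one in the question, as.matrix(res) should get you there.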
CodePudding user response:
If you want to use a similar approach using rbindlist, then you could do something like the code below. Essentially, you add a step that turns each item in the list into a data.table (the character vector needs to be transposed first so that it becomes a single row).
library(dplyr)
library(data.table)
library(stringr)
test2 <- rbindlist(test %>% str_split("a") %>% lapply(., function(x)
as.data.table(t(x))))
test3 <- rbindlist(as.matrix(test2) %>% .[,2] %>% str_split("b") %>% lapply(., function(x)
as.data.table(t(x))))
Output
test3
V1 V2
1: 1 1
2: 2 2
3: 3 3
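The error in the question comes from rbindlist wanting each item to be a data.frame, data.table or list, while str_split returns a list of character vectors. A slightly shorter variant of the same idea, as a sketch: wrap each vector with as.list instead of transposing it into a data.table, so rbindlist treats each item as one row. This is only an illustration on the three-element test vector, not something I have timed against the versions above.

```r
library(data.table)
library(stringr)

test <- c('a1b1', 'a2b2', 'a3b3')

# as.list() turns c("", "1b1") into list("", "1b1"), which rbindlist()
# accepts as a single row with two columns.
test2 <- rbindlist(lapply(str_split(test, 'a'), as.list))

# Column 2 holds "1b1", "2b2", "3b3"; split it again on 'b'.
test3 <- rbindlist(lapply(str_split(test2[[2]], 'b'), as.list))
test3
```

The columns are still character here; convert them afterwards (e.g. with as.integer) if you need numbers.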