Extracting the first element from strsplit, applied to each row of a data.table in R

I have the following dataset:

library(data.table)

x <- data.table(a = c(1:3, 1), b = c('12 13', '14 15', '16 17', '18 19'))
> x
   a     b
1: 1 12 13
2: 2 14 15
3: 3 16 17
4: 1 18 19

and I would like to get a new dataset which has

> x
   a     b  c
1: 1 12 13 12
2: 2 14 15 14
3: 3 16 17 16
4: 1 18 19 18

so that the new column c contains the first space-separated element of each string in column b. I tried:

x[,c:=unlist(strsplit(b, " "))[[1]][1]]

but it doesn't work. Is there a way to apply such a thing in data.table?

CodePudding user response:

We can use sapply() along with strsplit() and retain the first element from each vector in the list.

x$c <- sapply(strsplit(x$b, " "), `[[`, 1)
x

   a     b  c
1: 1 12 13 12
2: 2 14 15 14
3: 3 16 17 16
4: 1 18 19 18
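
For context on why the attempt in the question gives the wrong result, here is a minimal sketch (the variable name flat is only illustrative). unlist() flattens the per-row list from strsplit() into one long vector, so the indexing picks a single value that := then recycles to every row. Keeping the list and indexing each element gives one value per row; vapply() is used here as a type-checked stand-in for sapply().

```r
library(data.table)

x <- data.table(a = c(1:3, 1), b = c("12 13", "14 15", "16 17", "18 19"))

# unlist() merges all rows into one character vector of length 8,
# so [[1]][1] is just the very first token of the whole column.
flat <- unlist(strsplit(x$b, " "))
flat[[1]][1]

# Index inside each list element instead: one result per row.
# vapply() also checks that every result is a single string.
x[, c := vapply(strsplit(b, " ", fixed = TRUE), `[`, character(1), 1L)]
x
```

The resulting column c is character; wrap the right-hand side in as.integer() if numbers are needed.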

CodePudding user response:

Use tstrsplit from data.table:

x[,c := tstrsplit(b," ")[1]]
x
   a     b  c
1: 1 12 13 12
2: 2 14 15 14
3: 3 16 17 16
4: 1 18 19 18
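
A slightly more explicit variant of the same idea, as a sketch: tstrsplit() has a keep argument that selects which pieces to return, and a type.convert argument that converts the pieces to a suitable type. The column name c_int below is made up for illustration, and fixed = TRUE is passed through to strsplit() on the assumption that the separator is a literal space.

```r
library(data.table)

x <- data.table(a = c(1:3, 1), b = c("12 13", "14 15", "16 17", "18 19"))

# keep = 1L returns only the first piece of each split string (as character)
x[, c := tstrsplit(b, " ", fixed = TRUE, keep = 1L)]

# type.convert = TRUE additionally converts the piece to integer here
# (c_int is a hypothetical column name used only for this example)
x[, c_int := tstrsplit(b, " ", fixed = TRUE, keep = 1L, type.convert = TRUE)]
x
```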

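Or, because each string starts with a number, use readr::parse_number, which extracts the first number from each string and returns it as a numeric value:
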
x[, c := readr::parse_number(b)]
x 
   a     b  c
1: 1 12 13 12
2: 2 14 15 14
3: 3 16 17 16
4: 1 18 19 18

CodePudding user response:

You can use stringr::str_split_i to take the first element of each split string:

library(stringr)
x[, c := str_split_i(b, " ", 1)]
x
   a     b  c
1: 1 12 13 12
2: 2 14 15 14
3: 3 16 17 16
4: 1 18 19 18
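
If stringr is not installed, or is older than the release that introduced str_split_i (1.5.0, as far as I know), a base R regular expression can produce the same column. This is only a sketch that assumes the separator is a single space: sub() deletes the first space and everything after it, leaving the first token.

```r
library(data.table)

x <- data.table(a = c(1:3, 1), b = c("12 13", "14 15", "16 17", "18 19"))

# Drop the first space and everything that follows it;
# strings without a space are returned unchanged.
x[, c := sub(" .*$", "", b)]
x
```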