R: Can I get the first element for every row in a dataframe?

I have a dataframe that has a column (chr type) like this:

col
"1,3,4,5"
"1,7,2,5"
"8,2,2,9"

How can I use dplyr to create two new variables that hold the first and last element of each value in col?

col        first last
"1,3,4,5"  1     5   
"1,7,2,5"  1     5 
"8,2,2,9"  8     9

CodePudding user response：

You can use a regular expression to replace everything from the first comma through the last comma with a space, which leaves only the first and last elements.

col <- c("1,3,4,5", "1,7,2,5", "8,2,2,9")
read.table(text = sub(",.*,", " ", col))

  V1 V2
1  1  5
2  1  5
3  8  9

library(tidyr)

data.frame(col) %>%
  separate(col, c('v1', 'v2'), ',.*,')

  v1 v2
1  1  5
2  1  5
3  8  9
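
To see why the pattern `,.*,` works: `.*` is greedy, so the match runs from the first comma through the last one, and replacing it with a space leaves only the two outer elements. Here is a minimal base R sketch, assuming every string has at least three comma-separated elements. The names `collapsed` and `parts` are illustrative only.

```r
# Sample data from the question
col <- c("1,3,4,5", "1,7,2,5", "8,2,2,9")

# Greedy match: from the first comma through the last comma
collapsed <- sub(",.*,", " ", col)
print(collapsed)

# Parse the remaining "first last" pairs into two named columns
parts <- read.table(text = collapsed, col.names = c("first", "last"))
print(parts)
```

Strings with fewer than two commas contain no match for this pattern, so `sub()` would leave them unchanged and they would need separate handling.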

Another way:

a <- read.csv(text = col, header = FALSE)
a[c(1, ncol(a))]  # keep only the first and last columns
  V1 V4
1  1  5
2  1  5
3  8  9

CodePudding user response：

A possible solution:

library(tidyverse)

df %>% 
  mutate(first = str_extract(col, "^\\d+"),
         last = str_extract(col, "\\d+$"))

#>       col first last
#> 1 1,3,4,5     1    5
#> 2 1,7,2,5     1    5
#> 3 8,2,2,9     8    9

Another possible solution:

library(tidyverse)

df %>% 
  mutate(id = row_number()) %>% 
  separate_rows(col, sep =",") %>% 
  group_by(id) %>% 
  summarise(first = first(col), last = last(col)) %>% 
  bind_cols(df, .) %>% 
  select(-id)

#>       col first last
#> 1 1,3,4,5     1    5
#> 2 1,7,2,5     1    5
#> 3 8,2,2,9     8    9
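
The same split-then-pick idea can be sketched without reshaping the data frame, assuming base R is acceptable in place of tidyr: split each string on commas and take the first and last piece of each row. The names `pieces`, `first_el`, `last_el`, and `result` are illustrative and not part of the answer above.

```r
df <- data.frame(col = c("1,3,4,5", "1,7,2,5", "8,2,2,9"),
                 stringsAsFactors = FALSE)

# One character vector of elements per row
pieces <- strsplit(df$col, ",", fixed = TRUE)

# Pick the first and last element of each row
first_el <- vapply(pieces, function(p) p[1], character(1))
last_el  <- vapply(pieces, function(p) p[length(p)], character(1))

result <- data.frame(col = df$col, first = first_el, last = last_el,
                     stringsAsFactors = FALSE)
print(result)
```

Because each row is handled on its own, this version needs no temporary `id` column and no grouping step.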

CodePudding user response：

An alternative approach using dplyr and stringr:

library(dplyr)
library(stringr)

df <- data.frame(col = c("1,3,4,5","1,7,2,5","8,2,2,9"))

df %>%
    dplyr::mutate(first = stringr::str_sub(col, start = 1, end = 1),
                  last = stringr::str_sub(col, start = -1, end = -1))

      col first last
1 1,3,4,5     1    5
2 1,7,2,5     1    5
3 8,2,2,9     8    9
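
Note that `str_sub()` with positions 1 and -1 returns single characters, so the approach above relies on the first and last elements being one digit long. If elements can have several digits, one option is to strip everything after the first comma and everything before the last one. Here is a hedged base R sketch with made-up sample values (not taken from the question):

```r
# Hypothetical data with multi-digit elements
col <- c("10,3,4,25", "1,7,2,5", "8,2,2,900")

# Drop everything from the first comma onward
first <- sub(",.*$", "", col)
# Drop everything up to and including the last comma (greedy match)
last <- sub("^.*,", "", col)

result <- data.frame(col, first = as.integer(first), last = as.integer(last))
print(result)
```

Inside a dplyr pipeline, the same two `sub()` calls can be placed in `mutate()`.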