I have a data frame with a character (chr) column like this:
col
"1,3,4,5"
"1,7,2,5"
"8,2,2,9"
How can I use dplyr to create two new variables holding the first and last element of each string in col?
col first last
"1,3,4,5" 1 5
"1,7,2,5" 1 5
"8,2,2,9" 8 9
CodePudding user response:
You can use a regular expression that removes everything between the first and the last comma:
read.table(text=sub(",.*,",' ', col))
V1 V2
1 1 5
2 1 5
3 8 9
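Or pass the same pattern to tidyr::separate as the separator:
library(dplyr)  # for %>%
library(tidyr)  # for separate()
data.frame(col) %>%
  separate(col, c('v1', 'v2'), ',.*,')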
v1 v2
1 1 5
2 1 5
3 8 9
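To see why the `,.*,` pattern above works, here is a minimal base R sketch. The fourth row is a hypothetical multi-digit example that is not in the question. Because `.*` is greedy, the match runs from the first comma to the last one, so only the first and last elements survive. Note the assumption that every string contains at least two commas; a string with a single comma would not match the pattern at all.

```r
# Vector from the question, plus one hypothetical multi-digit row
col <- c("1,3,4,5", "1,7,2,5", "8,2,2,9", "10,20,300")

# Greedy ".*" spans from the first comma to the last comma,
# leaving the first and last elements separated by one space
trimmed <- sub(",.*,", " ", col)

# Parse the two remaining fields into named columns
parts <- read.table(text = trimmed, col.names = c("first", "last"))
parts
```

The column names `first` and `last` are chosen here to match the question; `read.table` would otherwise name them V1 and V2.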
Another way:
a <- read.csv(text = col, header = FALSE)
a[c(1,ncol(a))]
V1 V4
1 1 5
2 1 5
3 8 9
CodePudding user response:
A possible solution:
library(tidyverse)
df %>%
  mutate(first = str_extract(col, "^\\d+"),
         last = str_extract(col, "\\d+$"))
#> col first last
#> 1 1,3,4,5 1 5
#> 2 1,7,2,5 1 5
#> 3 8,2,2,9 8 9
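Note that str_extract() returns character values, so first and last come back as strings. If numeric columns are wanted, a minimal sketch along these lines should work, assuming every element is a non-negative integer (the name `out` is only for illustration):

```r
library(dplyr)
library(stringr)

df <- data.frame(col = c("1,3,4,5", "1,7,2,5", "8,2,2,9"))

out <- df %>%
  mutate(first = as.integer(str_extract(col, "^\\d+")),  # digits at the start
         last  = as.integer(str_extract(col, "\\d+$")))  # digits at the end
out
```

Anchoring with `^` and `$` and matching `\\d+` rather than a single digit means multi-digit elements such as "10" would be extracted whole.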
Another possible solution:
library(tidyverse)
df %>%
mutate(id = row_number()) %>%
separate_rows(col, sep =",") %>%
group_by(id) %>%
summarise(first = first(col), last = last(col)) %>%
bind_cols(df, .) %>%
select(-id)
#> col first last
#> 1 1,3,4,5 1 5
#> 2 1,7,2,5 1 5
#> 3 8,2,2,9 8 9
CodePudding user response:
An alternative approach using dplyr and stringr:
library(dplyr)
library(stringr)
df <- data.frame(col = c("1,3,4,5","1,7,2,5","8,2,2,9"))
df %>%
dplyr::mutate(first = stringr::str_sub(col, start = 1, end = 1),
last = stringr::str_sub(col, start = -1, end = -1))
col first last
1 1,3,4,5 1 5
2 1,7,2,5 1 5
3 8,2,2,9 8 9
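One caveat with str_sub(): it takes single characters, so it only gives the right answer when every element is one character long. For "10,20,300" it would return "1" and "0". A possible variation, sketched here with a hypothetical multi-digit row that is not in the question, uses stringr::word() with a comma separator; a negative position counts from the end:

```r
library(dplyr)
library(stringr)

# Second row is a hypothetical multi-digit example
df2 <- data.frame(col = c("1,3,4,5", "10,20,300"))

out2 <- df2 %>%
  mutate(first = word(col, 1, sep = fixed(",")),   # first comma-separated piece
         last  = word(col, -1, sep = fixed(",")))  # last comma-separated piece
out2
```

As with the other stringr solutions, the new columns are character; wrap them in as.integer() if numbers are needed.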