I want to split the variable "population" into two columns. The first one ("pop1") should contain the first two digits of each value; the second one ("pop2") should contain the last digit.
df <- dplyr::tibble(
city = c("a", "a", "b", "b", "c", "c"),
sex = c(1,0,1,0,1,0),
age = c(1,2,1,2,1,2),
population = c(100, 123, 189, 234, 221, 435),
accidents = c(87, 98, 79, 43,45,65)
)
Expected output:
df <- dplyr::tibble(
city = c("a", "a", "b", "b", "c", "c"),
sex = c(1,0,1,0,1,0),
age = c(1,2,1,2,1,2),
pop1 = c(10, 12, 18, 23, 22, 43),
pop2 = c(0,3,9,4,1,5),
accidents = c(87, 98, 79, 43,45,65)
)
Thanks
CodePudding user response:
A possible solution, using tidyr::separate() with a lookahead regex that splits just before the last digit:
library(tidyverse)
df %>%
separate(population, into = paste0("pop", 1:2), sep = "(?=\\d$)", convert = TRUE)
#> # A tibble: 6 × 6
#> city sex age pop1 pop2 accidents
#> <chr> <dbl> <dbl> <int> <int> <dbl>
#> 1 a 1 1 10 0 87
#> 2 a 0 2 12 3 98
#> 3 b 1 1 18 9 79
#> 4 b 0 2 23 4 43
#> 5 c 1 1 22 1 45
#> 6 c 0 2 43 5 65
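Because population is stored as a number here, the same split can also be sketched with integer arithmetic instead of a regular expression. This is a different technique from separate(), and it assumes every population value is a non-negative whole number; the name `result` is only illustrative.

```r
library(dplyr)

# Data from the question
df <- tibble(
  city = c("a", "a", "b", "b", "c", "c"),
  sex = c(1, 0, 1, 0, 1, 0),
  age = c(1, 2, 1, 2, 1, 2),
  population = c(100, 123, 189, 234, 221, 435),
  accidents = c(87, 98, 79, 43, 45, 65)
)

# Assumption: population holds non-negative whole numbers.
result <- df %>%
  mutate(
    pop1 = population %/% 10,  # integer division drops the last digit
    pop2 = population %% 10,   # remainder keeps only the last digit
    .after = population        # place the new columns where population was
  ) %>%
  select(-population)

print(result)
```

With this approach pop1 and pop2 stay doubles rather than being converted to integers, and the split always takes the final digit regardless of how many digits the value has.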
CodePudding user response:
Another solution, based on tidyr::extract() (note that pop1 and pop2 come back as character columns):
library(tidyr)
df %>%
extract(population,
into = c("pop1", "pop2"),
regex = "(\\d\\d)(\\d)")
# A tibble: 6 × 6
city sex age pop1 pop2 accidents
<chr> <dbl> <dbl> <chr> <chr> <dbl>
1 a 1 1 10 0 87
2 a 0 2 12 3 98
3 b 1 1 18 9 79
4 b 0 2 23 4 43
5 c 1 1 22 1 45
6 c 0 2 43 5 65
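If numeric columns are wanted, as in the expected output, extract() also accepts a convert argument. A minimal sketch reusing the question's data; `result` is an illustrative name, not part of the original answer:

```r
library(tidyr)

# Data from the question
df <- dplyr::tibble(
  city = c("a", "a", "b", "b", "c", "c"),
  sex = c(1, 0, 1, 0, 1, 0),
  age = c(1, 2, 1, 2, 1, 2),
  population = c(100, 123, 189, 234, 221, 435),
  accidents = c(87, 98, 79, 43, 45, 65)
)

# Same regex as above: two digits into pop1, one digit into pop2.
# convert = TRUE runs type conversion on the new columns,
# so they become integers instead of character strings.
result <- df %>%
  extract(population,
          into = c("pop1", "pop2"),
          regex = "(\\d\\d)(\\d)",
          convert = TRUE)

print(result)
```

The pattern still assumes every population value has exactly three digits; values of a different length would not match and would yield NA.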