I have a data frame like this:
df <- data.frame(quadrant = rep(2:3, each = 3, times = 1),
ICD = c("S06|D11|A41|O34|O48",
"R55|A08|K40",
"N23|F13|K80|F19",
"R13|C18",
"E13|F19",
"D11|A41|N23|K80"))
I want to split up the string variable, generate a new row for each ICD while repeating the quadrant identifier.
Does somebody know a way how to handle this via dplyr::mutate
?
CodePudding user response:
You can use the following code:
df <- data.frame(quadrant = rep(2:3, each = 3, times = 1),
ICD = c("S06|D11|A41|O34|O48",
"R55|A08|K40",
"N23|F13|K80|F19",
"R13|C18",
"E13|F19",
"D11|A41|N23|K80"))
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.1.2
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
#> Warning: package 'tidyr' was built under R version 4.1.2
df %>%
dplyr::mutate(ICD = strsplit(as.character(ICD), "|", fixed = TRUE)) %>%
unnest(ICD)
#> # A tibble: 20 × 2
#> quadrant ICD
#> <int> <chr>
#> 1 2 S06
#> 2 2 D11
#> 3 2 A41
#> 4 2 O34
#> 5 2 O48
#> 6 2 R55
#> 7 2 A08
#> 8 2 K40
#> 9 2 N23
#> 10 2 F13
#> 11 2 K80
#> 12 2 F19
#> 13 3 R13
#> 14 3 C18
#> 15 3 E13
#> 16 3 F19
#> 17 3 D11
#> 18 3 A41
#> 19 3 N23
#> 20 3 K80
Created on 2022-06-30 by the reprex package (v2.0.1)
CodePudding user response:
I would use separate_rows
from tidyr
:
library(tidyr)
df |>
separate_rows(ICD, sep = "\\|")