How to you reshape a string delimited by a tab or space into a long format? The string (called label
here) can be of different lengths.
I have this
var label
1 work 100 101
2 sleep 500 409 200
and I want this
var code
1 work 100
2 work 101
3 sleep 500
4 sleep 409
5 sleep 200
# data
df = data.frame(var = c("work", 'sleep'), label = c('100 101', '500 409 200'))
CodePudding user response:
library(tidyr)
df %>%
separate_rows(label)
# A tibble: 5 x 2
var label
<chr> <chr>
1 work 100
2 work 101
3 sleep 500
4 sleep 409
5 sleep 200
CodePudding user response:
Using strsplit
in Map
Map(cbind, df$var, strsplit(df$label, ' ')) |> do.call(what=rbind.data.frame)
# V1 V2
# work.1 work 100
# work.2 work 101
# sleep.1 sleep 500
# sleep.2 sleep 409
# sleep.3 sleep 200
or in by
.
by(df, rev(df$var), \(x) with(x, cbind(var, code=el(strsplit(label, split=' '))))) |>
do.call(what=rbind.data.frame)
# var code
# sleep.1 work 100
# sleep.2 work 101
# work.1 sleep 500
# work.2 sleep 409
# work.3 sleep 200
CodePudding user response:
A great answer was already posted. But let's say you had a strange delimiter, like this:
df = data.frame(var = c("work", 'sleep'), label = c('100-gh-101', '500-gh-409-gh-200'))
In that case, you could use regex and unnest()
:
df %>%
mutate(label2 = strsplit(label, "-gh-")) %>%
unnest(label2)
var label label2
<chr> <chr> <chr>
1 work 100--101 100
2 work 100--101 101
3 sleep 500-gh-409-gh-200 500
4 sleep 500-gh-409-gh-200 409
5 sleep 500-gh-409-gh-200 200