Home > Software design >  Transforming a dataframe R by colum start and end
Transforming a dataframe R by colum start and end

Time:01-04

I have a dataframe as follows.

> df
  condition duration start end
1         A        2     3   4
2         B        3     8  10
3         A        2     7   8

I want to transform the dataframe into a table that gives me a 0 or 1 for each position for each condition. The conditions can overlap at the start and end and occur multiple times; however, the multiple occurrences of a type of condition don't.

It should look like something as follows

> df2
   count A B
1      1 0 0
2      2 0 0
3      3 1 0
4      4 1 0
5      5 0 0
6      6 0 0
7      7 1 0
8      8 1 1
9      9 0 1
10    10 0 1

CodePudding user response:

library(dplyr)
library(tidyr)
df %>% 
  mutate(count = mapply(`:`, start, end), .keep = "unused") %>% 
  unnest(count) %>% 
  pivot_wider(-duration, names_from = condition, values_from = condition, values_fn = length, values_fill = 0) %>% 
  complete(count = seq(count), fill = list(A = 0, B = 0)) %>% 
  arrange(count)

output

# A tibble: 10 × 3
   count     A     B
   <int> <int> <int>
 1     1     0     0
 2     2     0     0
 3     3     1     0
 4     4     1     0
 5     5     0     0
 6     6     0     0
 7     7     1     0
 8     8     1     1
 9     9     0     1
10    10     0     1

CodePudding user response:

library(dplyr); library(tidyr)
df |>
  uncount(duration, .id = "copy") |>
  mutate(row = start   copy - 1) |>
  count(condition, row) |>
  complete(condition, row = 1:max(row), fill = list(n = 0)) |>
  pivot_wider(names_from = condition, values_from = n) 

Result

# A tibble: 10 × 3
     row     A     B
   <dbl> <int> <int>
 1     1     0     0
 2     2     0     0
 3     3     1     0
 4     4     1     0
 5     5     0     0
 6     6     0     0
 7     7     1     0
 8     8     1     1
 9     9     0     1
10    10     0     1
  • Related