Home > Back-end >  Slice based on multiple date ranges and multiple columns to formating a new dataframe with R
Slice based on multiple date ranges and multiple columns to formating a new dataframe with R

Time:11-17

Let's say I have a sample data as follow:

df <- structure(list(date = structure(c(18912, 18913, 18914, 18915, 
18916, 18917, 18918, 18919, 18920, 18921, 18922, 18923), class = "Date"), 
    value1 = c(1.015, NA, NA, 1.015, 1.015, 1.015, 1.015, 1.015, 
    1.015, 1.015, 1.015, 1.015), value2 = c(1.115668, 1.104622, 
    1.093685, 1.082857, 1.072135, 1.06152, 1.05101, NA, NA, 1.0201, 
    1.01, 1), value3 = c(1.015, 1.030225, NA, NA, 1.077284, 1.093443, 
    1.109845, 1.126493, 1.14339, 1.160541, 1.177949, 1.195618
    )), row.names = c(NA, -12L), class = "data.frame")

and three date intervals:

date_range1 <- (date>='2021-10-12' & date<='2021-10-15')
date_range2 <- (date>='2021-10-16' & date<='2021-10-18')
date_range3 <- (date>='2021-10-21' & date<='2021-10-23')

I need to slice data from value1, value2 and value3 using date_range1, date_range2 and date_range3 respectively, and finally concatenate them to one column as follows:

enter image description here

Please note type 1, 2 and 3 are numbers to indicate date ranges: date_range1, date_range2 and date_range3.

How could I achieve that with R's packages? Thanks.

EDIT:

str(real_data)

Out:

tibble [1,537 x 5] (S3: tbl_df/tbl/data.frame)
 $ date                                                : chr [1:1537] "2008-01-31" "2008-02-29" "2008-03-31" "2008-04-30" ...
 $ value1: num [1:1537] 11.3 11.4 11.4 11.3 11.2 ...
 $ value2    : num [1:1537] 11.4 11.4 11.3 11.3 11.1 ...
 $ value3: num [1:1537] NA NA NA NA NA NA NA NA NA NA ...
 $ value4  : chr [1:1537] "11.60" "10.20" "12.55" "10.37" ...

CodePudding user response:

You may use use dplyr::case_when

library(dplyr)
df %>%
  mutate(type = case_when(
    date>='2021-10-12' & date<='2021-10-15' ~ 1,
    date>='2021-10-16' & date<='2021-10-18' ~ 2,
    date>='2021-10-21' & date<='2021-10-23' ~ 3,
    TRUE ~ NA_real_
  ),
  value = case_when(
    type == 1 ~ value1,
    type == 2 ~ value2,
    type == 3 ~ value3,
    TRUE ~ NA_real_
  ))  %>%
  select(date, value, type) %>%
  filter(!is.na(type))

         date    value type
1  2021-10-12 1.015000    1
2  2021-10-13       NA    1
3  2021-10-14       NA    1
4  2021-10-15 1.015000    1
5  2021-10-16 1.072135    2
6  2021-10-17 1.061520    2
7  2021-10-18 1.051010    2
8  2021-10-21 1.160541    3
9  2021-10-22 1.177949    3
10 2021-10-23 1.195618    3
  • Related