How to merge/stack observations by date in R

Time:10-18

I have a data frame like

date             X1 X2 X3
4/16/2019 0:00   1  2  3
4/16/2019 7:00   1  2  3
4/17/2019 0:00   1  2  3
4/17/2019 7:00   1  2  3

I would like to get

date        time     X1      X2      X3
4/16/2019   c(0,7)   c(1,1)  c(2,2)  c(3,3)
4/17/2019   c(0,7)   c(1,1)  c(2,2)  c(3,3)

where X1 is a list and X1[[1]] is a vector, that is c(1,1).

Is there an efficient way to achieve this? Thank you!

CodePudding user response:

Split the 'date' column into 'date' and 'time' columns at the whitespace (\\s+), group by 'date', then summarise across the remaining columns by wrapping each one in a list

library(dplyr)
library(tidyr)
library(stringr)
df1 %>%   
   separate(date, into = c('date', 'time'), sep = '\\s+') %>%
   mutate(time = as.numeric(str_replace(time, ":", "."))) %>%
   group_by(date) %>%
   summarise(across(everything(), ~ list(.)))

-output

# A tibble: 2 × 5
  date      time      X1        X2        X3       
  <chr>     <list>    <list>    <list>    <list>   
1 4/16/2019 <dbl [2]> <int [2]> <int [2]> <int [2]>
2 4/17/2019 <dbl [2]> <int [2]> <int [2]> <int [2]>

data

df1 <- structure(list(date = c("4/16/2019 0:00", "4/16/2019 7:00", 
"4/17/2019 0:00", 
"4/17/2019 7:00"), X1 = c(1L, 1L, 1L, 1L), X2 = c(2L, 2L, 2L, 
2L), X3 = c(3L, 3L, 3L, 3L)), 
class = "data.frame", row.names = c(NA, 
-4L))
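
To check that the result has the shape asked for in the question (each cell holding a vector, so that X1[[1]] is c(1, 1)), the pipeline above can be stored and its list columns indexed. This is only a minimal sketch, assuming the example data shown above and dplyr 1.0 or later (needed for across); the name res is purely illustrative.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Example data from the question
df1 <- data.frame(
  date = c("4/16/2019 0:00", "4/16/2019 7:00",
           "4/17/2019 0:00", "4/17/2019 7:00"),
  X1 = c(1L, 1L, 1L, 1L),
  X2 = c(2L, 2L, 2L, 2L),
  X3 = c(3L, 3L, 3L, 3L)
)

# Same steps as in the answer; `res` is an illustrative name
res <- df1 %>%
  separate(date, into = c("date", "time"), sep = "\\s+") %>%
  mutate(time = as.numeric(str_replace(time, ":", "."))) %>%
  group_by(date) %>%
  summarise(across(everything(), ~ list(.x)))

res$X1[[1]]    # the X1 values recorded on 4/16/2019, as an integer vector
res$time[[2]]  # the times recorded on 4/17/2019, as a numeric vector
```

Note that replacing ":" with "." is only safe for whole hours, as in the example data: a time such as "7:30" would become 7.3 rather than 7.5.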

CodePudding user response:

Here is an alternative way to do it. Logic:

  1. Separate the date and time into two columns (by a route other than separate(), which akrun's answer already uses)
  2. Group by date
  3. Summarise with across, using a list and a paste lambda (notice the .names argument in summarise)
  4. Use across again, with a paste0 lambda, to wrap each string in c( )

library(dplyr)
library(readr)
library(lubridate)  # provides mdy_hm() and hour()
df1 %>% 
  mutate(date = mdy_hm(date)) %>%   # parse the text into a date-time
  mutate(time = parse_number(sprintf("%d", hour(date))), .before=2,
         date = as.Date(date)) %>%  # keep only the calendar date
  group_by(date) %>% 
  summarise(across(everything(), list(~paste(.,collapse=",")), .names="{col}")) %>% 
  mutate(across(-date, ~paste0("c(",.,")")))

  date       time   X1     X2     X3    
  <date>     <chr>  <chr>  <chr>  <chr> 
1 2019-04-16 c(0,7) c(1,1) c(2,2) c(3,3)
2 2019-04-17 c(0,7) c(1,1) c(2,2) c(3,3)
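
The columns in the output above are character strings such as "c(1,1)", not vectors. If real vectors are wanted, as in the question, the same lubridate-based split can be combined with list() instead of paste(). This is a sketch under the assumption that the input looks like the example data; res2 is an illustrative name, not part of either answer.

```r
library(dplyr)
library(lubridate)

# Example data from the question
df1 <- data.frame(
  date = c("4/16/2019 0:00", "4/16/2019 7:00",
           "4/17/2019 0:00", "4/17/2019 7:00"),
  X1 = c(1L, 1L, 1L, 1L),
  X2 = c(2L, 2L, 2L, 2L),
  X3 = c(3L, 3L, 3L, 3L)
)

res2 <- df1 %>%
  mutate(date = mdy_hm(date),        # parse "month/day/year hour:minute" (UTC by default)
         time = hour(date),          # hour of the day as a number
         date = as.Date(date)) %>%   # keep only the calendar date
  relocate(time, .after = date) %>%  # put time next to date
  group_by(date) %>%
  summarise(across(everything(), ~ list(.x)))

res2$time[[1]]  # hours recorded on 2019-04-16
res2$X1[[1]]    # X1 values recorded on 2019-04-16
```

Unlike the string version, each cell here can be used directly in further computation, for example sapply(res2$X1, sum).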