I have a data frame like
date X1 X2 X3
4/16/2019 0:00 1 2 3
4/16/2019 7:00 1 2 3
4/17/2019 0:00 1 2 3
4/17/2019 7:00 1 2 3
I would like to get
date time X1 X2 X3
4/16/2019 c(0,7) c(1,1) c(2,2) c(3,3)
4/17/2019 c(0,7) c(1,1) c(2,2) c(3,3)
where X1 is a list and X1[[1]] is a vector, that is, c(1,1).
Is there an efficient way to achieve this? Thank you!
CodePudding user response:
Split the 'date' column into 'date' and 'time' columns at the space (\\s), group by 'date', then summarise across all the columns by wrapping each one in a list:
library(dplyr)
library(tidyr)
library(stringr)
df1 %>%
separate(date, into = c('date', 'time'), sep = '\\s+') %>%
mutate(time = as.numeric(str_replace(time, ":", "."))) %>%
group_by(date) %>%
summarise(across(everything(), ~ list(.)))
-output
# A tibble: 2 × 5
date time X1 X2 X3
<chr> <list> <list> <list> <list>
1 4/16/2019 <dbl [2]> <int [2]> <int [2]> <int [2]>
2 4/17/2019 <dbl [2]> <int [2]> <int [2]> <int [2]>
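For readers who want to see what the list columns contain, here is a minimal base R sketch of the same split-and-wrap idea. It is not the answer's dplyr pipeline; the names `day`, `hour`, `f` and `res` are made up for illustration, and the example data is repeated so the snippet runs on its own.

```r
# Example data, repeated here so the sketch is self-contained
df1 <- data.frame(
  date = c("4/16/2019 0:00", "4/16/2019 7:00",
           "4/17/2019 0:00", "4/17/2019 7:00"),
  X1 = c(1L, 1L, 1L, 1L),
  X2 = c(2L, 2L, 2L, 2L),
  X3 = c(3L, 3L, 3L, 3L)
)

# Split "date time" strings at whitespace into a day part and a time part
parts <- strsplit(df1$date, "\\s+")
day   <- vapply(parts, `[`, character(1), 1)
hour  <- as.numeric(sub(":", ".", vapply(parts, `[`, character(1), 2)))

# Keep the days in order of first appearance
f <- factor(day, levels = unique(day))

# One row per day; every other column becomes a list column
res <- data.frame(date = levels(f))
res$time <- unname(split(hour, f))
for (col in c("X1", "X2", "X3")) {
  res[[col]] <- unname(split(df1[[col]], f))
}

res$X1[[1]]    # the X1 values for the first date, as a plain vector
res$time[[2]]  # the time values for the second date
```

Each element of a list column is an ordinary vector, so `res$X1[[1]]` can be passed straight to functions such as `sum()` or `mean()`.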
data
df1 <- structure(list(date = c("4/16/2019 0:00", "4/16/2019 7:00",
"4/17/2019 0:00",
"4/17/2019 7:00"), X1 = c(1L, 1L, 1L, 1L), X2 = c(2L, 2L, 2L,
2L), X3 = c(3L, 3L, 3L, 3L)),
class = "data.frame", row.names = c(NA,
-4L))
CodePudding user response:
Here is an alternative way you could do it. Logic:
- separate the date and time columns (by a route other than separate, which akrun already provided)
- group
- summarise with across, using list and a lambda with paste (notice the .names argument in summarise)
- use across again with a lambda using paste0
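library(dplyr)
library(readr)
library(lubridate)  # provides mdy_hm() and hour()
# df1 is the example data defined in the previous answer
df1 %>%
mutate(date = mdy_hm(date)) %>%
# extract the hour as a number, then reduce date to the calendar day
mutate(time = parse_number(sprintf("%02d", hour(date))), .before=2,
date = as.Date(date)) %>%
group_by(date) %>%
summarise(across(everything(), list(~paste(.,collapse=",")), .names="{col}")) %>%
mutate(across(-date, ~paste0("c(",.,")")))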
date time X1 X2 X3
<date> <chr> <chr> <chr> <chr>
1 2019-04-16 c(0,7) c(1,1) c(2,2) c(3,3)
2 2019-04-17 c(0,7) c(1,1) c(2,2) c(3,3)