I have the following data example:
structure(list(name = c("2020-12-02 02_05_24.143926", "2020-12-02 04_05_44.370258",
"2020-12-02 08_06_25.214121", "2020-12-02 10_06_45.697784", "2020-12-02 14_07_25.747003",
"2020-12-02 16_07_46.002571", "2020-12-02 20_08_25.838364", "2020-12-02 22_08_45.705227",
"2020-12-03 02_09_25.384941", "2020-12-03 04_09_44.709639", "2020-12-03 08_10_23.097440",
"2020-12-03 10_10_42.111583", "2020-12-03 14_11_20.193122", "2020-12-03 16_11_39.252692",
"2020-12-03 20_12_17.340138", "2020-12-03 22_12_36.086608", "2020-12-04 02_15_27.387402",
"2020-12-04 04_15_46.375845", "2020-12-04 08_16_24.414194", "2020-12-04 10_16_43.215919",
"2020-12-31 10_06_26.083394", "2020-12-31 10_36_30.720992", "2020-12-31 14_07_03.081910",
"2020-12-31 14_37_07.718933", "2020-12-31 16_07_21.515981", "2020-12-31 16_37_26.054783",
"2020-12-31 20_07_58.646942", "2020-12-31 20_38_03.155509", "2020-12-31 22_08_17.181192",
"2020-12-31 22_38_21.847135", "2021-01-01 02_08_54.245043", "2021-01-01 02_38_58.905204",
"2021-01-01 04_09_13.055522", "2021-01-01 04_39_17.797032", "2021-01-01 08_09_50.080337",
"2021-01-01 08_39_54.646102", "2021-01-01 10_10_08.580802", "2021-01-01 10_40_13.262391",
"2021-01-01 14_10_45.513987", "2021-01-01 14_40_50.152527", "2021-01-01 16_11_03.966316",
"2021-01-01 16_41_08.595758", "2021-01-01 20_11_41.136895", "2021-01-01 20_41_45.807547",
"2021-01-01 22_11_59.897654", "2021-01-01 22_42_04.619130", "2021-01-02 02_12_37.503054",
"2021-01-02 02_42_42.155622", "2021-01-02 04_12_56.127958", "2021-01-02 04_43_00.807846",
"2021-01-02 08_13_33.280704")), row.names = c(NA, -51L), class = c("data.table",
"data.frame"))
This data consists of date-time strings (it is not necessary to parse them as dates and times). However, I would like to split it by specific dates, for example into one data.table with the values before 2020-12-31, one with the values between 2020-12-31 and 2021-01-01, and one with the values after 2021-01-01.
Thanks all
CodePudding user response:
One possible way to solve your problem (assuming the data is stored in df):
library(data.table)
breaks = as.Date(c("2020-12-31", "2021-01-01"))
split(df, findInterval(as.Date(substr(df$name, 1, 10)), breaks))
$`0`
name
<char>
1: 2020-12-02 02_05_24.143926
2: 2020-12-02 04_05_44.370258
3: 2020-12-02 08_06_25.214121
4: 2020-12-02 10_06_45.697784
5: 2020-12-02 14_07_25.747003
6: 2020-12-02 16_07_46.002571
7: 2020-12-02 20_08_25.838364
8: 2020-12-02 22_08_45.705227
9: 2020-12-03 02_09_25.384941
10: 2020-12-03 04_09_44.709639
11: 2020-12-03 08_10_23.097440
12: 2020-12-03 10_10_42.111583
13: 2020-12-03 14_11_20.193122
14: 2020-12-03 16_11_39.252692
15: 2020-12-03 20_12_17.340138
16: 2020-12-03 22_12_36.086608
17: 2020-12-04 02_15_27.387402
18: 2020-12-04 04_15_46.375845
19: 2020-12-04 08_16_24.414194
20: 2020-12-04 10_16_43.215919
name
$`1`
name
<char>
1: 2020-12-31 10_06_26.083394
2: 2020-12-31 10_36_30.720992
3: 2020-12-31 14_07_03.081910
4: 2020-12-31 14_37_07.718933
5: 2020-12-31 16_07_21.515981
6: 2020-12-31 16_37_26.054783
7: 2020-12-31 20_07_58.646942
8: 2020-12-31 20_38_03.155509
9: 2020-12-31 22_08_17.181192
10: 2020-12-31 22_38_21.847135
$`2`
name
<char>
1: 2021-01-01 02_08_54.245043
2: 2021-01-01 02_38_58.905204
3: 2021-01-01 04_09_13.055522
4: 2021-01-01 04_39_17.797032
5: 2021-01-01 08_09_50.080337
6: 2021-01-01 08_39_54.646102
7: 2021-01-01 10_10_08.580802
8: 2021-01-01 10_40_13.262391
9: 2021-01-01 14_10_45.513987
10: 2021-01-01 14_40_50.152527
11: 2021-01-01 16_11_03.966316
12: 2021-01-01 16_41_08.595758
13: 2021-01-01 20_11_41.136895
14: 2021-01-01 20_41_45.807547
15: 2021-01-01 22_11_59.897654
16: 2021-01-01 22_42_04.619130
17: 2021-01-02 02_12_37.503054
18: 2021-01-02 02_42_42.155622
19: 2021-01-02 04_12_56.127958
20: 2021-01-02 04_43_00.807846
21: 2021-01-02 08_13_33.280704
name
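The binning step in this answer can be checked on a small scale. The sketch below uses base R only; the three-element vector x (one timestamp per interval, taken from the question) and the labels are hypothetical, chosen just for illustration.

```r
# Hypothetical miniature of the data: one timestamp per interval.
x <- c("2020-12-02 02_05_24.143926",
       "2020-12-31 10_06_26.083394",
       "2021-01-01 02_08_54.245043")

breaks <- as.Date(c("2020-12-31", "2021-01-01"))

# The first 10 characters of each string hold the date part.
dates <- as.Date(substr(x, 1, 10))

# findInterval() counts how many breaks are <= each date,
# so the bins are 0 (before 2020-12-31), 1 (on 2020-12-31), 2 (2021-01-01 or later).
bin <- findInterval(dates, breaks)
bin
# [1] 0 1 2

# Optional: give the list elements readable names instead of "0", "1", "2".
labels <- c("before", "on_2020-12-31", "from_2021-01-01")
parts <- split(x, factor(bin, levels = 0:2, labels = labels))
names(parts)
```

Because each break belongs to the bin that starts at it, rows dated exactly 2020-12-31 end up in the middle group and rows dated 2021-01-01 in the last one, which matches the output shown above.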
CodePudding user response:
# DT is the data.table from the question
split(
  DT,
  DT[, fcase(name < "2020-12-31", 1, name <= "2021-01-01", 2, default = 3)]
)
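This approach relies on plain character comparison, which works here because every string starts with a year-month-day date. A small base R sketch of the boundary behaviour follows; the vector x is a hypothetical sample with one timestamp per group, and nested ifelse() stands in for fcase() so that no package is needed.

```r
# Hypothetical sample: one timestamp per expected group.
x <- c("2020-12-02 02_05_24.143926",
       "2020-12-31 10_06_26.083394",
       "2021-01-01 02_08_54.245043")

# A full timestamp is longer than the bare date it starts with, so it sorts
# after that date: "2021-01-01 02_08_..." is NOT <= "2021-01-01" and therefore
# falls through to the default group.
grp <- ifelse(x < "2020-12-31", 1,
       ifelse(x <= "2021-01-01", 2, 3))
grp
# [1] 1 2 3
```

So group 2 holds only the rows dated 2020-12-31, and everything from 2021-01-01 onwards lands in group 3. String comparison in R depends on the locale, so treating these strings as sortable is an assumption worth checking on your own setup.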
CodePudding user response:
lubridate is a helpful package for working with dates and times. After saving the given structure to a variable 'dt', the subsets can be generated as follows:
library(lubridate)
library(data.table)
setDT(dt)
dt[, datetime := ymd_hms(name)]
dt1 <- dt[datetime < ymd("2020-12-31")]
dt2 <- dt[datetime