Home > database >  How to create a subset for each day in time series data automatically
How to create a subset for each day in time series data automatically

Time:11-16

I have collected data from sensors and need to split the data into 1 day data frames automatically.

This is how my data looks like:

2   2021-10-20 20:17:14 151 -135.9  8.304   -339.5  8.175   23.13232    78.95514    97.10153
3   2021-10-20 20:27:15 152 -136.9  8.302   -340.6  8.175   23.89337    86.71063    98.07861
4   2021-10-20 20:37:15 153 -138.2  8.302   -340.5  8.177   23.00682    80.71004    96.15726
5   2021-10-20 20:47:16 154 -138.8  8.302   -341.0  8.176   23.76786    83.38557    98.30032 

I used tibbletime and dplyr to get a tbl_time object.

So far I was able to create a Subset for day1 and manually it would be easy to do it for day 2 and 3 etc. In the end I will have like a year of data, so this will be a pain in the a to do manually.

Here's the line of code I used:

day1<- filter_time(bio2_table, time_formula = '2021-10-20' ~ '2021-10-20')

I'm a r noob but I still want to believe there is a way that r does all the work itself.

Thanks!

CodePudding user response:

What data type is your date object?

The first step is to get the date format right, then you can group_split() to get a list of data frames, each of which represents a specific date.

If interested in this approach, you could look at the setNames() function to name each data frame.


data %>%
  mutate(date = as.Date(format(date, "%y-%m-%d")))%>%
  group_split(date)

CodePudding user response:

This returns a data frame with vectors as your grouped data:

t(aggregate( dat[,-2], by=list(dat$V2), list ))

        [,1]         [,2]         [,3]        
Group.1 "2021-10-20" "2021-10-21" "2021-10-22"
V1      Integer,4    6            7           
V3      Character,4  "20:47:16"   "20:47:16"  
V4      Integer,4    155          156         
V5      Numeric,4    -135.8       -131.8      
V6      Numeric,4    8.306        10.302      
V7      Numeric,4    -339         -347        
V8      Numeric,4    8.166        8.187       
V9      Numeric,4    25.76789     15.76786    
V10     Numeric,4    91.38557     87.38557    
V11     Numeric,4    102.3003     111.3003

# OR

aggregate( dat[,-2], by=list(dat$V2), list )

     Group.1         V1                                     V3
1 2021-10-20 2, 3, 4, 5 20:17:14, 20:27:15, 20:37:15, 20:47:16
2 2021-10-21          6                               20:47:16
3 2021-10-22          7                               20:47:16
                  V4                             V5                         V6
1 151, 152, 153, 154 -135.9, -136.9, -138.2, -138.8 8.304, 8.302, 8.302, 8.302
2                155                         -135.8                      8.306
3                156                         -131.8                     10.302
                              V7                         V8
1 -339.5, -340.6, -340.5, -341.0 8.175, 8.175, 8.177, 8.176
2                           -339                      8.166
3                           -347                      8.187
                                      V9                                    V10
1 23.13232, 23.89337, 23.00682, 23.76786 78.95514, 86.71063, 80.71004, 83.38557
2                               25.76789                               91.38557
3                               15.76786                               87.38557
                                     V11
1 97.10153, 98.07861, 96.15726, 98.30032
2                               102.3003
3                               111.3003

Data

dat <- structure(list(V1 = 2:7, V2 = c("2021-10-20", "2021-10-20", "2021-10-20", 
"2021-10-20", "2021-10-21", "2021-10-22"), V3 = c("20:17:14", 
"20:27:15", "20:37:15", "20:47:16", "20:47:16", "20:47:16"), 
    V4 = 151:156, V5 = c(-135.9, -136.9, -138.2, -138.8, -135.8, 
    -131.8), V6 = c(8.304, 8.302, 8.302, 8.302, 8.306, 10.302
    ), V7 = c(-339.5, -340.6, -340.5, -341, -339, -347), V8 = c(8.175, 
    8.175, 8.177, 8.176, 8.166, 8.187), V9 = c(23.13232, 23.89337, 
    23.00682, 23.76786, 25.76789, 15.76786), V10 = c(78.95514, 
    86.71063, 80.71004, 83.38557, 91.38557, 87.38557), V11 = c(97.10153, 
    98.07861, 96.15726, 98.30032, 102.30032, 111.30032)), class = "data.frame", row.names = c(NA, 
-6L))

dat
  V1         V2       V3  V4     V5     V6     V7    V8       V9      V10
1  2 2021-10-20 20:17:14 151 -135.9  8.304 -339.5 8.175 23.13232 78.95514
2  3 2021-10-20 20:27:15 152 -136.9  8.302 -340.6 8.175 23.89337 86.71063
3  4 2021-10-20 20:37:15 153 -138.2  8.302 -340.5 8.177 23.00682 80.71004
4  5 2021-10-20 20:47:16 154 -138.8  8.302 -341.0 8.176 23.76786 83.38557
5  6 2021-10-21 20:47:16 155 -135.8  8.306 -339.0 8.166 25.76789 91.38557
6  7 2021-10-22 20:47:16 156 -131.8 10.302 -347.0 8.187 15.76786 87.38557
        V11
1  97.10153
2  98.07861
3  96.15726
4  98.30032
5 102.30032
6 111.30032
  • Related