How to use repeated labels when grouping data using cut?

Time:03-07

I need to create a new column to categorize my experiment. The time series data is divided into 10-minute groups using cut. I want to add a column that cycles through three labels, say A, B, C, and then goes back to A.

library(tidyverse)
library(lubridate)

test <- tibble(date_time = seq.POSIXt(ymd_hms('2020-02-02 00:00:00'),
                                      ymd_hms('2020-02-02 01:00:00'),
                                      by = '30 sec'),
               cat = cut(date_time, breaks = '10 min'))

I want to get something like this

  date_time           cat                
  <dttm>              <fct>              
1 2020-02-02 00:00:00 A
2 2020-02-02 00:05:30 A
3 2020-02-02 00:10:00 B
4 2020-02-02 00:20:30 C
5 2020-02-02 00:30:00 A
6 2020-02-02 00:31:30 A

I have used the labels option in cut before with a known number of levels, but not like this. Any help is welcome.

CodePudding user response:

You could use cut(labels = FALSE) to get integer interval codes instead of factor labels, then use these as an index into the built-in LETTERS vector (or a vector of custom labels). Subtracting 1, taking the result modulo 3, and adding 1 back makes the index cycle through 1, 2, 3, so the labels cycle through A, B, and C:

library(tidyverse)
library(lubridate)

test<- tibble(date_time = seq.POSIXt(ymd_hms('2020-02-02 00:00:00'), 
                                     ymd_hms('2020-02-02 01:00:00'),
                                     by= '30 sec' ),
              cat_num = cut(date_time, breaks = '10 min', labels = FALSE),
              cat = LETTERS[((cat_num - 1) %% 3) + 1]
)

   date_time           cat_num cat  
   <dttm>                <int> <chr>
 1 2020-02-02 00:00:00       1 A    
 2 2020-02-02 00:00:30       1 A    
 3 2020-02-02 00:01:00       1 A    
 4 2020-02-02 00:01:30       1 A    
 5 2020-02-02 00:02:00       1 A    
 6 2020-02-02 00:02:30       1 A    
 7 2020-02-02 00:03:00       1 A    
 8 2020-02-02 00:03:30       1 A    
 9 2020-02-02 00:04:00       1 A    
10 2020-02-02 00:04:30       1 A   
...48 more rows...
59 2020-02-02 00:29:00       3 C    
60 2020-02-02 00:29:30       3 C    
61 2020-02-02 00:30:00       4 A    
62 2020-02-02 00:30:30       4 A    
...59 more rows...
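
If you want custom labels and a factor column (as in the desired output in the question), the same arithmetic should carry over. Below is a minimal base R sketch of that idea. The label names and the variables custom_labels and cat_label are made up for illustration, and using length(custom_labels) in place of the hard-coded 3 is my own assumption about how you would generalise it:

```r
# Base R only. The label names below are hypothetical placeholders.
date_time <- seq(as.POSIXct("2020-02-02 00:00:00", tz = "UTC"),
                 as.POSIXct("2020-02-02 01:00:00", tz = "UTC"),
                 by = "30 sec")

custom_labels <- c("control", "treatment", "washout")

# Integer code of the 10-minute interval each timestamp falls in: 1, 2, 3, ...
cat_num <- cut(date_time, breaks = "10 min", labels = FALSE)

# Shift to 0-based, wrap with modulo, shift back to 1-based for R indexing
idx <- (cat_num - 1) %% length(custom_labels) + 1

# Store as a factor so the level order follows custom_labels
cat_label <- factor(custom_labels[idx], levels = custom_labels)

test <- data.frame(date_time, cat_num, cat = cat_label)
```

Because the modulus comes from length(custom_labels), adding or removing a label changes the cycle length without touching the indexing line.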