Grouping rows of a dataframe by day in a list of dataframes

Time:05-26

I have the following data frame:

>dput(x)
structure(list(id = c("gii3myemeqez57e", "g4k42mvj1xo7x6v", 
"gii3myemeqez57e", "gii3myemeqez57e", "qwfw4nta0j5yf6x", "n8ygk9pdlxjvxh1", 
"gii3myemeqez57e", "w72qyv1ybi1sf9n", "w72qyv1ybi1sf9n", "n8ygk9pdlxjvxh1"
), p = c("ilcaffe.tv", "youtube.com", "ilcaffe.tv", 
"ilcaffe.tv", "il24.it", "tecnicadellascuola.it", "ilcaffe.tv", 
"change.org", "change.org", "tecnicadellascuola.it"), z = c("2018-02-18 17:50:00.000", 
"2018-02-07 10:20:00.000", "2018-02-18 17:50:00.000", "2018-02-18 17:50:00.000", 
"2018-03-03 21:50:00.000", "2018-02-20 00:00:00.000", "2018-02-18 17:50:00.000", 
"2018-02-22 08:30:00.000", "2018-02-22 08:30:00.000", "2018-02-20 00:00:00.000"
)), row.names = c(NA, 10L), class = "data.frame")

which prints as:

    id               p                      z
1   gii3myemeqez57e  ilcaffe.tv             2018-02-18 17:50:00.000
2   g4k42mvj1xo7x6v  youtube.com            2018-02-07 10:20:00.000
3   gii3myemeqez57e  ilcaffe.tv             2018-02-18 17:50:00.000
4   gii3myemeqez57e  ilcaffe.tv             2018-02-18 17:50:00.000
5   qwfw4nta0j5yf6x  il24.it                2018-03-03 21:50:00.000
6   n8ygk9pdlxjvxh1  tecnicadellascuola.it  2018-02-20 00:00:00.000
7   gii3myemeqez57e  ilcaffe.tv             2018-02-18 17:50:00.000
8   w72qyv1ybi1sf9n  change.org             2018-02-22 08:30:00.000
9   w72qyv1ybi1sf9n  change.org             2018-02-22 08:30:00.000
10  n8ygk9pdlxjvxh1  tecnicadellascuola.it  2018-02-20 00:00:00.000

I would like to split this data frame into a list of data frames, grouping together the rows that share the same day in column z. For example, rows 1, 3, 4 and 7 should form one data frame, row 2 should form another, and so on.

CodePudding user response:

x <- structure(list(id = c("gii3myemeqez57e", "g4k42mvj1xo7x6v", 
                           "gii3myemeqez57e", "gii3myemeqez57e", "qwfw4nta0j5yf6x", "n8ygk9pdlxjvxh1", 
                           "gii3myemeqez57e", "w72qyv1ybi1sf9n", "w72qyv1ybi1sf9n", "n8ygk9pdlxjvxh1"), 
                    p = c("ilcaffe.tv", "youtube.com", "ilcaffe.tv", 
                          "ilcaffe.tv", "il24.it", "tecnicadellascuola.it", "ilcaffe.tv", 
                          "change.org", "change.org", "tecnicadellascuola.it"), 
                    z = c("2018-02-18 17:50:00.000", 
                          "2018-02-07 10:20:00.000", "2018-02-18 17:50:00.000", "2018-02-18 17:50:00.000", 
                          "2018-03-03 21:50:00.000", "2018-02-20 00:00:00.000", "2018-02-18 17:50:00.000", 
                          "2018-02-22 08:30:00.000", "2018-02-22 08:30:00.000", "2018-02-20 00:00:00.000"
                    )), row.names = c(NA, 10L), class = "data.frame")

# as.Date() drops the time part of z; group_split() returns one tibble per day
dplyr::group_split(x, day = as.Date(z))
#> <list_of<
#>   tbl_df<
#>     id : character
#>     p  : character
#>     z  : character
#>     day: date
#>   >
#> >[5]>
#> [[1]]
#> # A tibble: 1 × 4
#>   id              p           z                       day       
#>   <chr>           <chr>       <chr>                   <date>    
#> 1 g4k42mvj1xo7x6v youtube.com 2018-02-07 10:20:00.000 2018-02-07
#> 
#> [[2]]
#> # A tibble: 4 × 4
#>   id              p          z                       day       
#>   <chr>           <chr>      <chr>                   <date>    
#> 1 gii3myemeqez57e ilcaffe.tv 2018-02-18 17:50:00.000 2018-02-18
#> 2 gii3myemeqez57e ilcaffe.tv 2018-02-18 17:50:00.000 2018-02-18
#> 3 gii3myemeqez57e ilcaffe.tv 2018-02-18 17:50:00.000 2018-02-18
#> 4 gii3myemeqez57e ilcaffe.tv 2018-02-18 17:50:00.000 2018-02-18
#> 
#> [[3]]
#> # A tibble: 2 × 4
#>   id              p                     z                       day       
#>   <chr>           <chr>                 <chr>                   <date>    
#> 1 n8ygk9pdlxjvxh1 tecnicadellascuola.it 2018-02-20 00:00:00.000 2018-02-20
#> 2 n8ygk9pdlxjvxh1 tecnicadellascuola.it 2018-02-20 00:00:00.000 2018-02-20
#> 
#> [[4]]
#> # A tibble: 2 × 4
#>   id              p          z                       day       
#>   <chr>           <chr>      <chr>                   <date>    
#> 1 w72qyv1ybi1sf9n change.org 2018-02-22 08:30:00.000 2018-02-22
#> 2 w72qyv1ybi1sf9n change.org 2018-02-22 08:30:00.000 2018-02-22
#> 
#> [[5]]
#> # A tibble: 1 × 4
#>   id              p       z                       day       
#>   <chr>           <chr>   <chr>                   <date>    
#> 1 qwfw4nta0j5yf6x il24.it 2018-03-03 21:50:00.000 2018-03-03

Created on 2022-05-25 by the reprex package (v2.0.1)
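
If you would rather not depend on dplyr, the same grouping can probably be done in base R with split(). The following is only a sketch: it rebuilds the question's data with data.frame() instead of the dput() output, and the name by_day is made up for illustration.

```r
# Same data as in the question, rebuilt compactly with data.frame().
x <- data.frame(
  id = c("gii3myemeqez57e", "g4k42mvj1xo7x6v", "gii3myemeqez57e",
         "gii3myemeqez57e", "qwfw4nta0j5yf6x", "n8ygk9pdlxjvxh1",
         "gii3myemeqez57e", "w72qyv1ybi1sf9n", "w72qyv1ybi1sf9n",
         "n8ygk9pdlxjvxh1"),
  p = c("ilcaffe.tv", "youtube.com", "ilcaffe.tv", "ilcaffe.tv", "il24.it",
        "tecnicadellascuola.it", "ilcaffe.tv", "change.org", "change.org",
        "tecnicadellascuola.it"),
  z = c("2018-02-18 17:50:00.000", "2018-02-07 10:20:00.000",
        "2018-02-18 17:50:00.000", "2018-02-18 17:50:00.000",
        "2018-03-03 21:50:00.000", "2018-02-20 00:00:00.000",
        "2018-02-18 17:50:00.000", "2018-02-22 08:30:00.000",
        "2018-02-22 08:30:00.000", "2018-02-20 00:00:00.000"),
  stringsAsFactors = FALSE
)

# as.Date() keeps only the calendar day of each timestamp string;
# split() then returns a named list with one data frame per distinct day.
by_day <- split(x, as.Date(x$z))

names(by_day)            # the five distinct days, in chronological order
by_day[["2018-02-18"]]   # original rows 1, 3, 4 and 7
```

Unlike group_split(), this returns plain data frames, keeps the original row names, adds no extra day column, and names each list element after its day, so a single group can be looked up by date string.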
