Calculate new values and reorganize data into a new data table in R

Time:12-18

I want to write an R script that reorganizes my data based on a condition. For each week (within each year), I want to count the contigs whose num_reads is greater than 0 and to sum num_reads. Please see the expected output below.

Here is the input data:

contig  num_reads   year    week
h1  67  2012    7
h2  75  2012    7
h3  0   2012    7
h1  3   2012    8
h2  0   2012    8
h3  55  2012    8
h1  32  2012    9
h2  7   2012    9
h3  4   2013    9
h1  67  2013    7
h2  75  2013    7
h3  83  2013    7
h1  3   2013    8
h2  0   2013    8
h3  30  2013    8
h1  32  2013    9
h2  7   2013    9
h3  0   2013    9
h1  67  2014    7
h2  75  2014    7
h3  43  2014    7
h1  3   2014    8
h2  0   2014    8
h3  55  2014    8
h1  32  2014    9
h2  7   2014    9
h3  0   2014    9

Expected output data:

year    week    count_contig    sum_num_reads
2012    7   2   142
2012    8   2   58
2012    9   3   43

and so on

CodePudding user response:

library(tidyverse)

df <- data.frame(
  stringsAsFactors = FALSE,
  contig = c("h1","h2","h3",
             "h1","h2","h3","h1","h2","h3","h1","h2","h3",
             "h1","h2","h3","h1","h2","h3","h1","h2",
             "h3","h1","h2","h3","h1","h2","h3"),
  num_reads = c(67L,75L,0L,3L,
                0L,55L,32L,7L,4L,67L,75L,83L,3L,0L,30L,
                32L,7L,0L,67L,75L,43L,3L,0L,55L,32L,7L,0L),
  year = c(2012L,2012L,2012L,
           2012L,2012L,2012L,2012L,2012L,2013L,2013L,
           2013L,2013L,2013L,2013L,2013L,2013L,2013L,
           2013L,2014L,2014L,2014L,2014L,2014L,2014L,
           2014L,2014L,2014L),
  week = c(7L,7L,7L,8L,8L,
           8L,9L,9L,9L,7L,7L,7L,8L,8L,8L,9L,9L,9L,
           7L,7L,7L,8L,8L,8L,9L,9L,9L)
)

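# For each (year, week) pair: count the contigs with at least one read
# and total the reads.
df %>% 
  group_by(year, week) %>% 
  summarise(count_contig = sum(num_reads > 0),  # each TRUE counts as 1
            sum_num_reads = sum(num_reads), .groups = "drop")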

#> # A tibble: 9 × 4
#>    year  week count_contig sum_num_reads
#>   <int> <int>        <int>         <int>
#> 1  2012     7            2           142
#> 2  2012     8            2            58
#> 3  2012     9            2            39
#> 4  2013     7            3           225
#> 5  2013     8            2            33
#> 6  2013     9            3            43
#> 7  2014     7            3           185
#> 8  2014     8            2            58
#> 9  2014     9            2            39
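
If loading the tidyverse is not an option, the same per-week summary can be sketched in base R with aggregate(). This is only a minimal sketch, not part of the original answer: it rebuilds the question's table with read.table() so that it runs on its own, and the names `counts` and `res` are made up for illustration.

```r
# Rebuild the input table from the question (whitespace-separated text).
df <- read.table(header = TRUE, text = "
contig num_reads year week
h1 67 2012 7
h2 75 2012 7
h3 0 2012 7
h1 3 2012 8
h2 0 2012 8
h3 55 2012 8
h1 32 2012 9
h2 7 2012 9
h3 4 2013 9
h1 67 2013 7
h2 75 2013 7
h3 83 2013 7
h1 3 2013 8
h2 0 2013 8
h3 30 2013 8
h1 32 2013 9
h2 7 2013 9
h3 0 2013 9
h1 67 2014 7
h2 75 2014 7
h3 43 2014 7
h1 3 2014 8
h2 0 2014 8
h3 55 2014 8
h1 32 2014 9
h2 7 2014 9
h3 0 2014 9
")

# Two columns to be summed per group (hypothetical helper name `counts`):
# a logical flag for "has at least one read" and the read count itself.
counts <- data.frame(
  count_contig  = df$num_reads > 0,
  sum_num_reads = df$num_reads
)

# Sum both columns within each (year, week) group; summing the logical
# column counts the TRUE values.
res <- aggregate(counts, by = list(year = df$year, week = df$week), FUN = sum)

# Order the rows by year, then week, for readability.
res <- res[order(res$year, res$week), ]
rownames(res) <- NULL
print(res)
```

For 2012, week 7 this gives 2 contigs and 142 reads, the same as the first row of the expected output in the question.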