Find maximum value within a year?

Time:11-24

My data look like this:

Dat:

         V1         V2
       1  5 1981-01-02
       2  1 1981-01-09
       3  2 1981-01-12
       4  2 1981-01-15
       5  4 1981-01-19
       6 17 1981-01-25

str(Dat)

   'data.frame': 2181 obs. of  2 variables:
   $ V1: num  5 1 2 2 4 17 7 10 1 1 ...
   $ V2: Date, format: "1981-01-02" "1981-01-09" ...

I want to extract the maximum value of V1 for each year.

I also want to count how many values in V1 exceed 10 in each year.

Data:

       dput(head(Dat))
       structure(list(V1 = c(5, 1, 2, 2, 4, 17), V2 = structure(c(4019, 
       4026, 4029, 4032, 4036, 4042), class = "Date")), row.names = c(NA, 
       6L), class = "data.frame")

Expected output:

        Year    max_value    number of values in V1 exceeding 10
        1981    .            .

CodePudding user response:

Something like this should work, using dplyr and lubridate: extract the year from V2, group by it, then summarise each group.

library(dplyr)
library(lubridate)

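Dat %>% 
  mutate(Year = year(V2)) %>%      # extract the year from the Date column
  group_by(Year) %>%               # one group per year
  summarise(Max = max(V1),         # largest V1 within each year
            Count = sum(V1 > 10))  # number of V1 values above 10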

Result (using your example data):

# A tibble: 1 × 3
   Year   Max Count
  <dbl> <dbl> <int>
1  1981    17     1
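
If you would rather not load any packages, the same per-year summary can probably be done in base R as well. This is only a sketch under assumptions: the data frame is rebuilt by hand from the dput output in the question, and the names `res`, `Year`, `Max` and `Count` are picked here to mirror the dplyr answer, they are not anything required.

```r
# Example data, rebuilt from the dput output in the question
Dat <- data.frame(
  V1 = c(5, 1, 2, 2, 4, 17),
  V2 = as.Date(c("1981-01-02", "1981-01-09", "1981-01-12",
                 "1981-01-15", "1981-01-19", "1981-01-25"))
)

# Extract the year from the Date column as an integer
Dat$Year <- as.integer(format(Dat$V2, "%Y"))

# One row per year: maximum of V1 and number of V1 values above 10.
# tapply() returns its groups in ascending order of Year, which matches
# sort(unique(Dat$Year)).
res <- data.frame(
  Year  = sort(unique(Dat$Year)),
  Max   = as.vector(tapply(Dat$V1, Dat$Year, max)),
  Count = as.vector(tapply(Dat$V1 > 10, Dat$Year, sum))
)
res
```

If V1 can contain missing values, you may want `max(..., na.rm = TRUE)` and `sum(..., na.rm = TRUE)` instead, since otherwise a single NA makes the result for that year NA.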