My data look like this:
Dat:
  V1         V2
1  5 1981-01-02
2  1 1981-01-09
3  2 1981-01-12
4  2 1981-01-15
5  4 1981-01-19
6 17 1981-01-25
str(Dat)
'data.frame': 2181 obs. of 2 variables:
$ V1: num 5 1 2 2 4 17 7 10 1 1 ...
$ V2: Date, format: "1981-01-02" "1981-01-09" ...
I want to extract the maximum value of V1 for each year.
I also want to count how many values of V1 exceed 10 in each year.
Data:
dput(head(Dat))
structure(list(V1 = c(5, 1, 2, 2, 4, 17), V2 = structure(c(4019,
4026, 4029, 4032, 4036, 4042), class = "Date")), row.names = c(NA,
6L), class = "data.frame")
Expected output:
Year  max_value  number_of_values_in_V1_exceeding_10
1981  .          .
Answer:
Something like this, using dplyr and lubridate.
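library(dplyr)
library(lubridate)
Dat %>%
  # pull the calendar year out of the Date column
  mutate(Year = year(V2)) %>%
  group_by(Year) %>%
  # per-year maximum, and number of V1 values strictly greater than 10
  summarise(Max = max(V1),
            Count = sum(V1 > 10))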
Result (using your example data):
# A tibble: 1 × 3
   Year   Max Count
  <dbl> <int> <int>
1  1981    17     1
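If you would rather not load dplyr and lubridate, the same grouping can probably be done in base R. This is only a minimal sketch, assuming the column names V1 and V2 from the question and no missing values in V1. The names Year, Max, Count and res are just illustrative choices.

```r
# Example data rebuilt from the dput() output in the question
Dat <- data.frame(
  V1 = c(5, 1, 2, 2, 4, 17),
  V2 = as.Date(c("1981-01-02", "1981-01-09", "1981-01-12",
                 "1981-01-15", "1981-01-19", "1981-01-25"))
)

# Extract the calendar year from the Date column
Dat$Year <- as.integer(format(Dat$V2, "%Y"))

# tapply() applies a function to V1 separately within each year;
# both calls use the same grouping, so the results line up by year
Max   <- tapply(Dat$V1, Dat$Year, max)
Count <- tapply(Dat$V1 > 10, Dat$Year, sum)  # TRUE counts as 1

res <- data.frame(
  Year  = as.integer(names(Max)),
  Max   = as.vector(Max),
  Count = as.vector(Count)
)
print(res)
#   Year Max Count
# 1 1981  17     1
```

If V1 can contain NA, you would likely want max(x, na.rm = TRUE) and sum(x, na.rm = TRUE) instead, otherwise any year with a missing value returns NA.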