Home > Back-end >  Excluding complete days without a specific value
Excluding complete days without a specific value

Time:04-27

I have data frame in R, which looks as follows:

quotedate days
2002-01-01 12
2002-01-01 22
2002-01-01 32
2002-01-01 42
2002-01-03 2
2002-01-03 5
2002-01-03 12
2002-01-03 21
2002-01-05 12
2002-01-05 212
2002-01-05 32
2002-01-05 142
2002-01-07 112
2002-01-07 122
2002-01-07 132
2002-01-07 142

How can I exclude days, which have not 30 days within their days' range.

In the above exapmle I want to keep 2002-01-01 and 2002-01-05.

Do you have any suggestions?

Thank you!

CodePudding user response:

You can use group_by and filter:

library(tidyverse)

df %>%
  group_by(quotedate) %>%
  filter(max(days) >= 30)

    quotedate days
1  2002-01-01   12
2  2002-01-01   22
3  2002-01-01   32
4  2002-01-01   42
5  2002-01-05   12
6  2002-01-05  212
7  2002-01-05   32
8  2002-01-05  142
9  2002-01-07  112
10 2002-01-07  122
11 2002-01-07  132
12 2002-01-07  142

CodePudding user response:

Creating a boolean subset using max in ave.

dat[with(dat, ave(days, quotedate, FUN=max) >= 30), ]
#     quotedate days
# 1  2002-01-01   12
# 2  2002-01-01   22
# 3  2002-01-01   32
# 4  2002-01-01   42
# 9  2002-01-05   12
# 10 2002-01-05  212
# 11 2002-01-05   32
# 12 2002-01-05  142
# 13 2002-01-07  112
# 14 2002-01-07  122
# 15 2002-01-07  132
# 16 2002-01-07  142 

Data:

dat <- structure(list(quotedate = c("2002-01-01", "2002-01-01", "2002-01-01", 
"2002-01-01", "2002-01-03", "2002-01-03", "2002-01-03", "2002-01-03", 
"2002-01-05", "2002-01-05", "2002-01-05", "2002-01-05", "2002-01-07", 
"2002-01-07", "2002-01-07", "2002-01-07"), days = c(12L, 22L, 
32L, 42L, 2L, 5L, 12L, 21L, 12L, 212L, 32L, 142L, 112L, 122L, 
132L, 142L)), class = "data.frame", row.names = c(NA, -16L))
  •  Tags:  
  • r
  • Related