Home > database >  How to determine number of values outside of range by group
How to determine number of values outside of range by group

Time:04-28

I am having trouble calculating the number of values at Site "Out" in a that fall outside the range of values at each Plot at Site "In". So, how many values of Plot C are < b$Min or > b$Max of Plot A and Plot B. I need a separate value for each "In" Plot.

a <- data.frame (Site = c('Out','Out','Out','In','In','In','In','In','In'),
                 Plot = c('C','C','C','A','A','A','B','B','B'),
                 Value = c(.120,.400,.700,.144,.500,.688,.102,.500,.678))

b <- data.frame(Site = c('Out','In','In'),
               Plot = c('C','A','B'),
               Min = c(.120,.144,.102),
               Max = c(.700,.688,.678))

The desired output would be an additional column in b with the number of Plot C values outside the range.

|Site|Plot|Min |Max |N|
|----|----|----|----|-| 
|Out |C   |.120|.700|0|   
|In  |A   |.144|.688|2|      
|In  |B   |.102|.678|1|

CodePudding user response:

You can use the following approach:

inner_join(a %>% filter(Site=="Out") %>% mutate(Site="In"),
           b %>% filter(Site=="In"), by="Site") %>% 
  filter(data.table::between(Value, Min, Max)) %>% 
  count(Plot.x, Plot.y)

Output:

  Plot.x Plot.y     n
  <chr>  <chr>  <int>
1 C      A          1
2 C      B          2

If you don't want to use data.table::between(), you may use dplyr::between(), but rowwise()

  • Related