Apply an equation to a data frame, grouped by date, in R

Time:10-03

This is what my data looks like:

structure(list(ID = c("h01", "h01", "h01", "h01", "h01"), collection_date = structure(c(15076, 
15076, 15092, 15092, 15125), class = "Date"), wavelength = c(630L, 
800L, 630L, 800L, 630L), R = c(0.078, 0.295, 0.108, 0.361, 0.127
)), row.names = c(NA, 5L), class = "data.frame")

What I want to do (but have no idea how) is apply the following equation: "(R at wavelength 800 - R at wavelength 630) / (R at wavelength 800 + R at wavelength 630)" for each individual collection_date, and return the result in a separate data frame.

Any help will be much appreciated.

CodePudding user response:

We arrange the data by 'ID', 'collection_date' and 'wavelength', then group by 'ID' and 'collection_date', take the diff of 'R' and divide it by the sum of 'R' (assuming only the 800 and 630 wavelengths are present). If a group has only a single observation, NA is returned.

library(dplyr)
df1 %>%
     filter(wavelength %in% c(800, 630)) %>% # in case there are other wavelengths
     arrange(ID, collection_date, wavelength) %>%
     group_by(ID, collection_date) %>% 
     summarise(new = if(n() == 1) NA_real_ else diff(R)/sum(R), .groups = 'drop')

-output

# A tibble: 3 × 3
  ID    collection_date    new
  <chr> <date>           <dbl>
1 h01   2011-04-12       0.582
2 h01   2011-04-28       0.539
3 h01   2011-05-31      NA    
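
To see why the arrange step matters, here is a minimal base R check on the first collection date only. The values are copied from the question's data; the variable names (R_sorted, index, unsorted) are made up for this illustration. Once the rows are sorted by wavelength, diff(R) is R at 800 minus R at 630, so diff(R)/sum(R) matches the requested formula.

```r
# Rows for 2011-04-12 from the question, deliberately given in reverse order
wavelength <- c(800L, 630L)
R <- c(0.295, 0.078)

# Sort by wavelength so that 630 comes first, as arrange() does in the pipeline
R_sorted <- R[order(wavelength)]

# diff() gives R[800] - R[630]; sum() gives R[800] + R[630]
index <- diff(R_sorted) / sum(R_sorted)
round(index, 3)  # 0.582, the first row of the output above

# Without sorting, diff() subtracts the other way round and the sign flips
unsorted <- diff(R) / sum(R)
```

This is only a sanity check of the arithmetic, not a replacement for the grouped pipeline.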

CodePudding user response:

You can write a function that performs the calculation with the help of match, and apply it to each ID and collection_date.

library(dplyr)

result_calc <- function(wave, R) {
  r1 <- R[match(630, wave)]
  r2 <- R[match(800, wave)]
  (r2 - r1)/(r2 + r1)
}

df %>%
  group_by(ID, collection_date) %>%
  summarise(result = result_calc(wavelength, R), .groups = 'drop')

#   ID    collection_date result
#  <chr> <date>           <dbl>
#1 h01   2011-04-12       0.582
#2 h01   2011-04-28       0.539
#3 h01   2011-05-31      NA    
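
As a quick check of how the match-based helper behaves, the sketch below calls it directly on plain vectors. The function is restated so the snippet runs on its own; the names both and missing_800 are only illustrative. Because match returns NA when a wavelength is absent, indexing R with it yields NA, and the result is NA without any explicit test for group size.

```r
# Same helper as above, restated so the snippet is self-contained
result_calc <- function(wave, R) {
  r1 <- R[match(630, wave)]  # NA if wavelength 630 is absent
  r2 <- R[match(800, wave)]  # NA if wavelength 800 is absent
  (r2 - r1) / (r2 + r1)
}

# Both wavelengths present; row order does not matter because match() looks them up
both <- result_calc(c(800L, 630L), c(0.295, 0.078))

# Only 630 present, as on 2011-05-31 in the question's data
missing_800 <- result_calc(630L, 0.127)
```

Unlike the diff-based approach, this version does not depend on the rows being sorted by wavelength.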