I am trying to nest if and/or ifelse statements in R and I can't quite get the syntax right. I would like to do some basic arithmetic on a dataset like the one below: for rows with the same date and the same location code, I would like to subtract the pH value for code F from the pH values for codes A and B, and enter the result into pHDelta. Written out, that is A - F (and B - F) for a given date and location.
Thank you!
I am not certain the dataset that I created below will appear correctly; please pardon me if it does not.
My dataset is similar to the following:
Date | Location | Code | pH | pHDelta |
---|---|---|---|---|
22/07/01 | AA | A | 7.1 | |
22/07/01 | AA | B | 6.8 | |
22/07/01 | AA | F | 8.2 | |
22/07/01 | AB | A | 7.2 | |
22/07/01 | AB | B | 7.8 | |
22/07/01 | AB | F | 8.4 | |
22/07/01 | AC | A | 7.5 | |
22/07/01 | AC | B | 6.2 | |
22/07/01 | AC | F | 8.3 | |
22/07/01 | AD | A | 7.1 | |
22/07/01 | AD | B | 6.8 | |
22/07/01 | AD | F | 8.2 | |
22/07/02 | AA | A | 7.1 | |
22/07/02 | AA | B | 6.8 | |
22/07/02 | AA | F | 8.2 | |
22/07/02 | AB | A | 7.2 | |
22/07/02 | AB | B | 7.8 | |
22/07/02 | AB | F | 8.4 | |
22/07/02 | AC | A | 7.5 | |
22/07/02 | AC | B | 6.2 | |
22/07/02 | AC | F | 8.3 | |
22/07/02 | AD | A | 7.1 | |
22/07/02 | AD | B | 6.8 | |
22/07/02 | AD | F | 8.2 | |
CodePudding user response:
We can use a grouped approach: group by 'Date' and 'Location', take the 'pH' where 'Code' is "F" (assuming exactly one "F" row per 'Date' and 'Location' group), and subtract it from every 'pH' value in that group.
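library(dplyr)
# df1 is the data frame from the question
df1 <- df1 %>%
  group_by(Date, Location) %>%
  # subtract the group's "F" pH from every pH in the same Date/Location group
  mutate(pHDelta = pH - pH[Code == "F"]) %>%
  ungroup()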
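If some Date/Location groups might lack an "F" row, or contain more than one, `pH[Code == "F"]` no longer yields a single reference value. One possible variation, offered only as a sketch, is to index with `match()`, which returns the position of the first "F" or `NA` when there is none. The small data frame below is made up for illustration (group AB deliberately has no "F" row), and `res` is just an illustrative name.

```r
library(dplyr)

# Made-up sample for illustration: group AB has no "F" row on purpose
df1 <- data.frame(
  Date     = c("22/07/01", "22/07/01", "22/07/01", "22/07/01", "22/07/01"),
  Location = c("AA", "AA", "AA", "AB", "AB"),
  Code     = c("A", "B", "F", "A", "B"),
  pH       = c(7.1, 6.8, 8.2, 7.2, 7.8)
)

res <- df1 %>%
  group_by(Date, Location) %>%
  # match("F", Code) is the position of the first "F" in the group, or NA if
  # the group has none, so pH[match("F", Code)] is always a single value
  mutate(pHDelta = pH - pH[match("F", Code)]) %>%
  ungroup()

print(res)
```

Groups without an "F" row get `NA` in pHDelta here instead of stopping the pipeline; whether that is the desired behaviour depends on the data.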
CodePudding user response:
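Another approach: in this data every 'Date' and 'Location' group contains the same codes and "F" sorts last alphabetically, so we could also sort by 'Code' within each group and subtract the last 'pH' value: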
library(dplyr)
df %>%
  group_by(Date, Location) %>%
  arrange(Code, .by_group = TRUE) %>%
  mutate(pHDelta = pH - last(pH)) %>%
  ungroup()
   Date     Location Code     pH pHDelta
   <chr>    <chr>    <chr> <dbl>   <dbl>
 1 22/07/01 AA       A       7.1  -1.1
 2 22/07/01 AA       B       6.8  -1.4
 3 22/07/01 AA       F       8.2   0
 4 22/07/01 AB       A       7.2  -1.2
 5 22/07/01 AB       B       7.8  -0.600
 6 22/07/01 AB       F       8.4   0
 7 22/07/01 AC       A       7.5  -0.800
 8 22/07/01 AC       B       6.2  -2.1
 9 22/07/01 AC       F       8.3   0
10 22/07/01 AD       A       7.1  -1.1
# ... with 14 more rows
data:
structure(list(Date = c("22/07/01", "22/07/01", "22/07/01", "22/07/01",
"22/07/01", "22/07/01", "22/07/01", "22/07/01", "22/07/01", "22/07/01",
"22/07/01", "22/07/01", "22/07/02", "22/07/02", "22/07/02", "22/07/02",
"22/07/02", "22/07/02", "22/07/02", "22/07/02", "22/07/02", "22/07/02",
"22/07/02", "22/07/02"), Location = c("AA", "AA", "AA", "AB",
"AB", "AB", "AC", "AC", "AC", "AD", "AD", "AD", "AA", "AA", "AA",
"AB", "AB", "AB", "AC", "AC", "AC", "AD", "AD", "AD"), Code = c("A",
"B", "F", "A", "B", "F", "A", "B", "F", "A", "B", "F", "A", "B",
"F", "A", "B", "F", "A", "B", "F", "A", "B", "F"), pH = c(7.1,
6.8, 8.2, 7.2, 7.8, 8.4, 7.5, 6.2, 8.3, 7.1, 6.8, 8.2, 7.1, 6.8,
8.2, 7.2, 7.8, 8.4, 7.5, 6.2, 8.3, 7.1, 6.8, 8.2)), class = "data.frame", row.names = c(NA,
-24L))
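For anyone who would rather avoid dplyr, the same per-group subtraction can probably be done in base R by joining the "F" rows back onto the data with `merge()`. This is only a sketch under the same assumption of one "F" row per Date and Location; the shortened data frame and the names `ref`, `pH_F` and `out` are made up for illustration.

```r
# Shortened version of the data, for illustration only
df <- data.frame(
  Date     = rep(c("22/07/01", "22/07/02"), each = 6),
  Location = rep(rep(c("AA", "AB"), each = 3), times = 2),
  Code     = rep(c("A", "B", "F"), times = 4),
  pH       = c(7.1, 6.8, 8.2, 7.2, 7.8, 8.4,
               7.1, 6.8, 8.2, 7.2, 7.8, 8.4)
)

# Reference table: one row per Date/Location holding the "F" pH
ref <- df[df$Code == "F", c("Date", "Location", "pH")]
names(ref)[names(ref) == "pH"] <- "pH_F"

# Left join the reference pH onto every row, then subtract
out <- merge(df, ref, by = c("Date", "Location"), all.x = TRUE)
out$pHDelta <- out$pH - out$pH_F

print(out)
```

Note that `merge()` may reorder the rows, so sort the result afterwards if the original order matters.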