Each ID has multiple rows, and I'd like to create a column that tells me whether any of the runs (rows within that ID) contains "orange".
If "orange" does not appear on any of the rows for that ID, I'd like it to return "apple" instead.
I'm guessing it's something like
data_desired <- data %>%
group_by("ID") %>%
mutate(AnyOrange = ...)
but that's where I'm stuck. Sample data and desired outcome below:
library(tidyverse)
data <- tribble(
~ID, ~Run, ~Oranges,
#--/---/---
"a", 1, "orange",
"a", 2, "orange",
"b", 1, "apple",
"b", 2, "apple",
"b", 3, "orange",
"c", 1, "apple",
"c", 2, "apple"
)
# Desired outcome (stored under a separate name so the sample data is not overwritten)
data_desired <- tribble(
~ID, ~Run, ~Oranges, ~AnyOrange,
#--/---/---/---
"a", 1, "orange","orange",
"a", 2, "orange","orange",
"b", 1, "apple","orange",
"b", 2, "apple","orange",
"b", 3, "orange","orange",
"c", 1, "apple","apple",
"c", 2, "apple","apple"
)
CodePudding user response:
data %>%
group_by(ID) %>%
mutate(AnyOrange = ifelse(any(Oranges=='orange'), 'orange', Oranges))
# A tibble: 7 x 4
# Groups: ID [3]
ID Run Oranges AnyOrange
<chr> <dbl> <chr> <chr>
1 a 1 orange orange
2 a 2 orange orange
3 b 1 apple orange
4 b 2 apple orange
5 b 3 orange orange
6 c 1 apple apple
7 c 2 apple apple
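A minimal base R sketch of why this works, using plain vectors in place of the grouped column (the names oranges_b and oranges_c are made up for illustration). any() collapses each group to a single TRUE/FALSE, and ifelse() returns a result the same length as its test, so the result here has length 1 and mutate() recycles it across the group. One consequence worth knowing: when a whole vector is passed as the "no" argument, only its first element is used.

```r
# Stand-ins for the Oranges column within groups "b" and "c"
oranges_b <- c("apple", "apple", "orange")
oranges_c <- c("apple", "apple")

# any() reduces each group to one logical value
has_b <- any(oranges_b == "orange")
has_c <- any(oranges_c == "orange")

# The test has length 1, so ifelse() returns a length-1 result;
# for oranges_c only the first element of the "no" vector is returned
res_b <- ifelse(has_b, "orange", oranges_b)
res_c <- ifelse(has_c, "orange", oranges_c)

print(res_b)
print(res_c)
```

If a group could hold values other than "apple" and "orange", writing the literal "apple" as the third argument would match the stated requirement more directly.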
CodePudding user response:
Column names should be unquoted within tidyverse functions, i.e. group_by(ID) rather than group_by("ID").
After grouping by 'ID', use match to get the index of the first 'orange' value and use it to subset 'Oranges'
(this gives NA when the group has no 'orange'), then coalesce with the original 'Oranges' column
library(dplyr)
data %>%
group_by(ID) %>%
mutate(AnyOrange = coalesce(Oranges[match('orange', Oranges)], Oranges)) %>%
ungroup()
-output
# A tibble: 7 × 4
ID Run Oranges AnyOrange
<chr> <dbl> <chr> <chr>
1 a 1 orange orange
2 a 2 orange orange
3 b 1 apple orange
4 b 2 apple orange
5 b 3 orange orange
6 c 1 apple apple
7 c 2 apple apple
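The individual steps can be sketched on plain vectors, outside of mutate(). The vector names below are hypothetical stand-ins for one group's 'Oranges' values. In this sketch the picked value is repeated to full length with rep() before calling coalesce(); that is my own choice, so the example does not depend on how coalesce() recycles a length-1 input.

```r
library(dplyr)

# Stand-ins for one group's Oranges values
with_orange    <- c("apple", "apple", "orange")
without_orange <- c("apple", "apple")

# Step 1: match() gives the position of the first "orange", or NA if absent
idx_with    <- match("orange", with_orange)
idx_without <- match("orange", without_orange)

# Step 2: subsetting by that index yields "orange" or a missing value
picked_with    <- with_orange[idx_with]
picked_without <- without_orange[idx_without]

# Step 3: coalesce() keeps the picked value where it is not NA,
# and falls back to the original values otherwise
filled_with    <- coalesce(rep(picked_with, length(with_orange)), with_orange)
filled_without <- coalesce(rep(picked_without, length(without_orange)), without_orange)

print(filled_with)
print(filled_without)
```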
CodePudding user response:
Here is an alternative dplyr approach. It is similar to @onyambu's solution, but uses the %in% operator:
data %>%
group_by(ID) %>%
mutate(AnyOrange = ifelse("orange" %in% Oranges, "orange","apple"))
ID Run Oranges AnyOrange
<chr> <dbl> <chr> <chr>
1 a 1 orange orange
2 a 2 orange orange
3 b 1 apple orange
4 b 2 apple orange
5 b 3 orange orange
6 c 1 apple apple
7 c 2 apple apple
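One practical difference between the any(... == ...) form and the %in% form shows up when a group contains missing values. The sample data has none, so the vector below is a made-up case for illustration only.

```r
# Hypothetical group with a missing value and no "orange"
oranges_na <- c(NA, "apple")

# == propagates NA, so any() cannot decide and returns NA
test_eq <- any(oranges_na == "orange")

# %in% never returns NA
test_in <- "orange" %in% oranges_na

res_eq <- ifelse(test_eq, "orange", "apple")
res_in <- ifelse(test_in, "orange", "apple")

# any(..., na.rm = TRUE) ignores the missing value
test_eq_narm <- any(oranges_na == "orange", na.rm = TRUE)

print(res_eq)
print(res_in)
```

With this input the == version yields NA while the %in% version yields "apple"; passing na.rm = TRUE to any() brings the two into agreement.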