I have the following dataframe:
df
name direction to
<chr> <fct> <chr>
1 A -> B
2 A -> X
3 B -> X
4 B -> Y
5 C -> B
6 C -> Y
7 S -> T
8 T -> C
9 W -> Y
10 X -> W
11 Y NA NA
Step 1. I first want to subset the dataframe to only include rows that have X or Y in either the name or the to column.
df %>% dplyr::select(name,direction,to) %>% filter(name %in% c('X','Y') | to %in% c('X','Y'))
name direction to
<chr> <fct> <chr>
1 A -> X
2 B -> X
3 B -> Y
4 C -> Y
5 W -> Y
6 X -> W
7 Y NA NA
Step 2. From there, I want to get any other connections whose name matches one of the unique values in name from Step 1. For example, the unique values in name after Step 1 are A, B, C, W, X, Y. I want all observations in the original, unfiltered df where the name column holds one of these values. In this example, observations 1 (A->B) and 5 (C->B) from the original dataframe would be added to the subset.
Expected output:
name direction to
<chr> <fct> <chr>
1 A -> X
2 A -> B
3 B -> X
4 B -> Y
5 C -> B
6 C -> Y
7 W -> Y
8 X -> W
9 Y NA NA
Let me know if this doesn't make sense.
CodePudding user response:
I think this should work: save the Step 1 subset as dfTmp, then keep every row of df whose name appears in dfTmp$name.
df %>% dplyr::select(name,direction,to) %>% filter(name %in% c('X','Y') | to %in% c('X','Y')) -> dfTmp
df[df$name %in% (dfTmp$name),]
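The same two steps can be sketched in base R with the intermediate values named, which makes it easier to check each stage. This is only an illustration: the names targets, step1, keep_names and result are made up for this sketch, and the data is copied from the question.

```r
# Sample data copied from the question
df <- data.frame(
  name = c("A", "A", "B", "B", "C", "C", "S", "T", "W", "X", "Y"),
  direction = c(rep("->", 10), NA),
  to = c("B", "X", "X", "Y", "B", "Y", "T", "C", "Y", "W", NA),
  stringsAsFactors = FALSE
)

# Values to look for (hypothetical helper name)
targets <- c("X", "Y")

# Step 1: rows where either name or to is a target.
# NA %in% targets is FALSE, so the NA in row 11 causes no trouble.
step1 <- df[df$name %in% targets | df$to %in% targets, ]

# Unique names that survive Step 1
keep_names <- unique(step1$name)

# Step 2: every row of the original df whose name is one of those
result <- df[df$name %in% keep_names, ]
print(result)
```

Rows with name S or T are dropped because neither name appears in the Step 1 subset.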
CodePudding user response:
We can use if_any to check the 'name' and 'to' columns against X and Y, which returns one logical value per row. We subset 'name' with that logical vector and then keep every row whose 'name' is in the result, using %in%
library(dplyr)
df %>%
  filter(name %in% name[if_any(c(name, to), ~ . %in% c('X', 'Y'))]) %>%
  as_tibble
-output
# A tibble: 9 × 3
name direction to
<chr> <chr> <chr>
1 A -> B
2 A -> X
3 B -> X
4 B -> Y
5 C -> B
6 C -> Y
7 W -> Y
8 X -> W
9 Y <NA> <NA>
data
df <- structure(list(name = c("A", "A", "B", "B", "C", "C", "S", "T",
"W", "X", "Y"), direction = c("->", "->", "->", "->", "->", "->",
"->", "->", "->", "->", NA), to = c("B", "X", "X", "Y", "B",
"Y", "T", "C", "Y", "W", NA)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))
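To see what the if_any step in the answer above computes, it may help to store its result as a column before filtering. This is a rough sketch that assumes dplyr 1.0.4 or later (where if_any is available); the column name hit and the objects flagged and kept are hypothetical names used only here.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  name = c("A", "A", "B", "B", "C", "C", "S", "T", "W", "X", "Y"),
  direction = c(rep("->", 10), NA),
  to = c("B", "X", "X", "Y", "B", "Y", "T", "C", "Y", "W", NA),
  stringsAsFactors = FALSE
)

# hit is TRUE when name or to is X or Y (this is the Step 1 condition)
flagged <- df %>%
  mutate(hit = if_any(c(name, to), ~ .x %in% c("X", "Y")))

# name[hit] gives the names found in Step 1; keeping every row whose
# name is among them is Step 2
kept <- flagged %>%
  filter(name %in% name[hit])

print(flagged)
print(kept)
```

Rows 1 and 5 have hit equal to FALSE but are still kept, because their names (A and C) also occur in rows where hit is TRUE.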