In the 10-row data frame below, I have sightings of either whales or vessels, and these sightings are grouped by ScanID.
Using the dplyr library, I am trying to figure out a way to remove the scans without any whales; in this case, that would be scans 2 and 5.
I think group_by would be useful, but I am not sure how to proceed from there.
whales <- data.frame(
  rubbing.beach = c('whale', 'vessel', 'vessel', 'vessel', 'whale',
                    'whale', 'whale', 'vessel', 'vessel', 'whale'),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)
X | Target | ScanID |
---|---|---|
1 | whale | 1 |
2 | vessel | 1 |
3 | vessel | 2 |
4 | vessel | 2 |
5 | whale | 3 |
6 | whale | 3 |
7 | whale | 4 |
8 | vessel | 4 |
9 | vessel | 5 |
10 | whale | 6 |
This should leave me with the following output:
X | Target | ScanID |
---|---|---|
1 | whale | 1 |
2 | vessel | 1 |
3 | whale | 3 |
4 | whale | 3 |
5 | whale | 4 |
6 | vessel | 4 |
7 | whale | 6 |
CodePudding user response:
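group_by lets you consider each ScanID separately, and filter specifies which rows to keep. Within each group, "whale" %in% Target evaluates to a single TRUE or FALSE, so a scan is either kept whole or dropped whole: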
whales = read.table(text =
'X Target ScanID
1 whale 1
2 vessel 1
3 vessel 2
4 vessel 2
5 whale 3
6 whale 3
7 whale 4
8 vessel 4
9 vessel 5
10 whale 6', header = TRUE)
library(dplyr)
whales %>%
group_by(ScanID) %>%
filter("whale" %in% Target)
# # A tibble: 7 × 3
# # Groups: ScanID [4]
# X Target ScanID
# <int> <chr> <int>
# 1 1 whale 1
# 2 2 vessel 1
# 3 5 whale 3
# 4 6 whale 3
# 5 7 whale 4
# 6 8 vessel 4
# 7 10 whale 6
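Note that the result above is still grouped by ScanID. As a minimal sketch of a variant, assuming the column is named Target as in the table, the same condition can be written with any() and followed by ungroup() so that later steps are not applied per scan. The name kept is just an illustrative choice.

```r
library(dplyr)

# Same data as in the table, built inline so the snippet is self-contained
whales <- data.frame(
  Target = c("whale", "vessel", "vessel", "vessel", "whale",
             "whale", "whale", "vessel", "vessel", "whale"),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)

kept <- whales %>%
  group_by(ScanID) %>%
  # any() collapses the per-row comparison to one TRUE/FALSE per scan
  filter(any(Target == "whale")) %>%
  # drop the grouping so the result behaves like an ordinary data frame
  ungroup()

unique(kept$ScanID)  # scans 2 and 5 are gone; 1, 3, 4 and 6 remain
```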
CodePudding user response:
I think you can do this without group_by by extracting every ScanID that has rubbing.beach == "whale" and using it in subset.
subset(whales, ScanID %in% unique(ScanID[rubbing.beach == "whale"]))
# rubbing.beach ScanID
#1 whale 1
#2 vessel 1
#3 whale 3
#4 whale 3
#5 whale 4
#6 vessel 4
#7 whale 6
In dplyr, we can use filter in the same way:
library(dplyr)
whales %>% filter(ScanID %in% unique(ScanID[rubbing.beach == "whale"]))
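As a quick sanity check, here is a sketch that compares the grouped and ungrouped approaches on the question's data frame (with the rubbing.beach column). The names grouped and ungrouped are only illustrative, and unique() is left out because %in% gives the same result with or without duplicates.

```r
library(dplyr)

# Data frame as defined in the question
whales <- data.frame(
  rubbing.beach = c("whale", "vessel", "vessel", "vessel", "whale",
                    "whale", "whale", "vessel", "vessel", "whale"),
  ScanID = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6)
)

# Grouped approach: keep a scan if it contains at least one whale
grouped <- whales %>%
  group_by(ScanID) %>%
  filter("whale" %in% rubbing.beach) %>%
  ungroup()

# Ungrouped approach: keep rows whose ScanID appears among the whale rows
ungrouped <- whales %>%
  filter(ScanID %in% ScanID[rubbing.beach == "whale"])

# Both approaches should keep the same rows in the same order
stopifnot(identical(grouped$ScanID, ungrouped$ScanID))
stopifnot(identical(grouped$rubbing.beach, ungrouped$rubbing.beach))
```

The ungrouped version avoids the grouping overhead, which may matter on large data, while the grouped version extends more naturally to conditions that depend on several rows of a scan.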