I am having a terrible time figuring out this script. I am working with camera trap data, and have variables for camera name, species, and time of day species was captured on camera (day or night). I want to know for each camera, how many times each species occurred during the day. Ultimately I am looking for a variable, perc.day, of the percent of occurrences during the day for each species for each camera. perc.day would be # of "day" detections / all total detections for each species on each camera. I have been using dplyr and data.table.
Here is a sample of what the data looks like:
camera <- c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
species <- c('deer', 'deer', 'deer', 'coyote', 'deer', 'deer', 'deer', 'deer', 'coyote', 'coyote', 'coyote', 'bobcat')
time_cat <- c('day', 'day', 'night', 'day', 'night', 'day', 'day', 'day', 'night', 'day', 'night', 'night')
data <- data.frame(camera, species, time_cat)
And what I need it to look like:
camera <- c('A', 'A', 'B', 'C', 'C', 'C')
species <- c('deer', 'coyote', 'deer', 'deer', 'coyote', 'bobcat')
total.detections <- c('3', '1', '3', '1', '3', '1')
perc.day <- c('0.667', '0', '0.667', '100', '0.333', '0')
data.final <- data.frame(camera, species, total.detections, perc.day)
This code works up to the detections variable, but at a loss for how to get the perc.day variable
data.final <-
data %>%
group_by(camera, species) %>%
summarize(total.detections = n())
Thank you for any help or advice.
CodePudding user response:
A final mutate
and ungroup
on top of your original solution does the trick:
data %>%
group_by(camera, species) %>%
summarize(total.detections = n()) %>%
mutate(per.day = total.detections / sum(total.detections)) %>%
ungroup()
# # A tibble: 6 × 4
# camera species total.detections per.day
# <chr> <chr> <int> <dbl>
# 1 A coyote 1 0.333
# 2 A deer 2 0.667
# 3 B deer 2 1
# 4 C bobcat 1 0.333
# 5 C coyote 1 0.333
# 6 C deer 1 0.333
CodePudding user response:
If I understand your question right, you are looking for those camera
and species
" only when time_cat
is day
, that would be better to split the code a little bit:
camera <- c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
species <- c('deer', 'deer', 'deer', 'coyote', 'deer', 'deer', 'deer', 'deer', 'coyote', 'coyote', 'coyote', 'bobcat')
time_cat <- c('day', 'day', 'night', 'day', 'night', 'day', 'day', 'day', 'night', 'day', 'night', 'night')
data <- data.frame(camera, species, time_cat)
data$total.detections<-1
data$perc.day<-0
data[data$time_cat=="day","perc.day"]<-1
aggregate(total.detections~camera species,data,sum) %>%
mutate(perc.day=aggregate(perc.day~camera species,data=data,sum)$perc.day/total.detections)
camera species total.detections perc.day
1 C bobcat 1 0.0000000
2 A coyote 1 1.0000000
3 C coyote 3 0.3333333
4 A deer 3 0.6666667
5 B deer 3 0.6666667
6 C deer 1 1.0000000