How to find number of times each unique group in df has largest value?

Time:09-06

I am working with fish diet data and trying to find the number of fish in which prey item i dominates (has the largest number). I need to do this for each prey item (~25 different prey in my full df). My goal is to be able to compute dominance of each particular prey/food item: Di = (Ndi / N) * 100, where Di = dominance of food item i, Ndi = number of fish in which prey of item i dominates (has the largest number) in the gut content, and N = number of fish examined.

Here is an example:

df1 <- data.frame(preyName = c("a", "b", "c", "d", "e", "e", "b", "a", "e","c","d"), 
           id_uniquefish = c("1", "6", "2", "3", "1","3","4","4","6","6","6"),
           numberPrey = c(14,20,3,19,234,24,4,13,45,4,6))

In this example, prey e dominates in 3 of the 5 fish examined, so Ndi = 3 and Di = (3/5) * 100 = 60 for prey e. I need to do this for each preyName (a, b, c, d, e, ...). My full data has 280 observations.

How can I achieve this in R? I have searched online for a while to no avail. I found a few Python examples that came close to what I need, but I am not familiar with Python and could not adapt them.

CodePudding user response:

Does this work for you?

df1 <- data.frame(preyName = c("a", "b", "c", "d", "e", "e", "b", "a", "e","c","d"), 
                  id_uniquefish = c("1", "6", "2", "3", "1","3","4","4","6","6","6"),
                  numberPrey = c(14,20,3,19,234,24,4,13,45,4,6))

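library(data.table)
setDT(df1)  # convert the data.frame to a data.table by reference

# For each fish keep the row with the largest numberPrey (first row wins on ties),
# then count how many fish each prey dominates (N)
test <- df1[, .SD[which.max(numberPrey)], by = id_uniquefish][, .N, by = preyName]
# One row per fish was kept, so sum(N) equals the number of fish examined
test[, Di := N / sum(N) * 100]
test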
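Note that the result above only lists prey that dominate in at least one fish, so prey such as b and d are missing rather than shown with Di = 0. If you want a row for every prey, here is a minimal base R sketch of the same idea. The names `per_fish`, `dominant` and `result` are just illustrative, and like `which.max()` above it assumes that the first prey wins when two prey tie within a fish.

```r
# Example data from the question
df1 <- data.frame(
  preyName = c("a", "b", "c", "d", "e", "e", "b", "a", "e", "c", "d"),
  id_uniquefish = c("1", "6", "2", "3", "1", "3", "4", "4", "6", "6", "6"),
  numberPrey = c(14, 20, 3, 19, 234, 24, 4, 13, 45, 4, 6)
)

# One data.frame per fish
per_fish <- split(df1, df1$id_uniquefish)

# Dominant prey per fish: the prey with the largest count (first wins on ties)
dominant <- vapply(
  per_fish,
  function(d) as.character(d$preyName)[which.max(d$numberPrey)],
  character(1)
)

# Count over all prey names so that prey which never dominate get Ndi = 0
all_prey <- sort(unique(as.character(df1$preyName)))
Ndi <- table(factor(dominant, levels = all_prey))

N <- length(per_fish)  # number of fish examined

result <- data.frame(
  preyName = names(Ndi),
  Ndi = as.vector(Ndi),
  Di = 100 * as.vector(Ndi) / N
)
result
```

For the example data this gives Di = 60 for prey e, 20 for a and c, and 0 for b and d.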