Home > OS >  Group by/aggregate multiple variables
Group by/aggregate multiple variables

Time:12-09

Aggregate results from a table in R

I'm pretty sure the answer is simple, but I'm having trouble to properly describe the problem...

I have the following table:

Species Round # of individuals
A 1 3
A 1 2
B 1 2
A 2 2

I have two expected outputs: one which shows the sum of individuals, and other which shows the number of occurences. Both per Round, grouped by Species, as it follows:

# individuals Round 1 Round 2
A 5 2
B 2 0
# Occurences Round 1 Round 2
A 2 1
B 1 0

I think it must be done with a combination of AGGREGATE and GROUP BY, but couldn't find the exact sintax. Any suggestions?

Thanks!

CodePudding user response:

Here is a base R solution.
aggregate the individuals by all other variables, applying sum and length in the 1st and 2nd cases, respectively. Then reshape to wide format.

df1 <- "Species     Round   'individuals'
A   1   3
A   1   2
B   1   2
A   2   2"
df1 <- read.table(textConnection(df1), header = TRUE)

agg1 <- aggregate(individuals ~ ., df1, sum)
agg1
#>   Species Round individuals
#> 1       A     1           5
#> 2       B     1           2
#> 3       A     2           2

agg2 <- aggregate(individuals ~ ., df1, length)
agg2
#>   Species Round individuals
#> 1       A     1           2
#> 2       B     1           1
#> 3       A     2           1

reshape(agg1, direction = "wide", idvar = "Species", timevar = "Round")
#>   Species individuals.1 individuals.2
#> 1       A             5             2
#> 2       B             2            NA

reshape(agg2, direction = "wide", idvar = "Species", timevar = "Round")
#>   Species individuals.1 individuals.2
#> 1       A             2             1
#> 2       B             1            NA

Created on 2022-12-08 with reprex v2.0.2

  •  Tags:  
  • r
  • Related