My dataset features several blocks, each containing several plots. In each plot, three different lifeforms were marked as present/absent (i.e. 1/0):
Block | Plot | tree | bush | grass |
---|---|---|---|---|
1 | 1 | 0 | 1 | 0 |
1 | 2 | 1 | 1 | 1 |
1 | 3 | 1 | 1 | 1 |
2 | 1 | 0 | 0 | 1 |
2 | 2 | 0 | 0 | 1 |
2 | 3 | 1 | 0 | 1 |
I'm looking for code that will sum the counts for each distinct lifeform at the block level.
I would like an output that resembles this:
Block | tree | bush | grass |
---|---|---|---|
1 | 2 | 3 | 2 |
2 | 1 | 0 | 3 |
I have tried this many ways but the only thing that comes close is:
aggregate(df[,3:5], by = list(df$block), FUN = sum)
However, what this actually returns is:
Block | tree | bush | grass |
---|---|---|---|
1 | 7 | 7 | 7 |
2 | 4 | 4 | 4 |
It appears to be summing all columns together instead of keeping the lifeforms separate.
I feel as though this should be so simple, as there are many queries online about similar processes, but nothing I try has worked.
CodePudding user response:
library(tidyverse)
df %>%
select(-Plot) %>%
pivot_longer(-Block) %>%
group_by(Block, name) %>%
summarise(sum = sum(value)) %>%
pivot_wider(names_from = name, values_from = sum)
# A tibble: 2 × 4
# Groups: Block [2]
Block bush grass tree
<dbl> <dbl> <dbl> <dbl>
1 1 3 2 2
2 2 0 3 1
CodePudding user response:
You were close. Maybe just a typo? Your call uses df$block, but the column is named Block, and R names are case-sensitive.
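A quick way to check for that kind of typo is to look the name up directly. This is only a sketch: it assumes the data look like the Data section at the end of this answer, and the names has_lower, has_upper and lower_is_null are made up for illustration.

```r
# Assumed reconstruction of the question's data (same values as the Data section).
df <- data.frame(
  Block = c(1L, 1L, 1L, 2L, 2L, 2L),
  Plot  = c(1L, 2L, 3L, 1L, 2L, 3L),
  tree  = c(0L, 1L, 1L, 0L, 0L, 1L),
  bush  = c(1L, 1L, 1L, 0L, 0L, 0L),
  grass = c(0L, 1L, 1L, 1L, 1L, 1L)
)

# Column names are case-sensitive, so "block" does not match "Block".
has_lower <- "block" %in% names(df)
has_upper <- "Block" %in% names(df)

# `$` returns NULL for a column that does not exist, so a misspelled
# name hands aggregate() no usable grouping vector.
lower_is_null <- is.null(df$block)

print(c(has_lower = has_lower, has_upper = has_upper, lower_is_null = lower_is_null))
```

If has_lower is FALSE and has_upper is TRUE, correcting the capitalisation in the by = list(...) argument should be enough.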
The data frame style
aggregate(df[,3:5], by = list(Block = df$Block), sum)
Block tree bush grass
1 1 2 3 2
2 2 1 0 3
Or a formula style aggregate
aggregate(. ~ Block, df[,-2], sum)
Block tree bush grass
1 1 2 3 2
2 2 1 0 3
With dplyr
library(dplyr)
df %>%
group_by(Block) %>%
summarize(across(tree:grass, sum))
# A tibble: 2 × 4
Block tree bush grass
<int> <int> <int> <int>
1 1 2 3 2
2 2 1 0 3
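One more base R possibility, not part of the approaches above, is rowsum(), which adds up the rows of a matrix or data frame within each level of a grouping vector. The following is a sketch under the assumption that the data match the Data section below; block_totals and result are illustrative names.

```r
# Assumed reconstruction of the question's data (same values as the Data section).
df <- data.frame(
  Block = c(1L, 1L, 1L, 2L, 2L, 2L),
  Plot  = c(1L, 2L, 3L, 1L, 2L, 3L),
  tree  = c(0L, 1L, 1L, 0L, 0L, 1L),
  bush  = c(1L, 1L, 1L, 0L, 0L, 0L),
  grass = c(0L, 1L, 1L, 1L, 1L, 1L)
)

# Sum each lifeform column within each Block; the Block values
# become the row names of the result.
block_totals <- rowsum(df[, c("tree", "bush", "grass")], group = df$Block)

# Move the row names into a regular Block column to match the desired layout.
result <- data.frame(
  Block = as.integer(rownames(block_totals)),
  block_totals,
  row.names = NULL
)
print(result)
```

Unlike aggregate(), rowsum() keeps the groups in the row names, so the last step is only needed if Block should be an ordinary column.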
Data
df <- structure(list(Block = c(1L, 1L, 1L, 2L, 2L, 2L),
                     Plot = c(1L, 2L, 3L, 1L, 2L, 3L),
                     tree = c(0L, 1L, 1L, 0L, 0L, 1L),
                     bush = c(1L, 1L, 1L, 0L, 0L, 0L),
                     grass = c(0L, 1L, 1L, 1L, 1L, 1L)),
                class = "data.frame", row.names = c(NA, -6L))