I'm trying to summarize a dataset based on "station" and "depth bin" with total counts of family for each. This is how the dataset looks:
The end result should look like this"
...
CodePudding user response:
Using dplyr
,
Data
df <- read.table(text = "Family Station 'Total Count' 'Depth Bin'
Macrouridae 1504-04 1 2500-2550
Ophidiidae 1504-04 1 3500-3550
Synaphobranchidae 1504-05 2 3000-3050", header= TRUE)
Code
library(dplyr)
library(tidyr)
df %>%
group_by(Family,Station, Depth.Bin) %>%
summarise(n = sum(Total.Count)) %>%
mutate(newcol = paste0(c(Station, Depth.Bin), collapse = ":")) %>%
ungroup() %>%
select(Family, n, newcol) %>%
mutate(row = row_number()) %>%
pivot_wider(names_from = newcol, values_from = n) %>%
select(-row)
Family `1504-04:2500-2550` `1504-04:3500-3550` `1504-05:3000-3050`
<chr> <int> <int> <int>
1 Macrouridae 1 NA NA
2 Ophidiidae NA 1 NA
3 Synaphobranchidae NA NA 2
CodePudding user response:
Base-R version, with tapply
(I changed some of your variable names to avoid spaces):
dd <- read.table(header = TRUE, text = "
Family Station Total_Count Depth_Bin
Macrouridae 1504-04 1 2500-2550
Ophidiidae 1504-04 1 3500-3550
Synaphobranchidae 1504-05 2 3000-3050
")
with(dd, tapply(
Total_Count,
list(Family, interaction(Station, Depth_Bin, sep = ":")),
FUN = sum))