Home > Software engineering >  Summarizing by multiple groups in R
Summarizing by multiple groups in R

Time:11-10

I'm trying to summarize a dataset based on "station" and "depth bin" with total counts of family for each. This is how the dataset looks:

enter image description here

The end result should look like this"

enter image description here

...

CodePudding user response:

Using dplyr,

Data

df <- read.table(text = "Family              Station    'Total Count'   'Depth Bin'
Macrouridae         1504-04          1         2500-2550
Ophidiidae          1504-04          1         3500-3550
Synaphobranchidae   1504-05          2         3000-3050", header=  TRUE)

Code

library(dplyr)
library(tidyr)

df %>%
  group_by(Family,Station, Depth.Bin) %>%
  summarise(n = sum(Total.Count)) %>%
  mutate(newcol = paste0(c(Station, Depth.Bin), collapse = ":")) %>%
  ungroup() %>%
  select(Family, n, newcol) %>%
  mutate(row = row_number()) %>%
  pivot_wider(names_from = newcol, values_from = n) %>%
  select(-row)

  Family            `1504-04:2500-2550` `1504-04:3500-3550` `1504-05:3000-3050`
  <chr>                           <int>               <int>               <int>
1 Macrouridae                         1                  NA                  NA
2 Ophidiidae                         NA                   1                  NA
3 Synaphobranchidae                  NA                  NA                   2

CodePudding user response:

Base-R version, with tapply (I changed some of your variable names to avoid spaces):

dd <- read.table(header = TRUE, text = "
Family              Station    Total_Count   Depth_Bin
Macrouridae         1504-04          1         2500-2550
Ophidiidae          1504-04          1         3500-3550
Synaphobranchidae   1504-05          2         3000-3050
")

with(dd, tapply(
           Total_Count, 
           list(Family, interaction(Station, Depth_Bin, sep = ":")), 
           FUN = sum))
  •  Tags:  
  • r
  • Related