How to create columns with row counts in R?-CodePudding

I have a dataset where I am trying to count the values across rows, and then create different columns for the counts for each possible variable. I have a large dataset like this (thousands of rows):

Date	Lat	Long	Station	Tow	Sex
1990	y1	x1	1	1	1
1990	y1	x1	1	1	1
1990	y1	x1	1	1	2
1990	y1	x1	1	1	2
1990	y1	x1	1	1	2
1990	y2	x2	1	2	1
1990	y2	x2	1	2	1
1990	y2	x2	1	2	1
1990	y2	x2	1	2	1
1990	y2	x2	1	2	2

And I want to create a table like this in R

Date	Lat	Long	Station	Tow	CountSex1	CountSex2
1990	y1	x1	1	1	2	3
1990	y1	x1	1	2	4	1

I tried using group_by and summarize in tidyverse, but wasn't getting what I needed. I'm still really new to tidyverse and I cannot figure out what functions/combination of functions to use to get quite what I'm looking for.

What I tried:

countsex <- df %>% 
  group_by(Date, Lat, Long, Station, Tow, Sex) %>% 
  summarise(count_sex=n())

And I got:

Date	Lat	Long	Station	Tow	Sex	CountSex
1990	y1	x1	1	1	1	2
1990	y1	x1	1	1	2	3
1990	y2	x2	1	2	1	4
1990	y2	x2	1	2	2	1

which isn't what I'm looking for, and I couldn't figure out how to transform this to get what I needed.

CodePudding user response：

We can use pivot_wider

library(dplyr)
library(tidyr)
df %>%
   pivot_wider(names_from = Sex, values_from = Sex, 
       values_fn = length, names_prefix = 'CountSex')

-output

# A tibble: 2 × 7
   Date Lat   Long  Station   Tow CountSex1 CountSex2
  <int> <chr> <chr>   <int> <int>     <int>     <int>
1  1990 y1    x1          1     1         2         3
2  1990 y2    x2          1     2         4         1

data

df <- structure(list(Date = c(1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L, 1990L, 1990L, 1990L), Lat = c("y1", "y1", "y1", "y1", 
"y1", "y2", "y2", "y2", "y2", "y2"), Long = c("x1", "x1", "x1", 
"x1", "x1", "x2", "x2", "x2", "x2", "x2"), Station = c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Tow = c(1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L), Sex = c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 
1L, 2L)), class = "data.frame", row.names = c(NA, -10L))

CodePudding user response：

Slightly different approach also using pivot_wider: Note Sex in count:

library(dplyr)
library(tidyr)

df %>% 
  count(Date, Lat, Long, Station, Tow,Sex = paste0("CountSex", Sex)) %>% 
  pivot_wider(names_from = Sex, values_from = n)

   Date Lat   Long  Station   Tow CountSex1 CountSex2
  <int> <chr> <chr>   <int> <int>     <int>     <int>
1  1990 y1    x1          1     1         2         3
2  1990 y2    x2          1     2         4         1