Suppose I have a data frame formatted like this:
x y
2001 Apples
2001 Apples
2001 Apples
2001 Oranges
2001 Oranges
2002 Apples
2002 Apples
2002 Apples
2002 Apples
2002 Oranges
2002 Oranges
2002 Oranges
How could I aggregate this data to get the following result?
x y Frequency
2001 Apples 3
2001 Oranges 2
2002 Oranges 3
2002 Apples 4
I know that tables are good for showing frequencies, but I am not sure how to aggregate this data. I have tried something like aggregate(df1$x ~ df1$y, df1, FUN = sum), but that did not yield the expected results.
CodePudding user response:
A tidyverse approach:
library(dplyr)

# Rebuild the data from the question (2002 has 4 Apples and 3 Oranges)
x <- c(2001,2001,2001,2001,2001,2002,2002,2002,2002,2002,2002,2002)
y <- c("Apples","Apples","Apples","Oranges","Oranges",
       "Apples","Apples","Apples","Apples","Oranges","Oranges","Oranges")
df <- tibble(x = x, y = y)

# Count the rows in each x/y combination
df %>%
  count(x, y, name = "Frequency")
# A tibble: 4 x 3
      x y       Frequency
  <dbl> <chr>       <int>
1  2001 Apples          3
2  2001 Oranges         2
3  2002 Apples          4
4  2002 Oranges         3
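For reference, count() is roughly shorthand for grouping and then summarising with n(). A minimal sketch of the longer form, assuming dplyr 1.0.0 or later (for the .groups argument) and rebuilding the question's data so the snippet runs on its own; the name res is just for illustration:

```r
library(dplyr)

# Same data as in the question, built compactly with rep()
df <- tibble(
  x = c(rep(2001, 5), rep(2002, 7)),
  y = c(rep("Apples", 3), rep("Oranges", 2),
        rep("Apples", 4), rep("Oranges", 3))
)

# Group by both columns, then count the rows in each group
res <- df %>%
  group_by(x, y) %>%
  summarise(Frequency = n(), .groups = "drop")

res
```

The longer form is mainly useful when you want other summaries (a mean, a sum) next to the count in the same call.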
CodePudding user response:
With base R functions:
df1 <- read.table(textConnection('x y
2001 Apples
2001 Apples
2001 Apples
2001 Oranges
2001 Oranges
2002 Apples
2002 Apples
2002 Apples
2002 Apples
2002 Oranges
2002 Oranges
2002 Oranges'), header = TRUE)
data.frame(table(df1))
Output:
x y Freq
<fct> <fct> <int>
1 2001 Apples 3
2 2002 Apples 4
3 2001 Oranges 2
4 2002 Oranges 3
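As for the aggregate() attempt in the question: df1$x ~ df1$y with FUN = sum adds up the year values within each fruit instead of counting rows. One way to make aggregate() count, sketched here under the assumption that adding a helper column of ones is acceptable (the column name Frequency and the name res are my own choices):

```r
# Same data as in the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  x = c(rep(2001, 5), rep(2002, 7)),
  y = c(rep("Apples", 3), rep("Oranges", 2),
        rep("Apples", 4), rep("Oranges", 3))
)

# Helper column: every row contributes 1 to its group's total
df1$Frequency <- 1L

# Sum the ones within each x/y combination, which is a row count
res <- aggregate(Frequency ~ x + y, data = df1, FUN = sum)

res
```

Putting both x and y on the right-hand side of the formula is what makes the grouping happen on the year/fruit pair rather than on the fruit alone.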