How Can I Aggregate A Column of Strings By Frequency [duplicate]

Time:09-26

Suppose I have the following data frame formatted as such:

     x     y
    2001 Apples
    2001 Apples
    2001 Apples
    2001 Oranges
    2001 Oranges
    2002 Apples
    2002 Apples
    2002 Apples
    2002 Apples
    2002 Oranges
    2002 Oranges
    2002 Oranges

How could I aggregate this data so that the result looks like this:

     x     y      Frequency
    2001 Apples       3
    2001 Oranges      2
    2002 Oranges      3
    2002 Apples       4

I know that tables are good for showing frequency, but I am not sure how to aggregate this data. I tried aggregate(df1$x ~ df1$y, df1, FUN = sum), but that did not yield the expected result.

CodePudding user response:

A tidyverse approach:

library(dplyr)  # load first: tibble() and count() are both available via dplyr

# Same rows as the data in the question
x <- c(2001,2001,2001,2001,2001,2002,2002,2002,2002,2002,2002,2002)
y <- c("Apples","Apples","Apples","Oranges","Oranges",
       "Apples","Apples","Apples","Apples","Oranges","Oranges","Oranges")

df <- tibble(x = x, y = y)

# Count the rows in each (x, y) group and name the count column "Frequency"
df %>% 
  count(x, y, name = "Frequency")

# A tibble: 4 x 3
      x y       Frequency
  <dbl> <chr>       <int>
1  2001 Apples          3
2  2001 Oranges         2
3  2002 Apples          4
4  2002 Oranges         3

CodePudding user response:

With base R functions:

df1 <- read.table(textConnection('x     y
    2001 Apples
    2001 Apples
    2001 Apples
    2001 Oranges
    2001 Oranges
    2002 Apples
    2002 Apples
    2002 Apples
    2002 Apples
    2002 Oranges
    2002 Oranges
    2002 Oranges'), header = TRUE)

data.frame(table(df1))

Output:

  x     y        Freq
  <fct> <fct>   <int>
1 2001  Apples      3
2 2002  Apples      4
3 2001  Oranges     2
4 2002  Oranges     3
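
The aggregate() call from the question can also be made to work. As written, aggregate(df1$x ~ df1$y, df1, FUN = sum) adds up the year values within each fruit, which is why it does not return counts. A minimal sketch of one way around that, assuming a helper column of ones (the name Frequency is chosen here only to match the desired output) that is then summed per (x, y) group:

```r
# Rebuild the data from the question: 5 rows for 2001, 7 rows for 2002.
df1 <- data.frame(
  x = rep(c(2001, 2002), times = c(5, 7)),
  y = c(rep(c("Apples", "Oranges"), times = c(3, 2)),
        rep(c("Apples", "Oranges"), times = c(4, 3)))
)

# Helper column (an assumption of this sketch): each row counts as one observation.
df1$Frequency <- 1

# Sum the ones within every combination of x and y.
result <- aggregate(Frequency ~ x + y, data = df1, FUN = sum)
result
```

Unlike table(), this only returns combinations that actually occur in the data, so no zero-count rows appear.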