Problem
I have a list of data frames. All of the data frames have the same column names, but different numbers of rows. One column, called pred, is a factor with the following four levels:
- Apple
- Cherry
- Orange
- Pear
I wish to count how many rows are 'Apple', 'Cherry', etc.
If I single out one dataframe (dataframe1), and perform the counts using:
count(dataframe1, pred)
I get the desired output:
pred n
<fctr> <int>
Apple 25
Orange 11
Pear 11
Cherry 12
This is how I would like the output for multiple data frames that are contained within a list. How can this be achieved? I have tried various options using the dplyr package, but I keep getting this error:
Error in UseMethod("count") : no applicable method for 'count' applied to an object of class "list"
CodePudding user response:
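If all the data frames have the same columns, you can stack them with bind_rows, using .id to record which data frame each row came from, and then count by that id and pred: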
library(dplyr)
data1 <- data.frame(pred = sample(c("Apple", "Cherry", "Orange", "Pear"), 100, replace = TRUE))
data2 <- data.frame(pred = sample(c("Apple", "Cherry", "Orange", "Pear"), 100, replace = TRUE))
data3 <- data.frame(pred = sample(c("Apple", "Cherry", "Orange", "Pear"), 100, replace = TRUE))
list_of_dataframes <- list(data1, data2, data3)
bind_rows(list_of_dataframes, .id = "data") %>%
  count(data, pred)
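If you would rather keep one result per data frame instead of a single combined table, a minimal sketch is to apply count to each element of the list. The data below is made up for illustration (the seed, sizes, and the name fruits are my assumptions, not from the question):

```r
library(dplyr)

# Hypothetical example data, built the same way as above
set.seed(1)
fruits <- c("Apple", "Cherry", "Orange", "Pear")
list_of_dataframes <- list(
  data.frame(pred = sample(fruits, 100, replace = TRUE)),
  data.frame(pred = sample(fruits, 50, replace = TRUE))
)

# Apply count() to each data frame in turn;
# the result is a list with one count table per input data frame
counts <- lapply(list_of_dataframes, function(df) count(df, pred))
```

Each element of counts has the columns pred and n, like the output of count(dataframe1, pred) in the question.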
CodePudding user response:
With lapply, you may use table and coerce it with as.data.frame.
lapply(dat, \(x) as.data.frame(table(x$pred, dnn='pred')))
# [[1]]
# pred Freq
# 1 Apple 5
# 2 Cherry 2
# 3 Orange 3
#
# [[2]]
# pred Freq
# 1 Apple 5
# 2 Cherry 4
# 3 Orange 1
# 4 Pear 5
#
# [[3]]
# pred Freq
# 1 Apple 3
# 2 Cherry 5
# 3 Orange 7
# 4 Pear 3
Data:
set.seed(42)
dat <- lapply(c(10, 15, 18), \(x)
  data.frame(
    pred=sample(c('Apple', 'Orange', 'Pear', 'Cherry'), x, replace=TRUE),
    x=runif(x))
)
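Note that when pred is a character column, as in this example data, table only reports values that actually occur, which is why Pear is missing from the first result above. If pred is already a factor, as in the question, table keeps all of its levels. A minimal sketch for the character case, assuming you know the full set of levels up front (the name lv is mine), is to convert to a factor before tabulating:

```r
set.seed(42)
dat <- lapply(c(10, 15, 18), \(x)
  data.frame(
    pred=sample(c('Apple', 'Orange', 'Pear', 'Cherry'), x, replace=TRUE),
    x=runif(x))
)

# Assumed full set of levels; values that do not occur get a count of 0
lv <- c('Apple', 'Cherry', 'Orange', 'Pear')

# Naming the argument 'pred' names the first column of the resulting data frame
res <- lapply(dat, \(x) as.data.frame(table(pred=factor(x$pred, levels=lv))))
```

Every element of res then has four rows, one per level, so the tables line up across data frames.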