How to get the averages from a column of a list of data frames?

Time:08-31

I have a list (my.list) that looks like this:

$S1
  Study_ID   B   C         D
1      100 3.4  C1 0.9124000
2      100 1.5 PTA        NA
3      200 1.8  C1 0.5571429
4      200 2.1 PTA 0.7849462
5      300 3.2  C1 0.3271900
6      300 1.4 PTA        NA
7      400 5.6  C1 0.8248200
8      400 9.3 PTA 0.2847020

$S2
  Study_ID    B   C         D
1      100 0.15  C1 0.9124000
2      100 0.70 PTA        NA
3      200 0.23  C1 0.5571429
4      200 0.45 PTA 0.7849462
5      300 0.91  C1 0.3271900
6      300 0.78 PTA 0.6492000
7      400 0.65  C1 0.8248200
8      400 0.56 PTA        NA

I would like to create a data frame that contains only the row-wise average of column 'B' across the data frames in the list.

My desired output would look something like this:

  Average
1     2.1
2     1.2
3     0.5
4     1.5
5     1.9
6     2.1
7     3.6
8     5.9

How can I go about doing this?

Reproducible Data:

my.list <- structure(list(S1 = structure(list(Study_ID = c(100, 100, 200, 200, 300,300,400,400), B = c(3.4, 1.5, 1.8, 2.1, 3.2, 1.4, 5.6, 9.3), C = c("C1", "PTA", "C1", "PTA", "C1", "PTA","C1", "PTA"), D = c(0.9124, NA, 0.5571429, 0.7849462, 0.32719, NA, 0.82482, 0.284702)), .Names = c("Study_ID", "B", "C", "D"), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8")), S2 = structure(list(Study_ID = c(100, 100, 200, 200, 300,300,400,400), B = c(0.15, 0.7, 0.23, 0.45,0.91, 0.78, 0.65, 0.56), C = c("C1", "PTA", "C1", "PTA", "C1", "PTA", "C1", "PTA"), D = c(0.9124, NA, 0.5571429, 0.7849462, 0.32719,0.6492, 0.82482, NA)), .Names = c("Study_ID", "B", "C","D"), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8"))), .Names = c("S1", "S2"))

Answer:

We can extract the 'B' column from each data frame in the list, use Reduce to get the element-wise sum, and divide by the length of the list:

Reduce(`+`, lapply(my.list, `[[`, "B")) / length(my.list)
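
To see what the Reduce approach does step by step, here is a minimal sketch on a small made-up list. The names `toy.list`, `b.cols` and `avg` are illustrative and not from the question. It assumes every data frame has the same number of rows in the same order, since the vectors are added position by position.

```r
# Toy list of two data frames with a numeric column B (hypothetical data)
toy.list <- list(
  S1 = data.frame(B = c(1, 2, 3)),
  S2 = data.frame(B = c(3, 4, 5))
)

# Pull column B out of every data frame: a list of numeric vectors
b.cols <- lapply(toy.list, `[[`, "B")

# Add the vectors element by element, then divide by the number of data frames
avg <- Reduce(`+`, b.cols) / length(toy.list)
avg
# [1] 2 3 4
```

Note that `+` propagates NA, so a missing value in 'B' in any one data frame makes the average for that row NA.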

Or extract the columns as a matrix (one column per data frame) and then use rowMeans:

rowMeans(sapply(my.list, `[[`, "B"), na.rm = TRUE)
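
Since the question asks for a data frame, the result can be wrapped in data.frame(). A minimal sketch using only the 'B' columns copied from the question's data (the names `b.only`, `b.mat` and `result` are illustrative):

```r
# Column B of S1 and S2, copied from the question's reproducible data
b.only <- list(
  S1 = data.frame(B = c(3.4, 1.5, 1.8, 2.1, 3.2, 1.4, 5.6, 9.3)),
  S2 = data.frame(B = c(0.15, 0.7, 0.23, 0.45, 0.91, 0.78, 0.65, 0.56))
)

# One column per data frame, one row per observation
b.mat <- sapply(b.only, `[[`, "B")

# Row-wise mean, ignoring NAs, stored in a one-column data frame
result <- data.frame(Average = rowMeans(b.mat, na.rm = TRUE))
result
```

With na.rm = TRUE, each row is averaged over its non-missing values only; a row that is NA in every data frame comes back as NaN.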