Obtaining columns from lists of data frames

Time:06-04

I'm trying to conduct some data transformation on a dataset with a somewhat messy structure.

I'm working with a list of data frames and trying to extract the second column from each. In a perfect world, it would be something straightforward like this:

M1 <- data.frame(matrix(1:2, nrow = 1, ncol = 2))
M2 <- data.frame(matrix(9:10, nrow = 1, ncol = 2))
M3 <- data.frame(matrix(20:23, nrow = 2, ncol = 2))

list_a <- list(M1, M2, M3)
output1 <- lapply(list_a, "[", , 2)

But not every data frame has two columns. Some are empty, as below.

M4 <- data.frame(matrix(nrow = 0, ncol = 0))
list_b <- list(M1, M2, M3, M4)

Filtering out these problem data frames is also pretty straightforward, which then allows me to run the lapply above.

filtered <- Filter(function(x) ncol(x) > 0, list_b)

However, what I would really like to do is keep the data frames with fewer than two columns in the output, represented as NA. I've mostly been attempting ifelse statements, but they have not been successful.

CodePudding user response:

This?

lapply(list_b, function(z) if (ncol(z) > 1) z[,2,drop=FALSE] else NA)
# [[1]]
#   X2
# 1  2
# [[2]]
#   X2
# 1 10
# [[3]]
#   X2
# 1 22
# 2 23
# [[4]]
# [1] NA
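
The drop = FALSE in the call above is what keeps each result as a one-column data frame. A minimal sketch of the difference, reusing M3 from the question (the names as_df and as_vec are just for illustration):

```r
M3 <- data.frame(matrix(20:23, nrow = 2, ncol = 2))

# drop = FALSE keeps the data frame structure (and the column name X2)
as_df <- M3[, 2, drop = FALSE]

# Without it, selecting a single column collapses to a plain vector
as_vec <- M3[, 2]
```

Which form is preferable probably depends on whether the column name is still needed downstream.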

CodePudding user response:

Use an if/else condition

lapply(list_b, \(x) if(ncol(x) < 2) NA else x[[2]])

-output

[[1]]
[1] 2

[[2]]
[1] 10

[[3]]
[1] 22 23

[[4]]
[1] NA
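
One plausible reason the ifelse attempts did not work (an assumption, since the question does not show them): ifelse is vectorized over its test and returns a result the same length as the test. With a length-one test such as ncol(x) > 1, only the first element of the column survives. A small sketch with M3 from the question:

```r
M3 <- data.frame(matrix(20:23, nrow = 2, ncol = 2))

# ifelse(): the test has length 1, so the result has length 1 as well
with_ifelse <- ifelse(ncol(M3) > 1, M3[[2]], NA)

# if / else: returns whichever branch is chosen, at its full length
with_if <- if (ncol(M3) > 1) M3[[2]] else NA
```

Here with_ifelse holds only 22, while with_if holds both 22 and 23.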

pluck still works when there are no columns: by default it returns NULL for a missing element, and we can change that to NA with .default.

library(purrr)
map(list_b, pluck, 2, .default = NA)

-output

[[1]]
[1] 2

[[2]]
[1] 10

[[3]]
[1] 22 23

[[4]]
[1] NA
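
If purrr is not available, similar behavior can be sketched in base R. The helper col_or_default and the one-column data frame M5 below are hypothetical names introduced only for illustration; they are not part of purrr or base R.

```r
# Hypothetical helper: return column `i` of a data frame as a vector,
# or `default` when the data frame has fewer than `i` columns.
col_or_default <- function(df, i, default = NA) {
  if (ncol(df) >= i) df[[i]] else default
}

M1 <- data.frame(matrix(1:2, nrow = 1, ncol = 2))
M3 <- data.frame(matrix(20:23, nrow = 2, ncol = 2))
M4 <- data.frame(matrix(nrow = 0, ncol = 0))
M5 <- data.frame(X1 = 5L)  # hypothetical one-column data frame

# Both the empty and the one-column data frame fall back to NA
result <- lapply(list(M1, M3, M4, M5), col_or_default, i = 2)
```

Comparing against i rather than checking for zero columns means a data frame with a single column is also mapped to NA instead of raising an error.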