How do you make all dataframes in a list have the same number of columns?

I'm trying to make all dataframes in a list have the same number of columns.

Create a list of 3 dataframes, but the 2nd has 1 extra column.

    my_data <- list(
      data.frame(
        V1 = c(1, 1, 1, 1, 1),
        V2 = c(2, 2, 2, 2, 2),
        V3 = c(3, 3, 3, 3, 3),
        V4 = c(4, 4, 4, 4, 4),
        V5 = c(5, 5, 5, 5, 5)),
      # the 2nd data frame has the extra column V6
      data.frame(
        V1 = c(1, 1, 1, 1, 1),
        V2 = c(2, 2, 2, 2, 2),
        V3 = c(3, 3, 3, 3, 3),
        V4 = c(4, 4, 4, 4, 4),
        V5 = c(5, 5, 5, 5, 5),
        V6 = c(6, 6, 6, 6, 6)),
      data.frame(
        V1 = c(1, 1, 1, 1, 1),
        V2 = c(2, 2, 2, 2, 2),
        V3 = c(3, 3, 3, 3, 3),
        V4 = c(4, 4, 4, 4, 4),
        V5 = c(5, 5, 5, 5, 5))
    )

Manual removal of the column: if my_data[[2]] has more than 5 columns, remove the 6th.

if (ncol(my_data[[2]]) > 5) {
  my_data[[2]][, -6]
}

But why doesn't the same logic work when I loop through the list?

for (i in 1:length(my_data)) {
  if (ncol(my_data[[i]]) > 5) {
    my_data[[i]][, -6]
  }
}

CodePudding user response：

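Your logic works just fine, but neither version actually changes my_data. At the top level, R auto-prints the result of the if block, so it looks as though the column was removed. Inside a for loop nothing is auto-printed, and the subsetted frame is simply discarded. When you are iterating through the loop, you have to assign the updated frame back to that element of the list.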

Simply replace:

my_data[[i]][,-6]

with

my_data[[i]]<-my_data[[i]][,-6]

within the if clause.
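
A minimal, self-contained sketch of the corrected loop. The toy list here is built with as.data.frame(matrix(...)) only to keep the example short (an assumption for illustration, not the asker's data); the columns are still named V1, V2, and so on.

```r
# Toy list: the 2nd data frame has 6 columns, the others have 5.
my_data <- list(
  as.data.frame(matrix(1, nrow = 5, ncol = 5)),
  as.data.frame(matrix(1, nrow = 5, ncol = 6)),
  as.data.frame(matrix(1, nrow = 5, ncol = 5))
)

for (i in seq_along(my_data)) {
  if (ncol(my_data[[i]]) > 5) {
    # Assign the subset back; otherwise it is computed and thrown away.
    my_data[[i]] <- my_data[[i]][, -6]
  }
}

# Column counts after the loop.
col_counts <- sapply(my_data, ncol)
print(col_counts)
```

After the loop, every element of the list has 5 columns.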

CodePudding user response：

Get the minimum number of columns across all the data.frames in the list, then use that number in the for loop to keep only the first n columns of each one. The assignment (<-) is what updates the data.frame elements in the list.

n <- min(sapply(my_data, ncol))
for(i in seq_along(my_data)) my_data[[i]] <- my_data[[i]][seq_len(n)]
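
The same idea can also be written without an explicit loop. This is only a sketch on a hypothetical two-element list called dfs; it swaps the for loop for lapply and sapply for vapply, which is a stylistic choice rather than a requirement.

```r
# Hypothetical toy list: the 2nd data frame has one extra column.
dfs <- list(
  data.frame(a = 1:2, b = 3:4),
  data.frame(a = 1:2, b = 3:4, c = 5:6)
)

# Smallest column count in the list.
n <- min(vapply(dfs, ncol, integer(1)))

# Keep the first n columns of every data frame and rebuild the list.
dfs <- lapply(dfs, function(x) x[seq_len(n)])

print(vapply(dfs, ncol, integer(1)))
```

Note that this keeps columns by position, so it assumes the columns you want come first and are in the same order in every data frame.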

CodePudding user response：

If you want to keep the columns that share the same names (regardless of their order), get the common column names and select those columns:

selected_cols <- Reduce(intersect, lapply(my_data, names))
my_data <- lapply(my_data, function(x) x[, selected_cols, drop = FALSE])
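
A small check of the name-based approach, using a hypothetical list dfs whose second data frame has an extra column and a different column order. The names dfs and common are made up for this sketch.

```r
# Hypothetical toy list: same column names, different order, one extra column.
dfs <- list(
  data.frame(V1 = 1:2, V2 = 3:4),
  data.frame(V2 = 3:4, V6 = 5:6, V1 = 1:2)
)

# Names present in every data frame, in the order of the first one.
common <- Reduce(intersect, lapply(dfs, names))

# Select those columns by name; drop = FALSE keeps a data frame
# even if only one column is common.
dfs <- lapply(dfs, function(x) x[, common, drop = FALSE])

print(lapply(dfs, names))
```

Because the selection is by name, every data frame ends up with the same columns in the same order, wherever they originally appeared.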