I'm trying to make all dataframes in a list have the same number of columns.
Create a list of 3 dataframes, but the 2nd has 1 extra column.
my_data<-
list( data.frame(
V1= c(1,1,1,1,1),
V2= c(2,2,2,2,2),
V3= c(3,3,3,3,3),
V4= c(4,4,4,4,4),
V5= c(5,5,5,5,5)),
data.frame(
V1= c(1,1,1,1,1),
V2= c(2,2,2,2,2),
V3= c(3,3,3,3,3),
V4= c(4,4,4,4,4),
V5= c(5,5,5,5,5),
V6= c(6,6,6,6,6)),
data.frame(
V1= c(1,1,1,1,1),
V2= c(2,2,2,2,2),
V3= c(3,3,3,3,3),
V4= c(4,4,4,4,4),
V5= c(5,5,5,5,5)))
Manual removal of the column: if my_data[[2]] has more than 5 columns, remove the 6th.
if (ncol(my_data[[2]])>5) {
my_data[[2]][,-6]
}
But why doesn't the same logic work when looping through the list?
for (i in 1:length(my_data)) {
if (ncol(my_data[[i]])>5) {
my_data[[i]][,-6]
}
}
CodePudding user response:
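Your logic works just fine. The expression my_data[[i]][,-6] only returns a copy without the 6th column; it does not modify the list, and inside a for loop the result is not even printed. When iterating through the loop, you have to assign the updated data frame back to that element of the list.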
Simply replace:
my_data[[i]][,-6]
with
my_data[[i]]<-my_data[[i]][,-6]
within the if clause.
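As a minimal sketch of the fix, the loop below assigns each trimmed data frame back into the list. The list is a shortened stand-in for the one in the question, built with rep() and cbind() for brevity, and col_counts is a name introduced here only to check the result.

```r
# Stand-in for the question's list: the 2nd data frame has an extra column V6.
base_df <- data.frame(V1 = rep(1, 5), V2 = rep(2, 5), V3 = rep(3, 5),
                      V4 = rep(4, 5), V5 = rep(5, 5))
my_data <- list(base_df, cbind(base_df, V6 = rep(6, 5)), base_df)

for (i in seq_along(my_data)) {
  if (ncol(my_data[[i]]) > 5) {
    # Assign the result back; without "<-" the trimmed copy is discarded.
    my_data[[i]] <- my_data[[i]][, -6]
  }
}

# Column count of each data frame after the loop.
col_counts <- sapply(my_data, ncol)
```

seq_along(my_data) is used instead of 1:length(my_data) so the loop also behaves sensibly if the list happens to be empty.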
CodePudding user response:
Get the minimum number of columns across all the data.frames in the list, then use that information in the for loop, doing the assignment (<-) to update the data.frame elements in the list:
n <- min(sapply(my_data, ncol))
for(i in seq_along(my_data)) my_data[[i]] <- my_data[[i]][seq_len(n)]
CodePudding user response:
If you want to keep the columns with the same names (regardless of their order), get the common column names and select those:
selected_cols <- Reduce(intersect, lapply(my_data, names))
my_data <- lapply(my_data, function(x) x[, selected_cols, drop = FALSE])
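To illustrate selecting by name when column order differs, here is a small sketch. The list dfs and its columns a to d are hypothetical and not from the question. The names go in the column position (after the comma), and drop = FALSE keeps the result a data frame even if only one column is shared.

```r
# Hypothetical data frames whose shared columns appear in different orders.
dfs <- list(
  data.frame(a = 1:3, b = 4:6, c = 7:9),
  data.frame(c = 7:9, a = 1:3),
  data.frame(b = 4:6, a = 1:3, c = 7:9, d = 10:12)
)

# Names present in every data frame.
common <- Reduce(intersect, lapply(dfs, names))

# Keep only the shared columns, in the same order for every data frame.
dfs <- lapply(dfs, function(x) x[, common, drop = FALSE])
```

After this, every element of dfs has the same columns in the same order, which is usually what a later do.call(rbind, dfs) would need.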