R- for loop to select two columns in a data frame, with only the second column changing

Time:10-19

I'm having issues trying to write a for loop in R. I have a data frame of 16 columns and 94 rows, and I want to loop through it, selecting column 1 plus column 2 into one data frame, then column 1 plus column 3, and so on, so that I end up with 16 data frames containing 2 columns each, all written to individual .csv files.

TwoB<- read.csv("data.csv", header=F) 

list<- lapply(1:nX, function(x) NULL)


nX <- ncol(TwoB)

for(i in 1:ncol(TwoB)){
list[[i]]<-subset(TwoB,
                 select=c(1, i+1))
 }

Which produces an error:

 Error in `[.data.frame`(x, r, vars, drop = drop): 
   undefined columns selected

I'm not really sure how to code this and clearly haven't quite grasped loops yet so any help would be appreciated!

CodePudding user response:

The error is easily explained: you loop over all 16 columns, so in the last iteration you try to select column 16 + 1 = 17, which does not exist. You could probably loop over nX - 1 instead, but I think what you are trying to achieve can be done more elegantly.

TwoB<- read.csv("data.csv", header=F)

library("data.table")
setDT(TwoB)

nX <- ncol(TwoB)

# example to directly write your files
lapply(2:nX, function(col_index) {
    fwrite(TwoB[, c(1, ..col_index)], file = paste0("col1_col", col_index, ".csv"))
})

# example to store the new data.tables in a list
list_of_two_column_tables <- lapply(2:nX, function(col_index) {
    TwoB[, c(1, ..col_index)]
})
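
For comparison, here is a minimal base R sketch of the plain loop fix mentioned above, without data.table. The small demo data frame, the `pairs` list name, and the use of `tempdir()` as the output folder are assumptions for illustration only; swap in your own `read.csv()` call and target directory. Note that pairing column 1 with each other column gives nX - 1 results (15 for the 16-column data in the question), since column 1 is never paired with itself.

```r
# Hypothetical stand-in for read.csv("data.csv", header = FALSE)
TwoB <- data.frame(V1 = 1:3, V2 = 4:6, V3 = 7:9, V4 = 10:12)

# Define nX before using it to pre-allocate the list
nX <- ncol(TwoB)

# One slot per column after the first
pairs <- vector("list", nX - 1)

# Assumption: write to a temporary folder; replace with your own path
out_dir <- tempdir()

# seq_len(nX - 1) + 1 is 2, 3, ..., nX, and is empty if there is only one column
for (i in seq_len(nX - 1) + 1) {
  # Column 1 plus column i, selected by position
  pairs[[i - 1]] <- TwoB[, c(1, i)]
  write.csv(pairs[[i - 1]],
            file = file.path(out_dir, paste0("col1_col", i, ".csv")),
            row.names = FALSE)
}
```

The loop index here is the actual column position, so there is no `i + 1` that can run past the last column.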