I have a loop in R that goes through a large number of .tsv files and creates one output file with the results. Each row in the output file holds the results of processing one input file. I need to be able to look back at the input files and work out which one each result in the output file corresponds to. I would therefore like the row names of the output (big_data) to be the names of the input .tsv files. I have tried this in my loop, but it is not working. Here is my abbreviated loop, which works when I remove the rownames line:
files <- list.files(path =".", pattern = ".tsv")
files
datalist = list()
for(i in 1:length(files)) {
other_trait <- read.table(files[i])
coloc_res = coloc::coloc.abf(dataset1 = other_trait, dataset2 = dataset2,p12 = 1e-5)
coloc_results=matrix(ncol=6,nrow=1,0)
coloc_results[1,]=coloc_res$summary
write.csv(coloc_results, paste0("processed_", basename(files[i])))
datalist[[i]] = coloc_results
big_data = do.call(rbind, datalist)
colnames(big_data)=c("n_snps","H0","H1","H2","H3","H4")
rownames(big_data)= paste0(basename(files[i]))
write.csv(big_data, "results.csv")
}
The line I am struggling with is rownames(big_data) = paste0 etc...
CodePudding user response:
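Assuming coloc_results is of class data.frame (a matrix needs converting with as.data.frame() first, because $<- does not add a column to a matrix):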
#create list of files
files <- list.files(path = ".", pattern = "\\.tsv$")
#create list to bind results to
datalist = list()
#loop through files
for(i in seq_along(files)) {
#read table
other_trait <- read.table(files[i])
#desired analysis
coloc_res <- coloc::coloc.abf(dataset1 = other_trait, dataset2 = dataset2,p12 = 1e-5)
coloc_results <- matrix(ncol=6,nrow=1,0)
coloc_results[1,] <- coloc_res$summary
#write results of analysis to individual file
write.csv(coloc_results, paste0("processed_", basename(files[i])))
#convert the matrix to a data.frame, then add a column recording the input file
coloc_results <- as.data.frame(coloc_results)
coloc_results$inputfile <- basename(files[i])
#add results of analysis to list
datalist[[i]] = coloc_results
}
#merge list to one data.frame
big_data <- do.call(rbind, datalist)
#tidy column names
colnames(big_data) <- c("n_snps","H0","H1","H2","H3","H4", "inputfile")
#write to csv
write.csv(big_data, "results.csv")
Note that the do.call(rbind, datalist) is now outside of the for loop. So first all items are added to the list, then the entire list is converted to one big data.frame. In your original code, you were overwriting results.csv in every iteration.
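If row names are preferred over an extra column, as the question asked, a minimal sketch could look like the following. The file names and numeric values are made up for illustration and stand in for the real coloc_res$summary output. The key point is that rownames() is assigned once, after the loop, with one name per row. Inside the loop, paste0(basename(files[i])) supplies a single name for a matrix that already has i rows, which R rejects as soon as i exceeds 1.

```r
# Hypothetical input file names (assumption, not taken from the question)
files <- c("trait_a.tsv", "trait_b.tsv", "trait_c.tsv")

# Pre-allocate one list slot per input file
datalist <- vector("list", length(files))

for (i in seq_along(files)) {
  # Placeholder values standing in for coloc_res$summary (6 numbers per file)
  fake_summary <- c(100 + i, 0.1, 0.2, 0.3, 0.3, 0.1)
  datalist[[i]] <- matrix(fake_summary, nrow = 1)
}

# Combine once, after the loop, then label columns and rows
big_data <- do.call(rbind, datalist)
colnames(big_data) <- c("n_snps", "H0", "H1", "H2", "H3", "H4")
rownames(big_data) <- basename(files)  # one name per row, in loop order
```

Calling write.csv(big_data, "results.csv") afterwards writes these row names as the first column, since write.csv includes row names by default.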