Iteration/loop through imported data in R

The script below works for a single data set: it applies a baseline correction and plots the corrected data.


library(baseline)

setwd("C:/Users/o/OneDrive/Desktop")

importData = read.delim("OSJH103h.txt", header=FALSE)
matrixData = as.matrix(importData)
swappedColRow = t(matrixData)
row.names(swappedColRow) = c(1,2)
removedColumn = swappedColRow[-c(1),]
matrixRemovedCol = as.matrix(removedColumn)
swappedMatrix = t(matrixRemovedCol)

bc.irls = baseline(swappedMatrix, lambda=2, hwi=100, it=10, int=2000, method = 'fillPeaks')
mf = getCorrected(bc.irls)
mf2d=data.frame(ys=mf[1,], xs=importData$V1)
par(mfrow=c(1,1))
plot(x=mf2d$xs, y=smooth(mf2d$ys), col=2, type="l")

How can I import multiple data files, loop through them, and remove the baseline from each data set?

I have outlined a method for importing all the .txt files in a given directory.

temp = list.files(pattern="\\.txt$")
myfiles = lapply(temp, read.delim, header=FALSE)

The files are imported as list elements [[1]], [[2]], [[3]], and so on, so replacing 'importData' with myfiles[[2]] in the script above works the same way for that file.
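As a small illustration of that list structure, the sketch below names each element after its file so it can be reached by name as well as by position. The folder `demo_dir` and the two toy files are made up for the example and are not part of the original data.

```r
# Write two small tab-delimited files into a temporary folder (made-up values).
demo_dir <- file.path(tempdir(), "blc_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (nm in c("a.txt", "b.txt")) {
  write.table(data.frame(1:3, c(5, 7, 6)), file.path(demo_dir, nm),
              sep = "\t", row.names = FALSE, col.names = FALSE)
}

# Import every .txt file, then label each list element with its file name.
temp <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
myfiles <- lapply(temp, read.delim, header = FALSE)
names(myfiles) <- basename(temp)

names(myfiles)
head(myfiles[["b.txt"]])
```

Naming the elements is optional, but it makes it easier to tell which data set a later result belongs to.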

I am looking for a way to import about 10 to 15 data sets at a time and remove the baseline from each, then ideally export the corrected data to separate .txt files.

I hope this makes sense. Any help would be appreciated.

CodePudding user response：

Perhaps this:

library(baseline)
temp = list.files(pattern="\\.txt$")
reproc_base <- function(temp) {
    importData = lapply(temp, read.delim, header=FALSE)
    matrixData = lapply(importData,as.matrix)
    swappedColRow = lapply(matrixData, t)
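    # set the row names on each matrix and return the matrix itself
    swappedColRow = lapply(swappedColRow, function(x) { row.names(x) <- c(1,2); x })
# lapply(myList, function(x) { x["ID"] <- NULL; x }) SOF?12664430
    # drop the first row (the x values); assigning NULL to a matrix row is an error
    removedColumn = lapply(swappedColRow, function(x) x[-1, ])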
    matrixRemovedCol = lapply(removedColumn, as.matrix)
    swappedMatrix = lapply(matrixRemovedCol, t)
    bc.irls = lapply(swappedMatrix, baseline, lambda=2, hwi=100, it=10, int=2000, method = 'fillPeaks')
    mf = lapply(bc.irls, getCorrected)
    return(mf)
}

# while stepping through with debugonce(reproc_base), you'll probably want just 1 file
debugonce(reproc_base)
test_mf <- reproc_base(temp[1])

The two steps I was least sure about are setting the row names and dropping the first row inside lapply. Step through the function with debugonce(reproc_base) or debug(reproc_base) and see where it breaks. The anonymous-function pattern comes from SOF 12664430.
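To make the reshaping steps easier to check, here is a minimal sketch on a toy two-column data frame (the values are made up). It suggests that the transpose, drop-row, transpose sequence ends up as a 1 x n matrix holding only the second column, which can also be built directly; `direct` is a name introduced here for illustration.

```r
# Toy two-column input standing in for one imported file (values are made up).
importData <- data.frame(V1 = c(400, 401, 402, 403),
                         V2 = c(10.2, 11.5, 30.1, 12.0))

# The step-by-step route used in the question.
swappedColRow <- t(as.matrix(importData))
removedColumn <- swappedColRow[-1, ]          # drops to a plain numeric vector
swappedMatrix <- t(as.matrix(removedColumn))  # back to a 1 x n matrix

# A shorter route to the same 1 x n layout.
direct <- matrix(importData$V2, nrow = 1)

dim(swappedMatrix)
all.equal(swappedMatrix, direct, check.attributes = FALSE)
```

If the two agree on real data as well, the shorter form could replace the five reshaping lines inside the function.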

CodePudding user response：

My solution, in case anyone is interested:

library(baseline)

temp = list.files(pattern="\\.txt$", full.names = TRUE)
myfiles = lapply(temp, read.delim, header=FALSE)


for (i in seq_along(temp)){
  matrixData = as.matrix(myfiles[[i]])
  swappedColRow = t(matrixData)
  row.names(swappedColRow) = c(1,2)
  removedColumn = swappedColRow[-c(1),]
  matrixRemovedCol = as.matrix(removedColumn)
  swappedMatrix = t(matrixRemovedCol)
  
  bc.irls = baseline(swappedMatrix, lambda=2, hwi=100, it=10, int=2000, method = 'fillPeaks')
  # plot(bc.irls)
  mf = getCorrected(bc.irls)
  mf2d=data.frame(xs=myfiles[[i]]$V1, ys=mf[1,])
  par(mfrow=c(1,1))
  plot(x=mf2d$xs, y=smooth(mf2d$ys), col=2, type="l")
  
  
  
  # Build the output name from the input file name, without folder or extension
  outName <- tools::file_path_sans_ext(basename(temp[i]))
  write.csv(mf2d, paste0(outName, " BLC.csv"), row.names = FALSE)
}
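The same loop can also be written as one function per file, which may be easier to test on a single data set before running all of them. This is only a sketch: `process_file`, `correct_fn`, `out_dir` and `fill_peaks` are hypothetical names introduced here, and the sketch assumes each file has exactly two columns (x in V1, y in V2) as in the code above.

```r
# Hypothetical helper: read one two-column file, correct it, write a CSV.
# `correct_fn` takes a 1 x n matrix and returns a 1 x n matrix.
process_file <- function(path, correct_fn, out_dir = dirname(path)) {
  dat <- read.delim(path, header = FALSE)
  spectrum <- matrix(dat$V2, nrow = 1)        # one spectrum per row
  corrected <- correct_fn(spectrum)
  out <- data.frame(xs = dat$V1, ys = corrected[1, ])
  out_name <- paste0(tools::file_path_sans_ext(basename(path)), " BLC.csv")
  out_path <- file.path(out_dir, out_name)
  write.csv(out, out_path, row.names = FALSE)
  invisible(out_path)                         # return where the result went
}

# The fillPeaks settings from this thread, wrapped so they can be swapped out.
# Needs the baseline package only when it is actually called.
fill_peaks <- function(m) {
  baseline::getCorrected(
    baseline::baseline(m, lambda = 2, hwi = 100, it = 10, int = 2000,
                       method = "fillPeaks")
  )
}

# Possible usage on a folder of files:
# files <- list.files(pattern = "\\.txt$", full.names = TRUE)
# out_paths <- vapply(files, process_file, character(1), correct_fn = fill_peaks)
```

Passing the correction step in as an argument keeps the file handling separate from the baseline settings, so a different method or parameter set can be tried without touching the reading and writing code.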