How to apply an R script to all of my data when my code uses 2 different file extensions

Time:10-29

I have created an R script that analyses and manipulates data frames read from files with 2 different extensions. For example, one task is to extract certain values from the data and export them as a .txt file. Here is the beginning of my script and the data files that I use:

setwd('C:\\Users\\Zack\\Documents\\RScripts\***')

heat_data="***.heat"
time ="***.timestamp"
ts_heat = read.table(heat_data)
ts_heat = ts_heat[-1,]
rownames(ts_heat) <- NULL
ts_time = read.table(time)
back_heat = subset(ts_heat, V3 == 'H')
back_time = ts_time$V1
library(data.table)
datDT[, newcol := fcoalesce(
nafill(fifelse(track == "H", back_time, NA_real_), type = "locf"),
0)]
last_heat = subset(ts_heat, V3 == 'H')
last_time = last_heat$newcol
x = back_time - last_heat
dataestimation = data.frame(back_time , x)
# write_tsv comes from the readr package
readr::write_tsv(dataestimation, file="dataestimation.txt")

I then use those 2 files to calculate and extract specific values. Can anyone please tell me how I can run this script on each and every .heat and .timestamp file? My objective is to calculate and extract these values for each file. Note that each data set consists of a .heat and a .timestamp file. Note also that I am a Windows user.

Thank you for your help

CodePudding user response:

You can use list.files

heat_data <- list.files(pattern = "\\.heat$")
time <- list.files(pattern = "\\.timestamp$")

and then process each file in a loop (or use lapply)

for (i in heat_data) {
    h <- read.table(i)
    # other code
}

for (j in time) {
    t <- read.table(j)
    # other code
}
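The lapply alternative mentioned above could look roughly like the sketch below. The folder demo_dir and the contents of the two small .heat files are made up for illustration only; with real data you would point list.files at your own folder.

```r
# Hypothetical demo setup: write two small .heat files into a temporary folder
demo_dir <- file.path(tempdir(), "heat_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.table(data.frame(a = 1:3, b = c("H", "C", "H")),
            file.path(demo_dir, "1.heat"), row.names = FALSE, col.names = FALSE)
write.table(data.frame(a = 4:5, b = c("C", "H")),
            file.path(demo_dir, "2.heat"), row.names = FALSE, col.names = FALSE)

# Collect the full paths of every file whose name ends in ".heat"
heat_files <- list.files(demo_dir, pattern = "\\.heat$", full.names = TRUE)

# lapply returns one data frame per file; naming the list records the source file
heat_tables <- lapply(heat_files, read.table)
names(heat_tables) <- basename(heat_files)

# Number of rows read from each file
row_counts <- sapply(heat_tables, nrow)
```

Compared with the for loop, the list keeps every table available afterwards, which helps when later steps need more than one file at a time.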

You may want to pass the path to list.files as well instead of using setwd (with full.names = TRUE, so that the returned names include the folder):

heat_data <- list.files("your/path/", pattern = "\\.heat$", full.names = TRUE)
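Two details of list.files are easy to miss: the pattern argument is a regular expression, and only bare file names are returned unless full.names = TRUE is set. The sketch below illustrates both on a temporary folder; demo_dir and the file names in it are invented for the demonstration.

```r
# Hypothetical demo folder with three empty files
demo_dir <- file.path(tempdir(), "pattern_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir,
                                c("1.heat", "1.timestamp", "notes_heat.txt"))))

# An unescaped dot matches any character, so this also picks up notes_heat.txt
loose <- list.files(demo_dir, pattern = ".*.heat")

# Escaping the dot and anchoring with $ keeps only names ending in ".heat"
strict <- list.files(demo_dir, pattern = "\\.heat$")

# full.names = TRUE prepends the folder, so read.table() can find the file
# regardless of the current working directory
strict_full <- list.files(demo_dir, pattern = "\\.heat$", full.names = TRUE)
```

With full paths there is no need for setwd at all, which makes the script easier to move between machines.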

After the question was edited

Let's say you have 3 .heat files and 3 .timestamp files in your path named

1.heat
2.heat
3.heat
1.timestamp
2.timestamp
3.timestamp

so there is a correspondence between heat and timestamp (given by the file name).

You can read these files with

heat_data <- list.files("your/path/", pattern = "\\.heat$", full.names = TRUE)
time <- list.files("your/path/", pattern = "\\.timestamp$", full.names = TRUE)
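Because the pairing relies entirely on the file names, it may be worth checking that every .heat file really has a .timestamp partner before processing anything. The following is only a sketch under that assumption; demo_dir, heat_files, time_files and pairs are illustrative names, and the empty demo files stand in for real data.

```r
# Hypothetical demo folder: three heat/timestamp pairs, created as empty files
demo_dir <- file.path(tempdir(), "pair_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir,
  c("1.heat", "2.heat", "10.heat",
    "1.timestamp", "2.timestamp", "10.timestamp"))))

heat_files <- list.files(demo_dir, pattern = "\\.heat$", full.names = TRUE)
time_files <- list.files(demo_dir, pattern = "\\.timestamp$", full.names = TRUE)

# Strip folder and extension to get the shared stem ("1", "2", "10")
heat_stem <- tools::file_path_sans_ext(basename(heat_files))
time_stem <- tools::file_path_sans_ext(basename(time_files))

# Stop early if a .heat file has no matching .timestamp file, or vice versa
stopifnot(setequal(heat_stem, time_stem))

# Reorder the timestamp files so that position i matches heat file i
time_files <- time_files[match(heat_stem, time_stem)]
pairs <- data.frame(stem = heat_stem, heat = heat_files, timestamp = time_files)
```

Matching on the stem rather than on position means a missing or extra file produces an error instead of silently pairing the wrong files.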

At this point, create a function that does exactly what you want. This function takes as input only an index and the two vectors of file paths

# process_pair is an arbitrary name chosen for this example
process_pair <- function (i, heat_data, time) {
   ts_heat <- read.table (heat_data[i])
   ts_time <- read.table (time[i])
   
   #
   # other code (builds the dataestimation data frame)
   #
   
   # write_tsv comes from the readr package
   readr::write_tsv(dataestimation, file = paste0 ("dataestimation_", i, ".txt"))
}

This way you will have files named dataestimation_1.txt, dataestimation_2.txt and dataestimation_3.txt.

Finally use lapply to call the function for all files in the folder

lapply (seq_along (heat_data), process_pair, heat_data = heat_data, time = time)

This is just one of the possible ways to proceed.
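Another possible way, sketched here under stated assumptions, is to let Map walk the two path vectors in parallel instead of passing an index. The demo files, their contents, the function name process_pair and the placeholder analysis inside it are all invented for illustration. Base R's write.table is used in place of write_tsv so that the sketch runs without extra packages.

```r
# Hypothetical demo files: two heat/timestamp pairs with made-up contents
demo_dir <- file.path(tempdir(), "map_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (stem in c("1", "2")) {
  writeLines(c("10 x H", "20 y C", "30 z H"),
             file.path(demo_dir, paste0(stem, ".heat")))
  writeLines(c("100", "200", "300"),
             file.path(demo_dir, paste0(stem, ".timestamp")))
}

heat_files <- list.files(demo_dir, pattern = "\\.heat$", full.names = TRUE)
time_files <- list.files(demo_dir, pattern = "\\.timestamp$", full.names = TRUE)

# Placeholder analysis: keep the rows marked "H" and attach their timestamps
process_pair <- function(heat_path, time_path) {
  ts_heat <- read.table(heat_path)
  ts_time <- read.table(time_path)
  keep <- ts_heat$V3 == "H"
  result <- data.frame(value = ts_heat$V1[keep], time = ts_time$V1[keep])
  # Name the output after the input file, e.g. dataestimation_1.txt
  stem <- tools::file_path_sans_ext(basename(heat_path))
  out_path <- file.path(dirname(heat_path),
                        paste0("dataestimation_", stem, ".txt"))
  write.table(result, out_path, sep = "\t", quote = FALSE, row.names = FALSE)
  out_path
}

# Map calls process_pair once per pair and returns a list of output paths
out_files <- Map(process_pair, heat_files, time_files)
```

Naming each output after its input file, rather than after a loop index, keeps the results traceable if files are later added to or removed from the folder.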
