How do I run the same code on all the files in a folder and save the output to a new file?

I have a directory with a folder that contains 20 files (rdp1.csv, rdp2.csv, …). I want to run the following code for each file in this folder:

(Keep in mind, I was doing this manually: I kept pointing df at each file name in turn. I want to change it so that it updates df and runs the code on every file in the folder whose name contains ".csv", rather than doing each one separately. 1) I think I need a loop for this to work. Any ideas?)

library(dplyr)  # provides filter() and count()

setwd("/Users/e/RDP OA and YA/")

df <- read.csv("YA RDP20.csv")

sh <- count(filter(df, key_resp_rdpslow.keys == "m" & key_resp_rdpslow.corr == 1))

This gives me a value for sh, which I then type into a new file. 2) Additionally, how can I have R write this output to a new, blank file with two columns: one column for the file name and another for the corresponding 'sh' value, with the loop appending each new value to these columns? I'm fairly new to loops in R, which is why I could use some help.

** Edited code:

all_csvs <- list.files("RDP OA and YA/", full.names = TRUE) |>
  stringr::str_subset("\\.csv$")

# lapply() keeps one list per file; pull() extracts each count as a plain number
shs <- lapply(all_csvs, \(x) {
   df <- read.csv(x)
   sh <- count(filter(df, key_resp_rdpslow.keys == "m" & key_resp_rdpslow.corr == 1))*2
   sf <- count(filter(df, key_resp_rdpslow.keys == "m" & key_resp_rdpslow.corr == 0))*2
   fh <- count(filter(df, key_resp_rdpfast.keys == "m" & key_resp_rdpfast.corr == 1))*2
   ff <- count(filter(df, key_resp_rdpfast.keys == "m" & key_resp_rdpfast.corr == 0))*2
   return(list(sh = pull(sh), sf = pull(sf), fh = pull(fh), ff = pull(ff)))
  } )

out <- data.frame(csv_file = all_csvs,
              slowhits = sapply(shs, \(x) x$sh),
              slowfa = sapply(shs, \(x) x$sf),
              fasthits = sapply(shs, \(x) x$fh),
              fastfa = sapply(shs, \(x) x$ff))

write.csv(out, "out.csv")

Thanks!

CodePudding user response:

Regarding 'runs the code through every file in the folder that contains .csv', here is a piece of code that might be helpful. You can list all the files in a folder and then keep only the ones that end in .csv using a regular expression. Then, using the apply family of functions, you can run the same piece of code on each of them:

library(tidyverse)

all_csvs <- 
  list.files("RDP OA and YA/", full.names = TRUE) |> # pipe operator 
  stringr::str_subset("\\.csv$")
  
shs <- lapply(all_csvs, \(x) {
  df <- read.csv(x)
  sh <- count(filter(df, key_resp_rdpslow.keys == "m" & key_resp_rdpslow.corr == 1))*2
  sf <- count(filter(df, key_resp_rdpslow.keys == "m" & key_resp_rdpslow.corr == 0))*2 
  return(list(sh = pull(sh), sf = pull(sf)))
})

out <- data.frame(csv_file = all_csvs,
                  slowhits = sapply(shs, \(x) x$sh),
                  slowfa = sapply(shs, \(x) x$sf))
> print(out)
                        csv_file slowhits slowfa
1         RDP OA and YA/rdp1.csv       78      6
2        RDP OA and YA/rdp11.csv       74     10
3        RDP OA and YA/rdp12.csv       84     14
4     RDP OA and YA/rdp16(2).csv       98     18
5     RDP OA and YA/rdp16(3).csv       42      6
6        RDP OA and YA/rdp16.csv       62     10
7     RDP OA and YA/rdp19(2).csv       94     16
8        RDP OA and YA/rdp19.csv      100    100
9     RDP OA and YA/rdp20(2).csv       90     20
10       RDP OA and YA/rdp20.csv       68     18
11       RDP OA and YA/rdp21.csv       94      8
12       RDP OA and YA/rdp23.csv       74     28
13    RDP OA and YA/rdp24(2).csv       94     22
14    RDP OA and YA/rdp24(3).csv       70      6
15       RDP OA and YA/rdp24.csv       88      8
16     RDP OA and YA/rdp3(2).csv       68     48
17        RDP OA and YA/rdp3.csv       80     10
18     RDP OA and YA/rdp4(2).csv       90     56
19        RDP OA and YA/rdp4.csv       56     10
20        RDP OA and YA/rdp7.csv       86      2
21    RDP OA and YA/YA RDP11.csv       96     12
22    RDP OA and YA/YA RDP20.csv       78     16
23    RDP OA and YA/YA RDP21.csv       98     26
24 RDP OA and YA/YA RDP23(2).csv       72      0
25    RDP OA and YA/YA RDP23.csv       74     14
26    RDP OA and YA/YA RDP24.csv       76     12
27  RDP OA and YA/YA RDP3(2).csv       84     18
28     RDP OA and YA/YA RDP3.csv       86     24
29     RDP OA and YA/YA RDP4.csv       84      4
30     RDP OA and YA/YA RDP7.csv       76     18

write.csv(out, "out.csv")

Edit: I also added the creation of a data frame, which you can then save. One column holds the CSV paths, the other the sh value.
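
For comparison, here is a minimal base-R sketch of the same idea that does not need the tidyverse. It is only an illustration under some assumptions: the folder `demo_dir`, the helper `count_hits`, the output path `out_path`, and the toy data are all hypothetical and exist only so the example runs on its own. It counts matching rows without the `*2` scaling used above, and it relies on the `pattern` argument of `list.files()` to select the .csv files.

```r
# Hypothetical demo folder with two toy files, so the sketch is self-contained.
demo_dir <- file.path(tempdir(), "rdp_demo")
dir.create(demo_dir, showWarnings = FALSE)

toy <- data.frame(
  key_resp_rdpslow.keys = c("m", "m", "z", "m"),
  key_resp_rdpslow.corr = c(1, 0, 1, 1)
)
write.csv(toy, file.path(demo_dir, "rdp1.csv"), row.names = FALSE)
write.csv(toy[1:2, ], file.path(demo_dir, "rdp2.csv"), row.names = FALSE)

# list.files() can filter by regular expression itself via `pattern`.
all_csvs <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: number of rows where the key was "m" and the response was correct.
count_hits <- function(path) {
  df <- read.csv(path)
  sum(df$key_resp_rdpslow.keys == "m" & df$key_resp_rdpslow.corr == 1, na.rm = TRUE)
}

# One row per file: the file name and its count.
out <- data.frame(
  name = basename(all_csvs),
  sh = vapply(all_csvs, count_hits, integer(1), USE.NAMES = FALSE)
)

# Write outside demo_dir so a rerun does not pick up the output as an input file.
out_path <- file.path(tempdir(), "rdp_out.csv")
write.csv(out, out_path, row.names = FALSE)
```

Using `row.names = FALSE` keeps the saved file to just the two columns; without it, `write.csv()` adds an extra unnamed column of row numbers.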