Is there a way to reference files in a folder within the working directory in R?

Time:02-19

I have finished my RMarkdown document and I'm trying to clean up the workspace a little. This isn't strictly necessary; it's more of an organizational habit (and I'm not even sure it's a good practice) so that I can keep the data separate from the scripts and other R and git related files.

I have a bunch of .csv files for data that I used. Previously they were on (for example)

C:/Users/Documents/Project

which is what I set as my working directory. But now I want them in

C:/Users/Documents/Project/Data

The only problem is that this breaks the following code, because the files are no longer in the working directory.

#create one big dataframe by unioning all the data
bigfile <- vroom(list.files(pattern = "*.csv"))

I've tried passing the full path of the folder containing the csvs to list.files(), but no luck.

bigfile <- vroom(list.files(path = "C:/Users/Documents/Project/Data", pattern = "*.csv"))
Error: 'data1.csv' does not exist in current working directory ('C:/Users/Documents/Project').

Is there a way to only access the /Data folder once for creating my dataframe with vroom() instead of changing the working directory multiple times?

CodePudding user response:

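You can list files, including those in all subdirectories (Data in particular), using list.files(pattern = "*.csv", recursive = TRUE). The returned paths are relative to the working directory (for example Data/data1.csv), so vroom() can find the files without changing the working directory.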
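To make the behaviour of recursive = TRUE concrete, here is a minimal base R sketch. It assumes a throwaway folder under tempdir() (the folder and file names are made up for illustration) so that it does not touch a real project. It also uses the regular expression "\\.csv$" as the pattern, since pattern is interpreted as a regex rather than a glob.

```r
# Hypothetical demo folder; nothing here touches the real project directory.
project <- file.path(tempdir(), "project_recursive_demo")
dir.create(file.path(project, "Data"), recursive = TRUE, showWarnings = FALSE)

# Two small csv files inside the Data subfolder.
write.csv(data.frame(x = 1:2), file.path(project, "Data", "data1.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(project, "Data", "data2.csv"), row.names = FALSE)

# recursive = TRUE descends into subfolders; the results are paths
# relative to `path`, so the "Data/" prefix is included.
rel_paths <- list.files(path = project, pattern = "\\.csv$", recursive = TRUE)
print(rel_paths)
```

Note that a recursive search picks up csv files in every subfolder, not only Data, so it fits best when the project holds no other csv files.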

Best practices

  • Have one directory of raw and only raw data (the stuff you measured)
  • Have another directory of external data (e.g. reference databases). This is something you can remove afterwards and re-download if required.
  • Have another directory for the source code
  • Put only the source code directory under version control, plus one other file containing checksums of the raw and external data to prove integrity
  • Everything else must be reproducible from the raw data and the source code, so it can be removed after the project. You may want to keep small result files (e.g. tables) that take a long time to reproduce.
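The checksum file mentioned above could be produced along these lines. This is only a sketch: it uses tools::md5sum() from the tools package that ships with R, and the folder layout and the manifest name checksums.csv are assumptions made for the example, not part of the answer.

```r
# Hypothetical raw-data folder under tempdir(), created just for this sketch.
raw_dir <- file.path(tempdir(), "checksum_demo", "raw")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
writeLines(c("id,value", "1,10"), file.path(raw_dir, "data1.csv"))

# Record one MD5 checksum per raw file in a small manifest.
raw_files <- list.files(raw_dir, full.names = TRUE)
checksums <- data.frame(
  file = basename(raw_files),
  md5 = unname(tools::md5sum(raw_files)),
  stringsAsFactors = FALSE
)
manifest <- file.path(dirname(raw_dir), "checksums.csv")  # assumed file name
write.csv(checksums, manifest, row.names = FALSE)

# Later: recompute the checksums and compare them with the recorded ones.
recorded <- read.csv(manifest, colClasses = "character")
current <- unname(tools::md5sum(file.path(raw_dir, recorded$file)))
all_intact <- all(current == recorded$md5)
print(all_intact)
```

Only the manifest would need to go under version control; the data files themselves stay outside the repository.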

CodePudding user response:

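You can list the files and capture the full file paths, right?

bigfile <- vroom(list.files(path = "C:/Users/Documents/Project/Data", pattern = "*.csv", full.names = TRUE))

Without full.names = TRUE, list.files() returns bare file names such as data1.csv, which vroom() then looks for in the working directory. With it, each name carries its directory, so the files are read without reference to your wd.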
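The difference that full.names makes can be checked with a short base R sketch. The folder under tempdir() and the file contents are invented for the example, and read.csv() with rbind() stands in for vroom() only so that the snippet runs without extra packages.

```r
# Hypothetical Data folder under tempdir(), used only for this sketch.
data_dir <- file.path(tempdir(), "full_names_demo", "Data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(data_dir, "data1.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(data_dir, "data2.csv"), row.names = FALSE)

# Default: bare file names, only usable if data_dir is the working directory.
bare_names <- list.files(path = data_dir, pattern = "\\.csv$")

# full.names = TRUE: the directory is prepended, so the paths work from anywhere.
full_paths <- list.files(path = data_dir, pattern = "\\.csv$", full.names = TRUE)

# Base R stand-in for vroom(full_paths): read each file and stack the rows.
bigfile <- do.call(rbind, lapply(full_paths, read.csv))
print(nrow(bigfile))
```

With vroom installed, the last step would simply be vroom(full_paths), as in the answer.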

CodePudding user response:

Try one of these:

# list all csv files within Data within current directory
Sys.glob("Data/*.csv")

# list all csv files within immediate subdirectories of current directory
Sys.glob("*/*.csv")

If the folders contain only csv files, the following would also work, but they seem less desirable. They can still be useful for quickly reviewing which files and directories exist. (Be very careful about using the second one in statements that delete files: if you are not in the directory you think you are in, you can wind up deleting files you did not intend to delete. The first one carries the same risk but is a bit safer, since it can only delete the wrong files if the directory you are in happens to have a Data subdirectory.)

# list all files & directories within Data within current directory
Sys.glob("Data/*")

# list all files & directories within immediate subdirectories of current directory
Sys.glob("*/*")
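As a hedged illustration of these patterns, the sketch below runs Sys.glob() against a throwaway folder under tempdir() (all names are hypothetical). It builds the pattern with file.path() so that the result does not depend on the current working directory, which also sidesteps the wrong-directory risk described above.

```r
# Hypothetical project folder with two csv files and one other file in Data.
project <- file.path(tempdir(), "glob_demo")
dir.create(file.path(project, "Data"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1), file.path(project, "Data", "data1.csv"), row.names = FALSE)
write.csv(data.frame(x = 2), file.path(project, "Data", "data2.csv"), row.names = FALSE)
writeLines("notes", file.path(project, "Data", "readme.txt"))

# "*.csv" matches only the csv files; "*" matches everything in Data.
csv_files <- Sys.glob(file.path(project, "Data", "*.csv"))
all_files <- Sys.glob(file.path(project, "Data", "*"))

# Preview the matches before passing them to anything destructive.
print(basename(csv_files))
print(basename(all_files))
```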