I have been trying the new Quarto tool by RStudio, and it seems I cannot do something that was possible with the older R Notebook or R Markdown alternatives.
I want to organise my workflow across several Quarto documents (.qmd) and generate .html documents to share with people who are not R users. For example, suppose I have the following 4 documents:
- 01_DataProcessing.qmd
- 02_StatisticalAnalysis.qmd
- 03_Plots.qmd
- 04_Reports.qmd
In 01_DataProcessing.qmd I clean and organise all the data used in the other three files. The other three files therefore depend on 01_DataProcessing.qmd: it has to be run first so that the other files can run. Within RStudio this works perfectly because all data in the environment is accessible to all .qmd files.
However, to generate the .html files I have to "render" the .qmd files, and this is where I run into problems. Rendering seems to ignore all variables in the global environment (and all loaded libraries), so it stops with an error (Execution halted). This means I can only work with stand-alone documents that contain all the code, which might be problematic for large workflows.
Am I missing something? Do I have to change some of my settings? Is there any workaround? How can I tell the .qmd to use all the data available in the global environment?
Please note that with the newest version of RStudio, changing to .rmd does not solve the problem, since the behaviour is the same as with .qmd documents. Also, please note that this was not the case in the past.
Edit
To give a reproducible example, suppose that in the first qmd file, 01_DataProcessing.qmd, I have created a data.frame TestData, and I want to use TestData in the 03_Plots.qmd file.
01_DataProcessing.qmd
---
title: "01_DataProcessing"
format: html
editor: visual
---
```{r}
library(tidyverse)
library(magrittr)
TestData <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7),
  y = c(1, 2, 3, 4, 5, 6, 7)
)
```
03_Plots.qmd
---
title: "03_Plots"
format: html
editor: visual
---
If you try to render the file the execution is halted because the object TestData is not found.
```{r}
plot(TestData$x, TestData$y)
```
CodePudding user response:
One approach is to save the R objects from the first qmd file in an .Rdata file and load that .Rdata file at the very beginning of the second qmd document. In addition, you can keep all the library calls in an R script and source that script at the beginning of each qmd file.
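To illustrate why this works, here is a minimal sketch of the save/load round trip in base R. The temporary file path and the `fresh_env` environment are only stand-ins for illustration: `fresh_env` plays the role of the empty workspace that a render starts with.

```r
# Sketch: objects written with save() can be restored with load() in a
# workspace that does not contain them, which is the situation a render is in.
rdata_path <- file.path(tempdir(), "data_process.Rdata")  # temporary location, illustration only

TestData <- data.frame(x = 1:7, y = 1:7)
save(TestData, file = rdata_path)
rm(TestData)  # the object is gone, as it would be in a fresh render

# A new environment stands in for the empty workspace of a new R session.
fresh_env <- new.env()
restored_names <- load(rdata_path, envir = fresh_env)

restored_names            # names of the objects that load() brought back
head(fresh_env$TestData)  # the data.frame is available again
```

In the actual documents, a plain `load("data_process.Rdata")` is enough, since each render already starts from an empty workspace.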
Now, assuming that all of these related files are in the same directory (i.e. folder):
globals.R
The globals.R file contains all the necessary library calls and may also contain common R objects (vectors, data.frames, etc.) and common R options that we want to use in every qmd file. We source this R script at the beginning of each qmd file.
library(tidyverse)
library(magrittr)
01_DataProcessing.qmd
We save the TestData1 and TestData2 data.frame objects in the data_process.Rdata file so that we can use them in the later files.
---
title: "01_DataProcessing"
format: html
---
```{r}
#| label: setup-globals
#| include: false
source("globals.R")
```
```{r}
TestData1 <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7),
  y = c(1, 2, 3, 4, 5, 6, 7)
)
TestData2 <- data.frame(
  x = 6:10,
  y = 6:10
)
```
```{r}
#| include: false
save(TestData1, TestData2, file = "data_process.Rdata")
```
03_Plots.qmd
In this file, we load the data_process.Rdata file, which makes the TestData1 and TestData2 data.frame objects available.
---
title: "03_Plots"
format: html
---
```{r}
#| label: setup
#| include: false
source("globals.R")
load("data_process.Rdata")
```
```{r}
plot(TestData1$x, TestData1$y)
plot(TestData2$x, TestData2$y)
```
Now we can render these files in order (01_DataProcessing.qmd first, so that data_process.Rdata exists before 03_Plots.qmd loads it) and everything works as intended (hopefully!).
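If rendering by hand in the right order becomes tedious, a small driver script can do it. The following is only a sketch: the function name `render_in_order` and the idea of a separate script next to the .qmd files are my own assumptions, and it relies on the {quarto} R package (which provides `quarto_render()`) being installed alongside the Quarto CLI.

```r
# Hypothetical driver script, assumed to live in the same folder as the .qmd files.
# The order matters: 01 writes data_process.Rdata, which the later files load.
qmd_files <- c(
  "01_DataProcessing.qmd",
  "02_StatisticalAnalysis.qmd",
  "03_Plots.qmd",
  "04_Reports.qmd"
)

# Render each file in turn, stopping early if one is missing.
# `render` defaults to quarto::quarto_render and is only looked up when called.
render_in_order <- function(files, render = quarto::quarto_render) {
  for (f in files) {
    if (!file.exists(f)) {
      stop("Cannot render, file not found: ", f)
    }
    message("Rendering ", f)
    render(f)
  }
  invisible(files)
}

# Uncomment to render the whole workflow:
# render_in_order(qmd_files)
```

Because an error in one document stops the loop, a failure in 01_DataProcessing.qmd prevents the later files from being rendered against a stale or missing .Rdata file.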
Some more suggestions:
- If we set up an RStudio project containing these files, we can use a project-specific .Rprofile instead of a global R file, as done in this answer on SO, which saves us from sourcing that R script in each qmd file.
- If it is only the data.frames that we want to pass between sequential files, we can also write them as CSV files to a dedicated data directory inside the project directory and read them in the later files (many recommend the {here} package in such situations).
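The CSV suggestion could look roughly like the sketch below. To keep it runnable anywhere, a temporary directory stands in for the project's data directory; inside a real project one would more likely build the path with something like `here::here("data", "TestData1.csv")`. The file and object names are assumptions for illustration.

```r
# Sketch of passing a data.frame between documents as a CSV file.
# A temporary directory stands in for the project's data directory.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
csv_path <- file.path(data_dir, "TestData1.csv")

# In 01_DataProcessing.qmd: write the cleaned data.
TestData1 <- data.frame(x = 1:7, y = 1:7)
write.csv(TestData1, csv_path, row.names = FALSE)

# In 03_Plots.qmd: read it back before plotting.
TestData1_from_csv <- read.csv(csv_path)
plot(TestData1_from_csv$x, TestData1_from_csv$y)
```

Keep in mind that a CSV file stores only the values, so column types such as factors or dates have to be restored after reading, whereas an .Rdata file keeps the objects exactly as they were.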