Rendering many connected quarto documents

Time:09-08

I have been trying the new Quarto tool by RStudio, and it seems I cannot do something that was possible with the older R Notebook and R Markdown alternatives.

I want to organise my workflow across several Quarto documents (.qmd) and generate .html documents to share with people who are not R users. For example, suppose I have the following four documents:

  • 01_DataProcessing.qmd
  • 02_StatisticalAnalysis.qmd
  • 03_Plots.qmd
  • 04_Reports.qmd

In 01_DataProcessing.qmd I clean and organise all the data used in the other three files. The other three files therefore depend on 01_DataProcessing.qmd: it has to be run first so that the other files can run. Within RStudio this works perfectly because all data in the environment is accessible to all .qmd files.

Nevertheless, when I want to generate the .html files I have to "render" the .qmd files, and this is where I run into problems. Rendering seems to ignore all variables in the global environment (and all loaded libraries), so it fails with an error (Execution halted). This means I can only work with stand-alone documents that contain all the code, which may be problematic for large workflows.

Am I missing something? Do I have to change some of my settings? Is there any workaround? How can I tell the .qmd to use all data available in the global environment?

Please note that with the newest version of RStudio, switching to .rmd does not solve the problem, since the behaviour is the same as with .qmd documents. Also, please note that this was not the case in the past.

Edit

To give a reproducible example, suppose that in the first file, 01_DataProcessing.qmd, I have created a data.frame TestData and I want to use TestData in the 03_Plots.qmd file:

01_DataProcessing.qmd

---
title: "01_DataProcessing"
format: html
editor: visual
---

```{r}
library(tidyverse)
library(magrittr)

TestData <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7),
  y = c(1 ,2 ,3, 4, 5, 6, 7))
```

03_Plots.qmd

---
title: "03_Plots"
format: html
editor: visual
---

If you try to render the file the execution is halted because the object TestData is not found. 

```{r}
plot(TestData$x, TestData$y)
```

CodePudding user response:

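Each render runs in a fresh R session that does not see the objects or loaded packages of the interactive RStudio environment, so anything a document needs has to be created or loaded inside that document. One approach is to save the R objects from the first .qmd file in an .Rdata file and load that file at the very beginning of the later .qmd documents. In addition, put all the library calls in an R script and source that script at the beginning of each .qmd file.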

Now, assuming that all of these related files are in the same directory (i.e. folder):


globals.R

The globals.R file contains all the necessary library calls and may also contain common R objects (vectors, data.frames, etc.) and common R options that we want in every .qmd file. We source this R script at the beginning of each .qmd file.

library(tidyverse)
library(magrittr)

01_DataProcessing.qmd

We save the TestData1 and TestData2 data.frame objects in the data_process.Rdata file so that we can use them in the later files.

---
title: "01_DataProcessing"
format: html
---

```{r}
#| label: setup-globals
#| include: false

source("globals.R")
```

```{r}
TestData1 <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7),
  y = c(1 ,2 ,3, 4, 5, 6, 7)
  )

TestData2 <- data.frame(
  x = 6:10,
  y = 6:10
  )
```

```{r}
#| include: false

save(TestData1, TestData2, file = "data_process.Rdata")
```

03_Plots.qmd

In this file we load data_process.Rdata, after which the TestData1 and TestData2 data.frame objects are available to use.

---
title: "03_Plots"
format: html
---

```{r}
#| label: setup
#| include: false

source("globals.R")
load("data_process.Rdata")

```

```{r}
plot(TestData1$x, TestData1$y)
plot(TestData2$x, TestData2$y)
```
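If 03_Plots.qmd is rendered before data_process.Rdata exists, load() fails with a fairly generic error. A minimal sketch of a more defensive loader follows; load_required() is a hypothetical helper (not part of the original answer) that could live in globals.R, and it relies only on base R's load(), which returns the names of the objects it restored.

```r
# Hypothetical helper (not from the original answer): load an .Rdata file and
# fail with a clear message if the file or an expected object is missing.
load_required <- function(file, required, envir = parent.frame()) {
  if (!file.exists(file)) {
    stop("'", file, "' not found; render 01_DataProcessing.qmd first.")
  }
  # load() restores the saved objects into `envir` and returns their names.
  loaded <- load(file, envir = envir)
  absent <- setdiff(required, loaded)
  if (length(absent) > 0) {
    stop("Missing from '", file, "': ", paste(absent, collapse = ", "))
  }
  invisible(loaded)
}
```

In the setup chunk of 03_Plots.qmd, the plain load() call could then be replaced with load_required("data_process.Rdata", c("TestData1", "TestData2")).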

Now we can render these files sequentially and everything works as intended (Hopefully!).
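Rendering "sequentially" can also be scripted so that the order is never wrong. The sketch below is one way to do it, not the only one: render_in_order() is a hypothetical helper, and it assumes the {quarto} R package and the Quarto CLI are installed so that quarto::quarto_render() is available. Any other function that takes a file path can be passed as render_fun instead.

```r
# Hypothetical helper: render a set of .qmd files one after another.
# `render_fun` defaults to quarto::quarto_render() (assumes the {quarto}
# package and the Quarto CLI are installed).
render_in_order <- function(files, render_fun = quarto::quarto_render) {
  not_found <- files[!file.exists(files)]
  if (length(not_found) > 0) {
    stop("File(s) not found: ", paste(not_found, collapse = ", "))
  }
  for (f in files) {
    message("Rendering ", f)
    # If render_fun signals an error, the loop stops and later files
    # are not rendered.
    render_fun(f)
  }
  invisible(files)
}

# File names from the example above; the order matters because 03_Plots.qmd
# loads the .Rdata file written by 01_DataProcessing.qmd.
qmd_files <- c("01_DataProcessing.qmd", "03_Plots.qmd")

# Usage (run from the directory that contains the files):
# render_in_order(qmd_files)
```

Keeping the file list in one place makes the dependency order explicit, which is the part that was implicit when everything ran in a single interactive session.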

Some more suggestions:

  • If we set up an RStudio project containing these files, we can use a project-specific .Rprofile instead of a globals.R file (as done in an answer on SO), which saves us from sourcing that R script in each .qmd file.

  • If it is only data.frames that we want to pass between sequential files, we can also write them as CSV files to a specific data directory inside the project directory and read them in the later files (many recommend the {here} package in such situations).

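The CSV suggestion can be sketched with base R only. The helper names write_shared() and read_shared() and the "data" folder name are assumptions made for illustration; with the {here} package, the data_dir argument could be set to here::here("data") so the path resolves from the project root regardless of which .qmd is being rendered.

```r
# Hypothetical helpers (names and the "data" folder are assumptions):
# pass a data.frame between .qmd files through a CSV file.
write_shared <- function(df, name, data_dir = "data") {
  # Create the data directory on first use; do nothing if it already exists.
  dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
  path <- file.path(data_dir, paste0(name, ".csv"))
  write.csv(df, path, row.names = FALSE)
  invisible(path)
}

read_shared <- function(name, data_dir = "data") {
  read.csv(file.path(data_dir, paste0(name, ".csv")))
}

# In 01_DataProcessing.qmd:
#   write_shared(TestData1, "TestData1")
# In 03_Plots.qmd:
#   TestData1 <- read_shared("TestData1")
```

Note that CSV stores only the values: column types are re-guessed on reading, so factors, dates, and similar classes would need to be restored by hand, whereas the .Rdata approach keeps objects exactly as they were saved.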