Loop over grouped id column in Rmarkdown and render PDF

Time:12-21

I have 2 columns in a dataset: id and text.

Multiple texts exist for the same id. My goal is to generate multiple PDF files (one for each ID) by looping through the id numbers. Each PDF should contain ALL the texts for that ID, formatted as a table with knitr::kable().

Here is a reproducible sample of the .Rmd file that I have:

---
title: "Loop over grouped IDs"
output:
  pdf_document:
    latex_engine: xelatex
params:
  id: i
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, include= FALSE)

library(tidyverse)

df <- tibble(
  text = c(
    "text one for id#1",
    "text two for id#1",
    "text one for id#12",
    "text one for id#13",
    "text two for id#13",
    "text three for id#13",
    "text one for id#15",
    "text two for id#15"
  ),
  id = c(1, 1, 12, 13, 13, 13, 15, 15)
)

df_id_filtered <- df %>% filter(id == params$id)
```

## Hello ID\#`r params$id`!

These are the collections of texts that belong to you

```{r, echo=FALSE, results='asis'}

texts <- df_id_filtered$text
table <- knitr::kable(texts, col.names = "text")
```

`r table`

I created an .R script for the loop code which is the following:

library(rmarkdown)
library(knitr)

# loop through the id rows in the filtered data frame and generate a pdf report for each ID with all the texts in the "text" column for that ID

for (i in seq_along(df_id_filtered)) {
    rmarkdown::render(input = "idText.Rmd",
                      params = list(id = i),
                      output_format = "pdf_document",
                      output_file = paste0("File", "_", "ID#", i, ".pdf"))
}

How is the loop linked to the params: id exactly? If I loop over the entire df and not the df_id_filtered then the texts for the same ID number will be in separate files.

Is seq_along() the right choice here? And what should be in the params = list()?

The code I have works, but it doesn't run for all the unique IDs (only for 2 of them).

Any help is greatly appreciated! Thanks!

CodePudding user response:

I think seq_along(df_id_filtered) is not the right choice here if you want to loop over all the IDs. df_id_filtered is a data frame, and seq_along() over a data frame iterates over its columns. Since your data has 2 columns, the loop runs only twice (with i = 1 and i = 2), which is why you get only 2 files.
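To see the difference, here is a minimal base R sketch. The data frame below is a stand-in with the same shape as the one in the question (the text values are shortened placeholders, and the names col_index and ids are just for illustration):

```r
# Stand-in data with the same two columns as in the question.
df <- data.frame(
  text = c("a", "b", "c", "d", "e", "f", "g", "h"),
  id   = c(1, 1, 12, 13, 13, 13, 15, 15)
)

# seq_along() on a data frame indexes its columns, not its rows or IDs.
col_index <- seq_along(df)

# unique() on the id column gives one value per distinct ID.
ids <- unique(df$id)

print(col_index)  # 1 2
print(ids)        # 1 12 13 15
```

So looping over col_index passes 1 and 2 as the id parameter, while looping over ids passes each real ID exactly once.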

You can try this instead:

library(rmarkdown)
library(knitr)

for (i in unique(df$id)) {
  rmarkdown::render(input = "idText.Rmd",
                    params = list(id = i),
                    output_format = "pdf_document",
                    output_file = paste0("File", "_", "ID#", i, ".pdf"))
}

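Here we loop over each unique id in the data and render one PDF for it. The value passed in params = list(id = i) overrides the default in the YAML header and is what the .Rmd sees as params$id, so the filter inside the document keeps all the texts for that id.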
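One thing to watch: if the loop runs from a separate .R script in a fresh session, df has to exist in that script too, because unique(df$id) is evaluated before any render happens. Below is a rough sketch of one way to set that up. The helper render_args_for() is a hypothetical name made up for this example, and the data frame is again a shortened stand-in. It only builds the argument lists, so you can inspect them before rendering anything:

```r
# Stand-in data; in practice, load or build the same data the .Rmd uses.
df <- data.frame(
  text = c("a", "b", "c", "d", "e", "f", "g", "h"),
  id   = c(1, 1, 12, 13, 13, 13, 15, 15)
)

# Hypothetical helper: collects the arguments for one rmarkdown::render() call.
render_args_for <- function(id, input = "idText.Rmd") {
  list(
    input         = input,
    params        = list(id = id),  # becomes params$id inside the .Rmd
    output_format = "pdf_document",
    output_file   = paste0("File", "_", "ID#", id, ".pdf")
  )
}

# One argument list per distinct ID.
all_args <- lapply(unique(df$id), render_args_for)

# To actually render (needs rmarkdown, a LaTeX installation, and idText.Rmd):
# for (args in all_args) do.call(rmarkdown::render, args)
```

Uncommenting the last line should produce one file per ID, assuming idText.Rmd sits in the working directory.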
