Try-it-together: Generate Reports and Presentation with Quarto

2024-10-22

Bella Ratmelia

Overview for today

  • Quarto for Academic Writing
    • Code Chunks
    • Citations and footnotes
    • Front Matter
    • Journal-specific templates (e.g. PLOS and ACM)
  • Quarto for presentation (RevealJS)

Quarto in the research cycle

Infographic by Kramer & Bosman

Literate Programming

Literate Programming, introduced by Donald Knuth in the 1980s, is a programming paradigm that emphasizes the intertwining of human-readable documentation and source code.

Essentially:

  • The program is written as a coherent narrative where code segments and explanations are woven together in a way that emphasizes understanding and readability

  • The code segments are ordered in a logical manner for the reader, rather than in the order required by the compiler.

  • The narrative format helps to bridge the gap between the code and the theoretical framework, ensuring that the computational steps are aligned with the objectives.

  • In Quarto, this is enabled through Code Chunks

Section 1: Code Chunks

What does it look like?

An example of a code chunk:

library(tidyverse)                               

diamonds %>% ggplot(aes(x = color, fill = cut)) + 
    geom_bar()

How does it all work?

Magic! (just kidding)

Image by Allison Horst (allisonhorst.com)

Anatomy of a code chunk

For R, the code chunks are executed with the help of the knitr package.

Each code chunk will have a list of cell options that looks like this if you use source view:

```{r}
#| label: fig-polar
#| echo: false
#| output: true
```

The complete list of code chunk options for knitr is in this documentation page, but the important ones are:

  • echo - Whether to display the source code in the rendered output (true/false)

  • output - Whether to display the output of the code (true/false)

  • label - Unique label for the code chunks - useful for cross-referencing!

  • output-location - Location of output relative to the code that generates it (more relevant for presentations)
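
As a minimal sketch of how these options combine, the chunk below hides its source code, shows its output, and carries a label and a caption. The label `fig-xy-scatter`, the caption text, and the data frame are made up for illustration.

```{r}
#| label: fig-xy-scatter
#| echo: false
#| output: true
#| fig-cap: "Scatter plot of y against x"

# A small illustrative data frame (hypothetical values)
df <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))

# The plot is the chunk's output; the code itself stays hidden (echo: false)
plot(df$x, df$y, xlab = "x", ylab = "y")
```

Because the label starts with `fig-` and the chunk has a caption, the figure can be cross-referenced elsewhere in the text as `@fig-xy-scatter`.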

Code Highlighting

  • Use highlight-style in the document's YAML header (it is a document-level option, not a cell option) to specify the code highlighting style by choosing from the supported themes, for example: a11y, arrow, atom-one, ayu, breeze, github, gruvbox

    • The themes listed here are “adaptive” and will automatically switch between dark and light modes based on the website’s theme.
  • Use code-line-numbers to highlight specific lines of code (this makes more sense for presentations, but you can also apply it to static documents)

```{r}
#| echo: true
#| output: false
#| code-line-numbers: "3,4"
#| code-overflow: wrap

library(tidyverse)

diamonds %>% ggplot(aes(x = price)) +
  geom_histogram(binwidth = 500, fill = "blue", color = "black") +
  labs(title = "Histogram of Diamond Prices", 
       x = "Price (USD)", 
       y = "Frequency")
```

Code Annotations

library(tidyverse) # <1>

diamonds %>% ggplot(aes(x = color, fill = cut)) + # <2>
    geom_bar() # <2>

  1. Load the tidyverse library
  2. Visualize the distribution of color and kind of cuts for each color

  • Code blocks and executable code cells in Quarto can include line-based annotations to further explain the code and the flow of the logic to your readers.

  • Great for teaching / presentation!

Anatomy of a code annotation

Syntax (in visual editor):

Output:

  • Each annotated line must end with a comment using the language-specific comment character for the code cell, followed by a space and the annotation number enclosed in angle brackets (e.g., # <1>).

  • If the annotation covers multiple lines, the same annotation number can be repeated.

  • After the code cell, provide an ordered list that details the contents of each annotation. Each item in this list should correspond to the line(s) of code marked with the same annotation number.
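
A small sketch of the syntax described above, as it would appear in source view. The chunk label and variable names are made up for illustration.

```{r}
#| label: annotated-example

scores <- c(85, 92, 78, 95) # <1>
mean_score <- mean(scores)  # <2>
print(mean_score)           # <2>
```

1. Create a numeric vector of scores.
2. Calculate the average and print it (the same number marks both lines).

The ordered list has to come directly after the code cell so that each item is matched to the line(s) carrying the same number.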

Let’s try code chunks together! (10 minutes)

Amend the cell options (the `#|` lines at the top) of the following code chunks to fit their description. Refer to this documentation page for the available knitr cell options. Use the source editor for this exercise.

Chunk 1: Give the chunk below a label called “basic-chunk”

```{r}

# Create a simple data frame
df <- data.frame(
  x = 1:5,
  y = c(2, 4, 6, 8, 10)
)
print(df)
```

Chunk 2: make this chunk hide output and show code:

```{r}
#| label: hide-output

# Calculate correlation
cor_xy <- cor(df$x, df$y)
print(paste("The correlation between x and y is:", cor_xy))
```

Chunk 3: make this chunk show output, hide code:

```{r}
#| label: hide-code

# Calculate mean of y
mean_y <- mean(df$y)
print(paste("The mean of y is:", mean_y))
```

Chunk 4: Change figure width to 6, figure height to 4, and give it a caption.

```{r}
#| label: plot-options

plot(df$x, df$y, main="Scatter Plot", xlab="X", ylab="Y")
```

Chunk 5: Suppress warnings and messages

```{r}
#| label: suppress-warnings

library(dplyr)  # This usually prints a message

# This operation usually gives a warning
1:3 + 1:2
```

Let’s try code annotations together! (10 minutes)

Convert the comments inside the code chunks below into annotations. Feel free to amend a comment if you think it is not descriptive enough for the audience. Render the document to see how it looks.

Chunk 6:

```{r}
#| label: annotation-1

numbers <- c(10, 20, 30, 40, 50) # create a number vector
mean_value <- mean(numbers) #Calculate the average
print(mean_value)
```

Chunk 7:

```{r}
#| label: data-manipulation

library(dplyr) # load the dplyr library

# Create a sample dataset
df <- data.frame(
  name = c("Alice", "Bob", "Charlie", "David"),
  age = c(25, 30, 35, 28),
  score = c(85, 92, 78, 95)
)

result <- df %>%
  filter(age > 25) %>% # Keep only rows where age > 25
  mutate(grade = case_when(
    score >= 90 ~ "A",
    score >= 80 ~ "B",
    TRUE ~ "C"
  )) %>% # Add a new column 'grade' based on score
  arrange(desc(score)) # Sort by score in descending order

print(result)
```

Chunk 8:

```{r}
#| label: plot-annotation

library(ggplot2) # load the ggplot2 library

ggplot(df, aes(x = age, y = score)) + # set the x and y axis
  geom_point() + # Add scatter plot points
  geom_smooth(method = "lm", se = FALSE) + # Add a linear regression line
  labs(title = "Age vs Score", 
       x = "Age", 
       y = "Score") # Set plot labels
```

Chunk 9:

```{r}
#| label: annotation-2

seq_numbers <- seq(1, 10, by = 2) # Create a sequence from 1 to 10, step 2
print(seq_numbers) # print the sequence of number
```

Section 2: Citations and Bibliography

Citations in Quarto

By default, Quarto uses Pandoc to convert the in-text citations and generate the references in your document. You will need the following components:

  1. A Quarto document with in-text citations written in Pandoc's Markdown citation syntax, the same syntax used in R Markdown (more on this later).

  2. A bibliographic data file, e.g. a BibLaTeX (.bib) or BibTeX (.bibtex) file.

  3. A Citation Style Language (CSL) file which specifies the formatting to use when generating the citations and bibliography (when not using natbib or biblatex to generate the bibliography).

Bibliographic data source + CSL file

Both files have to be specified in the YAML header like so (in this example, the .bib file and the .csl file are located in the same folder as the .qmd document):

---
title: "Manuscript"
bibliography: references.bib
csl: nature.csl
---
  • references.bib is the bibliographic text file. If it does not exist yet, the visual editor creates it automatically the first time you insert a citation into your document.

  • nature.csl is the citation style file; in this example, it is the Nature citation style.
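
If you also want to cite R itself or the packages you used, one option (a sketch, not the only way) is to let knitr write those BibTeX entries with `knitr::write_bib()`. The file name `packages.bib` and the temporary folder are assumptions for illustration; in a real project you would write the file next to your .qmd document.

```{r}
# Assumes the knitr package is installed (it is needed to run R chunks anyway)
bib_file <- file.path(tempdir(), "packages.bib")  # hypothetical location

# Write a BibTeX entry for base R; knitr prefixes the citation key with "R-"
knitr::write_bib("base", file = bib_file)

# Inspect the first lines of the generated entry
cat(head(readLines(bib_file), 3), sep = "\n")
```

The entry can then be cited as `[@R-base]`, provided the file is also listed under `bibliography` in the YAML header (the key accepts a list of files).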

In-text citations

Common ones:

| **Syntax** | **Rendered result** |
|------------|---------------------|
| `@katz2021 mentioned that…` | Narrative citation: the author names become part of the sentence |
| `Katz et al. [-@katz2021] mentioned that…` | The `-` suppresses the author names, since you have already typed them yourself |
| `Software citation is good [@katz2021, pp. 33-35]` | Parenthetical citation with a page locator |
| `More researchers are saying that software citation is good [@katz2021; @park2019]` | Several sources in one citation, separated by semicolons |

The exact appearance of each citation depends on the CSL style in use.

  • Insert in-text citations by typing @, which will trigger a popup of items saved in your Zotero library.

  • Inserting citations and footnotes is generally easier in the Visual editor:

    • click on Insert > Citation, which will bring up a popup box where you can choose your citation source!

    • click on Insert > Footnotes to add footnotes

Citation Sources

Other than your Zotero library, here are the sources that you can retrieve from:

The References section

By default, Quarto will place the references section at the end of the document. You can also specify the placement by putting this section in your document (note that the example below is the source view on Quarto):

### References

::: {#refs}
:::

Which will print out the output below:

Let’s try citations together! (15 minutes)

On the next slide is a snapshot of a paragraph from the tidyverse homepage, with hard-coded in-text citations and reference section.

  1. Copy over the paragraph into your Quarto document.
  2. Replace the hard-coded in-text citations (marked in bold) with Quarto syntax. You can either search for the items using the DOIs provided, or add the items to your Zotero library.
  3. Set the Reference section to appear right after the paragraph, replacing the “References used” section.
  4. Change the reference style to use APA style (any edition will do). Download the CSL file from https://github.com/citation-style-language/styles.
  5. Add the URLs to the tidyverse packages mentioned in the paragraph (dplyr, tidyr, tibble, and readr) as footnotes.

Let’s try citations together! (15 minutes)

Paragraph:

There are a number of projects that are similar in scope to the tidyverse. The closest is perhaps Bioconductor (Gentleman et al. 2004; Huber et al. 2015), which provides an ecosystem of packages that support the analysis of high-throughput genomic data. The tidyverse has similar goals to R itself, but any comparison to the R Project (R Core Team 2019) is fundamentally challenging as the tidyverse is written in R, and relies on R for its infrastructure; there is no tidyverse without R! That said, the biggest difference is in priorities: base R is highly focused on stability, whereas the tidyverse will make breaking changes in the search for better interfaces. Another closely related project is data.table by Dowle and Srinivasan (2019), which provides tools roughly equivalent to the combination of dplyr, tidyr, tibble, and readr. data.table prioritises concision and performance.

References used:

Huber, W., V. J. Carey, R. Gentleman, S. Anders, M. Carlson, B. S. Carvalho, H. C. Bravo, et al. 2015. “Orchestrating High-Throughput Genomic Analysis with Bioconductor.” Nature Methods 12 (2): 115–21. https://www.nature.com/articles/nmeth.3252.

Gentleman, R.C., Carey, V.J., Bates, D.M. et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5, R80 (2004). https://doi.org/10.1186/gb-2004-5-10-r80

R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Dowle, Matt, and Arun Srinivasan. 2019. data.table: Extension of ’Data.frame‘. https://CRAN.R-project.org/package=data.table.

Section 3: Front Matter and Academic Templates

Academic Templates with Quarto Journals

  • Quarto provides extensions for manuscript writing that contain styles specific to several journals/publishers, such as PLOS, ACM, JOSS, Elsevier, and more.

  • These extensions provide rich YAML metadata specifically for academic writing (often referred to as “Front Matter” metadata).

  • Let’s dive into these Front Matter YAML metadata first before we explore the templates!

Front Matter

  • Scholarly articles demand extensive details in their front matter, beyond just a title and author.

  • Quarto offers a comprehensive range of YAML metadata keys to include these details.

  • This metadata covers specifying authors and their affiliations, abstract, keywords, copyright, licensing, and funding.

Below is a YAML header example:

---
title: "Library Carpentry: Best practices in organizing shelf space in the library"
date: 2024-07-01
author:
  - name: Bella Ratmelia
    id: br
    orcid: 0000-0003-4913-9508
    email: bellar@smu.edu.sg
    corresponding: true
    affiliation: 
      - name: Singapore Management University
        city: Singapore
        url: www.smu.edu.sg
  - name: Danping Dong
    id: dp
    orcid: 0000-0002-2229-6709
    email: dpdong@smu.edu.sg
    affiliation: 
      - name: Singapore Management University
        city: Singapore
        url: www.smu.edu.sg
abstract: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
keywords:
  - Library
  - Carpentry
license: "CC BY"
copyright: 
  holder: Bella Ratmelia
  year: 2024
citation: 
  container-title: Journal of Library Carpentry
  volume: 1
  issue: 1
  doi: 10.5555/12345678
funding: "The author received no specific funding for this work."
---

Article-related metadata options

These metadata include things like abstract, keywords, license, copyright, and funding information.

---
abstract: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
keywords:
  - Library
  - Carpentry
license: "CC BY"
copyright: 
  holder: Bella Ratmelia
  year: 2024
funding: "The author received no specific funding for this work."
---

Metadata for citable articles - web article

For articles published to the web, include author, date and citation url metadata. For example:

---
title: "Library Carpentry: Best practices in organizing shelf space in the library"
description: | 
  Best practices in organizing shelf space in the library
date: 2024-07-01
author:
  - name: Bella Ratmelia
    id: br
    orcid: 0000-0003-4913-9508
    email: bellar@smu.edu.sg
    corresponding: true
    affiliation: 
      - name: Singapore Management University
        city: Singapore
        url: www.smu.edu.sg
citation: 
  url: https://smu.edu.sg/library
bibliography: references.bib
---

Metadata for citable articles - journal article

For journal articles, additional metadata needs to be included, such as volume, issue, publisher, and page numbers, like so:

---
citation: 
  type: article-journal
  container-title: "Journal of Library Carpentry"
  volume: 1
  issue: 1
  doi: 10.5555/12345678
  url: https://example.com/summarizing-output
bibliography: references.bib
---  

Tip

The front matter metadata in Quarto is based on the schema from the Citation Style Language project (expressed as YAML instead of XML). See the complete list of options in this documentation page.

Front Matter - Rendering result in HTML

Info on licensing and citing

Let’s try Front Matter together! (10 minutes)

Copy and paste the following Front Matter template into your Quarto document:

---
title: "Library Carpentry: Best practices in organizing shelf space in the library"
date: 2024-07-01
author:
  - name: Bella Ratmelia
    id: br
    orcid: 0000-0003-4913-9508
    email: bellar@smu.edu.sg
    roles: "Shelf Blueprint"
    corresponding: true
    affiliation: 
      - name: Singapore Management University
        city: Singapore
        url: www.smu.edu.sg
  - name: Danping Dong
    id: dp
    orcid: 0000-0002-2229-6709
    email: dpdong@smu.edu.sg
    roles: "Materials Procurement"
    affiliation: 
      - name: Singapore Management University
        city: Singapore
        url: www.smu.edu.sg
abstract: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
keywords:
  - Library
  - Carpentry
license: "CC BY"
copyright: 
  holder: Bella Ratmelia
  year: 2024
citation: 
  container-title: Journal of Library Carpentry
  volume: 1
  issue: 1
  doi: 10.5555/12345678
funding: "The author received no specific funding for this work."
---
  1. Add another author to the list of authors above, and make this new author the corresponding author.
  2. Amend the information in the sample to include author roles (pick any!) using the CRediT taxonomy.
  3. Render the page to HTML. Does the output look different than what you expected?

Let’s try the PLOS template together! (5 minutes)

  1. Go to the list of Quarto extensions for scholarly articles: https://quarto.org/docs/extensions/listing-journals.html
  2. Let’s try creating a new document using the PLOS template. In the RStudio Terminal, type the following: quarto use template quarto-journals/plos
  3. Render the article. How does the rendered result look?
  4. Next, let’s take the document that we have been using for this session and render it with the PLOS template. Switch back to that document and type this in your Terminal: quarto render your-document-name.qmd --to plos-pdf (this only works if the PLOS extension is available in that document’s folder, for example after running quarto add quarto-journals/plos there)
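
The same render step can also be scripted from the R console. The sketch below simply calls the Quarto command-line tool through `system2()`. The file name is the same placeholder as above, and the guard conditions are an assumption added so that the snippet does nothing when Quarto or the file is missing.

```{r}
qmd_file <- "your-document-name.qmd"  # placeholder: replace with your own file
quarto_bin <- Sys.which("quarto")     # empty string if Quarto is not on the PATH

if (nzchar(quarto_bin) && file.exists(qmd_file)) {
  # Equivalent to typing: quarto render your-document-name.qmd --to plos-pdf
  system2(quarto_bin, c("render", qmd_file, "--to", "plos-pdf"))
} else {
  message("Quarto or ", qmd_file, " was not found; nothing was rendered.")
}
```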

Rendering to Docx

By default, Quarto will render document output to HTML. We can change it to render to Word by changing the YAML header like so:

---
title: "Library Carpentry: Best practices in organizing shelf space in the library"
format:
  docx:
    toc: true
    number-sections: true
    highlight-style: github
---

Note

Quarto can produce the .docx file without Microsoft Word, but you need Word (or a compatible application such as LibreOffice) to open and view the output.

Rendering to PDF

Similar to docx, you can change the render output to PDF by amending the YAML header like so:

---
title: "Library Carpentry: Best practices in organizing shelf space in the library"
format:
  pdf:
    toc: true
    number-sections: true
    colorlinks: true
    highlight-style: github
---

Note

Recent versions of Quarto have a built-in PDF compilation engine which, among other things, automatically installs any missing TeX packages (required for LaTeX rendering) once a TeX distribution such as TinyTeX is available.

If you encounter persistent errors when rendering to PDF, a workaround that I like to use is to render the document to an HTML page, and then “print” it as PDF.

Note

You can update or install TinyTeX in the RStudio Terminal with this command:

quarto install tinytex

Section 4: Presentation with RevealJS

Why RevealJS (and not ppt?)

  • Not a proprietary format - it is rendered as HTML slides which you can put on GitHub if you’d like to host it online.

  • Being open-source, Reveal.js is free to use, which eliminates licensing costs associated with PowerPoint.

  • Extensive customization options through HTML, CSS, and JavaScript, and the same source document can easily be rendered as an HTML page or PDF instead.

  • Presentations are HTML-based and can be accessed via any web browser without needing specific software.

  • Works across different operating systems and devices without compatibility issues.

  • Presentations can be designed to be responsive and accessible, ensuring they look good on any device or screen size.

  • Presentations can be hosted locally for offline access or online for easy sharing.

Rendering to Presentation

Similar to docx and PDF, we can change the render output format to revealjs through the YAML header like so:

---
title: "Habits"
author: "John Doe"
format: revealjs
---

Note

Fun Fact: The slides for this workshop were created with Quarto and RevealJS!

YAML header options for RevealJS presentations

The complete list of options is in this documentation page. Here are several that you may find useful:

  • incremental - controls whether to show all bullet points at once, or to reveal them one by one as you advance through a slide.

  • slide-number - controls whether to show slide numbers (will appear at the bottom right corner)

  • theme - Theme name, theme scss file, or a mix of both.

  • scrollable - controls whether to allow content that overflows slides vertically to scroll. This can also be set per-slide by including the .scrollable class on the slide title.

Let’s create a RevealJS presentation together!

Let’s explore the following:

  • Columns on slides
  • Scrollable slides
  • Incremental options
  • Code blocks options
  • Presenting slides
  • Preview Links
  • Slide appearance:
    • Slide sizes
    • Changing fonts
    • Slide numbers
    • Footers and logos
  • Publishing your slides to Quarto Pub

Sample contents for slides (in Source view)

## The Diamonds dataset

```{r}
#| label: load-library
#| lst-label: lst-loadlib
#| lst-cap: Load libraries
#| echo: true

library(tidyverse)
library(corrplot)
library(gtsummary)
```

The dataset, available through the `ggplot2` package, contains the prices and other attributes of 53,940 round cut diamonds. It includes various details such as the price, weight, cut quality, color, clarity, and dimensions of the diamonds. Below is a table detailing the variables included in the dataset:

| **Variable** | **Description**                                                                                   |
|--------------|----------------------------------------------------------|
| price        | Price in US dollars (\$326–\$18,823)                                                              |
| carat        | Weight of the diamond (0.2–5.01)                                                                  |
| cut          | Quality of the cut (Fair, Good, Very Good, Premium, Ideal)                                        |
| color        | Diamond color, from D (best) to J (worst)                                                         |
| clarity      | A measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best)) |
| x            | Length in mm (0–10.74)                                                                            |
| y            | Width in mm (0–58.9)                                                                              |
| z            | Depth in mm (0–31.8)                                                                              |
| depth        | Total depth percentage = z / mean(x, y) = 2 \* z / (x + y) (43–79)                                |
| table        | Width of top of diamond relative to widest point (43–95)                                          |

## Diamonds overview

```{r}
#| label: view-data

head(diamonds)
```

End of Session

Thank you for your active participation!

Please tell us one thing you liked about the course and one area of improvement here!