R Markdown documents in RStudio

Uwe Graichen · uwe.graichen@kl.ac.at

Overview

Using R Markdown documents, it is possible to apply the literate programming paradigm in scientific data analysis. For this, analysis code, links to data, results and explanatory text, which can also contain formulas and images, are combined in one document. In addition to the results, the logic and thought process of the data analysis can be described in the same document. In this post we describe how R Markdown documents can be created using RStudio, how they are structured and which essential components they contain.

R Markdown documents in RStudio – Fundamentals and elements of literate programming in data analysis

Properties and application of R Markdown files

R Markdown (file extension .Rmd) is the common document format used by RStudio. The Rmd format makes it possible to combine GNU R code, the results of calculations and visualisations, and explanatory text (including images and formulas) in a single document. R Markdown documents are therefore well suited for recording analysis steps, the reasoning behind them and the results of statistical analyses. In this way, the format supports the literate programming paradigm in data science.

The Rmd format provides the functionality of an analysis and authoring framework for data science. This means we can use R Markdown files in two different ways:

  • As an analysis notebook: the code is executed directly within the document and the results appear within the document, or
  • As a report generator: all code chunks of the document are executed and a high-quality output document is rendered. HTML, PDF, DOCX and many other output formats are available.

In this blog post we will introduce these two uses of R Markdown documents. We will also give a compact overview of the structure and elements of R Markdown documents. If you want to follow the examples in this blog post, you need a working installation of the RStudio software.

Create a new R Markdown file using RStudio

To create a new R Markdown file, we select File->New File->R Markdown in the RStudio menu bar.

Create a new R Markdown file

Subsequently, a pop-up window opens. Here you can enter the document title and the name of the author as well as make some basic configurations.

Open a new R Markdown file

After pressing the ‘OK’ button, a new R Markdown document is created. It contains a header in YAML format at the top with information about the title and author as well as document settings. This is followed by a simple template showing how to structure an Rmd document.

R Markdown documents as scientific data analysis notebooks

An R Markdown document can be used as a scientific data analysis notebook. Such a document can contain GNU R code cells, which can be executed; the results, including graphical outputs, appear directly below the code cell. The R Markdown sample document contains three GNU R code blocks. These are enclosed by ```{r} ... ``` and their background colour in RStudio is light grey. A code block can be executed individually by clicking the green triangle at its right edge. All code blocks in an R Markdown document can be executed using the pull-down menu Run->Run All.

After execution, the results of the code blocks appear directly below them within the document.

Generate reports from R Markdown documents

Using RStudio we can also create a formatted analysis report from an R Markdown file. We trigger the report creation by clicking the ‘Knit’ button. All code blocks of the document are then executed sequentially, the R Markdown elements contained in the document are rendered, and the result is converted into the desired target format. In the example, we create an HTML document.
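The same report can also be produced from the R console instead of with the ‘Knit’ button. The following is a minimal sketch, assuming that the rmarkdown package and Pandoc are installed; the file name and the document content are hypothetical examples and not part of the RStudio template.

```r
# Minimal sketch: render an R Markdown file to HTML from the R console.
# Assumes the 'rmarkdown' package and Pandoc are available; the document
# content and the file name are made up for illustration.

# Three backticks, built programmatically to keep this listing readable
fence <- strrep("`", 3)

# Write a small example document to a temporary file
rmd_file <- file.path(tempdir(), "example_report.Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "## Summary of the cars data set",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Render only if the required tools are installed
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # render() executes all code chunks and returns the output file path
  html_file <- rmarkdown::render(rmd_file, output_format = "html_document",
                                 quiet = TRUE)
  print(html_file)
} else {
  message("rmarkdown or Pandoc not found; skipping the rendering step.")
}
```

Calling the renderer from a script can be convenient when several reports have to be regenerated in one go.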

Structural elements of R Markdown documents

The R Markdown format provides a variety of elements that we can use to structure and annotate analysis documents. Below we give a brief overview of how Rmd documents are structured and which elements they can contain. For more detailed information, see the section Further information sources at the end of this post.

At the beginning of an R Markdown document you find a YAML metadata block. Document-wide settings such as the title, author, date and output format are specified in this block. The metadata is enclosed between a pair of three dashes ---, e.g.

---
title: "My first analysis script"
author: "Uwe Graichen"
date: "2022-10-13"
output:
  html_document: default
---

If we create an R Markdown document using RStudio, a YAML metadata block is automatically inserted at the beginning of the document.
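The metadata block can also be inspected from R, which helps to check how the YAML fields are interpreted. The following is a minimal sketch, assuming the rmarkdown package is installed; the temporary file is a hypothetical stand-in for a real document.

```r
# Minimal sketch: read the YAML metadata block of an Rmd file from R.
# Assumes the 'rmarkdown' package is installed; the temporary file only
# serves as an example document.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"My first analysis script\"",
  "author: \"Uwe Graichen\"",
  "date: \"2022-10-13\"",
  "output:",
  "  html_document: default",
  "---",
  "",
  "Some text."
), rmd_file)

if (requireNamespace("rmarkdown", quietly = TRUE)) {
  # yaml_front_matter() returns the metadata as a named list
  meta <- rmarkdown::yaml_front_matter(rmd_file)
  print(meta$title)
  print(names(meta$output))
}
```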

The YAML metadata block is followed by the text and code elements of the document. The following structuring elements can be used.

Heading

Using hierarchical headings, R Markdown documents can be structured into sections and subsections. A heading is written with one # per level at the beginning of a line. By convention, sections start at heading level 2, since level 1 is reserved for the document title. Examples of structuring headings are:

## Heading
### Heading
#### Heading
##### Heading
###### Heading

Emphasis

R Markdown offers various options for highlighting text passages, including bold, italic and strikethrough font. Short examples of the three alternatives are shown below.

Bold. For bold typesetting, the text passage has to be enclosed in a pair of double asterisks.

**rendered as bold text**

It is rendered to:

rendered as bold text

Italics. For italic typesetting, the text passage has to be enclosed in a pair of single asterisks.

*rendered as italicized text*

It is rendered to:

rendered as italicized text

Strikethrough. For strikethrough typesetting, the text passage has to be enclosed in a pair of double tildes.

~~Strike through this text.~~

It is rendered to:

Strike through this text.

Blockquotes

Blockquote elements enable longer quoted text passages to be highlighted as a separate paragraph. The text passage that is to be displayed as a block quote is preceded by a >.

> Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat. Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

It is rendered to:

Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid ex ea commodi consequat. Quis aute iure reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint obcaecat cupiditat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Lists

Two types of lists can be inserted into R Markdown documents, unordered and ordered. Lists can be structured by means of sub-lists. For this, the sub-list items must be indented by two spaces.

Unordered lists

You may use any of the following symbols to denote bullets for each list item.

* valid bullet
- valid bullet
+ valid bullet

Below is a short example of a nested unordered list.

+ Lorem ipsum dolor sit amet
+ Nulla volutpat aliquam velit
  - Phasellus iaculis neque
  - Purus sodales ultricies
+ Faucibus porta lacus fringilla vel

It is rendered to:

  • Lorem ipsum dolor sit amet
  • Nulla volutpat aliquam velit
    • Phasellus iaculis neque
    • Purus sodales ultricies
  • Faucibus porta lacus fringilla vel

Ordered lists

Here is an example of an ordered list:

1. Lorem ipsum dolor sit amet
2. Consectetur adipiscing elit
3. Integer molestie lorem at massa

It is rendered to:

  1. Lorem ipsum dolor sit amet
  2. Consectetur adipiscing elit
  3. Integer molestie lorem at massa

Mathematical notation

R Markdown documents can be supplemented with mathematical formulas. Two types of formulas are supported: inline formulas, which are inserted into the running text, and separated formulas, which are set on a line of their own. LaTeX notation is used for mathematical expressions and formulas.

Inline formulas. Inline formulas are enclosed in a pair of single $ signs.

This is an inline formula $8 + 8 = \frac{32}{2}$

It is rendered to:

This is an inline formula \(8 + 8 = \frac{32}{2}\)

Separated formulas. Separated formulas (centred on a separate line) are enclosed in a pair of $$.

This is a separated formula $$f(x) = \sqrt{x}$$

It is rendered to:

This is a separated formula $$f(x) = \sqrt{x}$$

R Code Chunks

GNU R code can be embedded in R Markdown documents using code chunks. The GNU R commands are executed and the results, which can be text output, tables or graphics, are rendered. Again, there are two ways to embed code in R Markdown documents: general code chunks and inline code.

General. General R code chunks can be used to render R output into documents or simply to display code for illustration. In the next simple example, we output a summary of a data set consisting of two variables.

```{r}
summary(cars)
```

The output of this code chunk is:

##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Inline R code. An example of inline code is: Two plus two equals `r 2 + 2`.

It is rendered to:

Two plus two equals 4.
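Inline code is particularly useful for values computed from the data, because the text then stays in sync with the analysis. As a small sketch based on the built-in cars data set used above, a sentence in the document could contain the inline expression `r round(mean(cars$speed), 1)`. The equivalent computation in plain R is shown below; the variable names are chosen for illustration only.

```r
# Value that an inline expression such as `r round(mean(cars$speed), 1)`
# would insert into the text (cars is a data set shipped with R)
mean_speed <- round(mean(cars$speed), 1)

# The rendered sentence, assembled here by hand for comparison
sentence <- paste0("The mean speed is ", mean_speed, " mph.")
print(sentence)
# [1] "The mean speed is 15.4 mph."
```

The value matches the mean reported for speed in the summary output of the general code chunk.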

Embedding figures

In R Markdown documents, images that are already available in digital format can be embedded. This is done with the following R Markdown syntax: the caption is given in square brackets, the path to the image in round brackets and options for image formatting in curly brackets.

![Example image](images/OverviewDecomp3.png){width="436"}

The result of the embedding can be seen below.

Example image

Embedding tables

It is also possible to insert table structures into R Markdown documents. A simple example of a table consisting of a header, two columns and two rows can be seen below.

Table Header  | Second Header
:------------ | :------------
Table Cell    | Cell 2
Cell 3        | Cell 4

This table is rendered to:

Table Header    Second Header
Table Cell      Cell 2
Cell 3          Cell 4
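Writing tables by hand becomes tedious for real data. A common alternative, sketched here under the assumption that the knitr package is installed, is to let knitr::kable() generate the table markup from a data frame inside a code chunk. The choice of the cars data set and of three rows is only for illustration.

```r
# Minimal sketch: generate a Markdown pipe table from a data frame.
# Assumes the 'knitr' package is installed.
if (requireNamespace("knitr", quietly = TRUE)) {
  # First three rows of the built-in cars data set
  tab <- knitr::kable(head(cars, 3), format = "pipe")
  print(tab)
}
```

When used in a code chunk of an R Markdown document, the generated table is rendered like a hand-written one.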

Further information sources

A wide range of information on the use of R Markdown documents is freely available on the web. Useful sources are: