R Notebook HTML Format

Overview

This article describes the HTML notebook format. It is primarily intended for developers of front-end applications that use or embed R, and for other users who are interested in reading and writing documents in the R Notebook format.

R Notebooks are HTML documents with data written and encoded in such a way that:

The source .Rmd document can be recovered, and
Chunk outputs can be recovered.

To generate an R Notebook, you can use rmarkdown::render() and specify the html_notebook output format in your document’s YAML metadata. Documents rendered in this form will be generated with the .nb.html file extension, to indicate that they are HTML notebooks.
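As a minimal sketch of this workflow, the following writes a small document whose YAML metadata selects html_notebook and then renders it. The file name example-notebook.Rmd and the use of a temporary directory are assumptions made for illustration; the render step is skipped when rmarkdown or pandoc is not available.

```r
# Build the chunk fence programmatically to keep this example easy to copy.
fence <- strrep("`", 3)

# Hypothetical file name, written to a temporary directory.
rmd_path <- file.path(tempdir(), "example-notebook.Rmd")
writeLines(c(
  "---",
  "title: \"Example Notebook\"",
  "output: html_notebook",
  "---",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_path)

# Render only when rmarkdown and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd_path, quiet = TRUE)
  # The returned path should carry the .nb.html extension described above.
  print(basename(out))
}
```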

To ensure chunk outputs can be recovered, the elements of the R Markdown document are enclosed in HTML comments that carry additional information about the output. For example, chunk output might be serialized as:

<!-- rnb-chunk-begin -->
<!-- rnb-output-begin -->
<pre><code>Hello, World!</code></pre>
<!-- rnb-output-end -->
<!-- rnb-chunk-end -->

Because R Notebooks are just HTML documents, they can be opened and viewed in any web browser; in addition, hosting environments can be configured to recover and open the source .Rmd document, and also recover and display chunk outputs as appropriate.
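To illustrate how a hosting environment might locate these comments, here is a rough sketch in base R that scans the serialized example shown earlier and pairs up the output markers. The regular expression is an assumption about the comment layout (it tolerates extra text before the closing -->), not a description of how rmarkdown itself parses notebooks.

```r
# Lines copied from the serialized chunk output shown earlier.
html <- c(
  "<!-- rnb-chunk-begin -->",
  "<!-- rnb-output-begin -->",
  "<pre><code>Hello, World!</code></pre>",
  "<!-- rnb-output-end -->",
  "<!-- rnb-chunk-end -->"
)

# Assumed comment layout: <!-- rnb-LABEL-STATE [optional extra text] -->
pattern <- "^<!-- rnb-([a-z]+)-(begin|end)( .*)?-->$"
rows <- grep(pattern, html)

# One record per annotation comment: its row, label, and state.
annotations <- data.frame(
  row   = rows,
  label = sub(pattern, "\\1", html[rows]),
  state = sub(pattern, "\\2", html[rows]),
  stringsAsFactors = FALSE
)
print(annotations)

# Recover the lines enclosed by the first output-begin / output-end pair.
is_output <- annotations$label == "output"
begin <- annotations$row[is_output & annotations$state == "begin"][1]
end   <- annotations$row[is_output & annotations$state == "end"][1]
output_lines <- if (end - begin > 1) html[(begin + 1):(end - 1)] else character(0)
print(output_lines)
```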

Generating R Notebooks with Custom Output

It’s possible to render an HTML notebook with custom chunk outputs inserted in lieu of the result that would be generated by evaluating the associated R code. This can be useful for front-end editors that show the output of chunk execution inline, or for conversion programs from other notebook formats where output is already available from the source format. To facilitate this, one can provide a custom ‘output source’ to rmarkdown::render(). Let’s investigate with a simple example:

rmd_stub <- "notebook/r-notebook-stub.Rmd"
contents <- readLines(rmd_stub)
cat(contents, sep = "\n")

## ---
## title: "R Notebook Stub"
## output: html_notebook
## ---
## 
## ```{r chunk-one}
## print("Hello, World!")
## ```

Let’s try rendering this document with a custom output source, so that we can inject custom output for the single chunk within the document. The output source function will accept:

code: The code within the current chunk,
context: An environment containing active chunk options and other chunk information,
...: Optional arguments reserved for future expansion.

In particular, the context elements label and chunk.index can be used to help identify which chunk is currently being rendered.

output_source <- function(code, context, ...) {
  # Locate the logo shipped with R's HTML documentation
  logo <- file.path(R.home("doc"), "html", "logo.jpg")
  # Inject custom output only for the chunk labelled "chunk-one"
  if (context$label == "chunk-one") {
    return(list(
      rmarkdown::html_notebook_output_code("# R Code"),
      paste("Custom output for chunk:", context$chunk.index),
      rmarkdown::html_notebook_output_code("# R Logo"),
      rmarkdown::html_notebook_output_img(logo)
    ))
  }
}
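An output source is an ordinary R function, so its chunk-selection logic can be tried outside of rendering. The sketch below uses a simplified output source that returns only plain character output, together with a hypothetical stand-in for the context environment; plain_output_source and mock_context are names invented for this illustration, and the real context supplied during rendering contains more than these two elements.

```r
# Simplified output source: plain character output only, so no rmarkdown
# helpers are needed to run this example.
plain_output_source <- function(code, context, ...) {
  if (identical(context$label, "chunk-one")) {
    return(list(paste("Custom output for chunk:", context$chunk.index)))
  }
  # No custom output for any other chunk.
  NULL
}

# Hypothetical stand-in for the context environment passed during rendering.
mock_context <- new.env()
mock_context$label <- "chunk-one"
mock_context$chunk.index <- 1L

result <- plain_output_source("print(\"Hello, World!\")", mock_context)
print(result[[1]])

# A chunk with a different label receives no custom output from this function.
mock_context$label <- "chunk-two"
other <- plain_output_source("1 + 1", mock_context)
print(is.null(other))
```

How rmarkdown treats a NULL return value during rendering is not covered by this sketch.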

We can pass our output_source along as part of the output_options list to rmarkdown::render().

output_file <-
  rmarkdown::render(rmd_stub,
                    output_options = list(output_source = output_source),
                    quiet = TRUE)

We’ve now generated an R Notebook. Opening this document in a web browser will show that the output_source function has effectively side-stepped evaluation of code within that chunk, and instead returned the injected result.

Implementing Output Sources

In general, you can provide regular R output in your output source function, but rmarkdown also provides a number of helper functions for inserting custom HTML content. These are documented within ?html_notebook_output.

Using these functions ensures that you produce an R Notebook that can be opened in R frontends (e.g. RStudio).
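As a hedged illustration, an output source might combine html_notebook_output_code() with html_notebook_output_html(), one of the functions listed on the ?html_notebook_output help page. The function name table_output_source and the table markup are hypothetical; the function is only defined here, not rendered, so the rmarkdown helpers are called only when a matching chunk is encountered.

```r
# Hypothetical output source that echoes the chunk code and injects a small
# hand-written HTML table as the chunk's output.
table_output_source <- function(code, context, ...) {
  if (identical(context$label, "chunk-one")) {
    # Illustrative markup; any HTML fragment could be supplied here.
    html <- "<table><tr><td>a</td><td>1</td></tr></table>"
    return(list(
      # Collapse the chunk code into a single string before echoing it.
      rmarkdown::html_notebook_output_code(paste(code, collapse = "\n")),
      rmarkdown::html_notebook_output_html(html)
    ))
  }
  # No custom output for other chunks.
  NULL
}
```

Such a function would be passed to rmarkdown::render() through output_options in the same way as the earlier output_source.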

Parsing R Notebooks

The rmarkdown::parse_html_notebook() function provides an interface for recovering and parsing an HTML notebook.

parsed <- rmarkdown::parse_html_notebook(output_file)
str(parsed)

## List of 4
##  $ source     : chr [1:1835] "<!DOCTYPE html>" "" "<html>" "" ...
##  $ rmd        : chr [1:8] "---" "title: \"R Notebook Stub\"" "output: html_notebook" "---" ...
##  $ header     : chr [1:1721] "<head>" "" "<meta charset=\"utf-8\" />" "<meta name=\"generator\" content=\"pandoc\" />" ...
##  $ annotations:List of 12
##   ..$ :List of 4
##   .. ..$ row  : int 1754
##   .. ..$ label: chr "text"
##   .. ..$ state: chr "begin"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1755
##   .. ..$ label: chr "text"
##   .. ..$ state: chr "end"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1756
##   .. ..$ label: chr "chunk"
##   .. ..$ state: chr "begin"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1757
##   .. ..$ label: chr "source"
##   .. ..$ state: chr "begin"
##   .. ..$ meta :List of 1
##   .. .. ..$ data: chr "```r\n# R Code\n```"
##   ..$ :List of 4
##   .. ..$ row  : int 1759
##   .. ..$ label: chr "source"
##   .. ..$ state: chr "end"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1760
##   .. ..$ label: chr "output"
##   .. ..$ state: chr "begin"
##   .. ..$ meta :List of 1
##   .. .. ..$ data: chr "Custom output for chunk: 1\n"
##   ..$ :List of 4
##   .. ..$ row  : int 1762
##   .. ..$ label: chr "output"
##   .. ..$ state: chr "end"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1763
##   .. ..$ label: chr "source"
##   .. ..$ state: chr "begin"
##   .. ..$ meta :List of 1
##   .. .. ..$ data: chr "```r\n# R Logo\n```"
##   ..$ :List of 4
##   .. ..$ row  : int 1765
##   .. ..$ label: chr "source"
##   .. ..$ state: chr "end"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1766
##   .. ..$ label: chr "plot"
##   .. ..$ state: chr "begin"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1768
##   .. ..$ label: chr "plot"
##   .. ..$ state: chr "end"
##   .. ..$ meta : NULL
##   ..$ :List of 4
##   .. ..$ row  : int 1769
##   .. ..$ label: chr "chunk"
##   .. ..$ state: chr "end"
##   .. ..$ meta : NULL

This interface can be used to recover the original .Rmd source and, with some more effort from the front-end, the chunk outputs from the document itself.
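The following sketch shows one way a front-end might work with such a result. To stay self-contained it uses parsed_stub, a hypothetical hand-built list that mirrors a few entries of the str() output above, rather than a real parse_html_notebook() result; the file name recovered.Rmd is likewise an assumption.

```r
# Hypothetical stand-in for part of a parse_html_notebook() result; values
# are copied from the str() output above.
parsed_stub <- list(
  rmd = c("---", "title: \"R Notebook Stub\"", "output: html_notebook", "---"),
  annotations = list(
    list(row = 1756L, label = "chunk",  state = "begin", meta = NULL),
    list(row = 1760L, label = "output", state = "begin",
         meta = list(data = "Custom output for chunk: 1\n")),
    list(row = 1762L, label = "output", state = "end",   meta = NULL),
    list(row = 1769L, label = "chunk",  state = "end",   meta = NULL)
  )
)

# Write the recovered source back out as an .Rmd file.
rmd_out <- file.path(tempdir(), "recovered.Rmd")
writeLines(parsed_stub$rmd, rmd_out)

# Select the annotations that open an output block.
is_output_begin <- vapply(
  parsed_stub$annotations,
  function(a) a$label == "output" && a$state == "begin",
  logical(1)
)

# Pull the output text stored in each selected annotation's metadata.
outputs <- vapply(
  parsed_stub$annotations[is_output_begin],
  function(a) a$meta$data,
  character(1)
)
cat(outputs, sep = "")
```

In this stub only the output annotation carries data in its meta field. Other annotation types, such as plot, have no meta in the listing above, so a front-end would need to read their content from the HTML rows between the begin and end markers.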