Modifying the output format from within an R Markdown chunk with rmarkdown::output_format_dependency

This article introduces output_format_dependency(), a new feature in rmarkdown 2.24.

R Markdown users rely on a variety of output formats from a variety of packages, such as html_document, bookdown::gitbook, revealjs::revealjs_presentation, and so on.

Usually, users specify the YAML frontmatter to choose and tweak formats.

output:
  html_document:
    toc: true

Some people may be surprised, but output formats are R functions! The above example is equivalent to passing toc = TRUE to html_document(). Output formats thus already provide some customizability through their arguments.
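
To make the equivalence concrete, here is a minimal sketch in R. It assumes the rmarkdown package is installed, and the file name input.Rmd in the comment is hypothetical.

```r
# Call the output format function directly, with the same option as the YAML above.
fmt <- rmarkdown::html_document(toc = TRUE)

# The result is an ordinary R object that describes how to render the document.
inherits(fmt, "rmarkdown_output_format")

# It can be passed to render() instead of relying on the YAML front matter
# ("input.Rmd" is a hypothetical file name):
# rmarkdown::render("input.Rmd", output_format = fmt)
```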

But what if we want further customizations?

Then output_format_dependency() is for you. This new feature allows you to modify the output format from within a chunk.

The example below adds a Lua filter and a post processor. You can try it simply by copying and pasting it into a chunk of your Rmd file.

rmarkdown::output_format_dependency(
  name = "example",
  pre_processor = function(...) {
    # add lua filter to automatically add line numbers to CodeBlock
    filterfile <- tempfile()
    filterurl <- "https://raw.githubusercontent.com/atusy/lua-filters/master/lua/auto-number-lines.lua"
    download.file(filterurl, filterfile)
    rmarkdown::pandoc_lua_filter_args(filterfile)
  },
  post_processor = function(metadata, input_file, output_file, ...) {
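    # make a copy suffixed by a timestamp
    # (formatted without spaces or colons so the file name is valid on every OS)
    output_file2 <- sub(
      "([.][^.]+)$",
      paste0("-", format(Sys.time(), "%Y%m%d-%H%M%S"), "\\1"),
      output_file
    )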
    file.copy(output_file, output_file2, overwrite = TRUE)
    output_file
  }
)

If you have experience defining custom output formats, you have probably noticed that output_format_dependency() and output_format() share most of their arguments. output_format_dependency() can do almost everything that output_format() does. The exception is the knitr argument of output_format(): a dependency is declared during knitting and merged into the original output format only after knitting, which is too late to change knitr's behavior.

Compared to output_format(), output_format_dependency() has advantages for both users and R package developers because it works from within a chunk.

For users, the advantage is that they do not have to dive into the developer's world. There are several ways to use output_format(), but all of them are somewhat complex: writing a .Rprofile, writing an R script, writing an R package, and so on. If users want to reuse the same output_format_dependency() in multiple Rmd files, they can move the chunk into a separate Rmd file and include it from other documents with the child chunk option.

For R package developers, the advantage is that they do not have to define multiple output formats just to share the same dependency. The previous example has a post processor that makes a copy of the output file suffixed by a timestamp. This is a format-agnostic feature. With output_format(), however, providing it means defining timestamped_html_document(), timestamped_pdf_document(), timestamped_word_document(), and so on. This is a nightmare. With output_format_dependency(), developers just have to export a single function and let users call it in their Rmd files.
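
As a rough sketch of that idea, a package could export a single function like the one below. The names timestamped_path() and timestamp_copy() are hypothetical and are not part of rmarkdown; only output_format_dependency() comes from the library. The timestamp format is also an assumption, chosen to avoid spaces and colons in file names.

```r
# Hypothetical helper: insert a timestamp before the file extension.
timestamped_path <- function(path, time = Sys.time()) {
  stamp <- format(time, "%Y%m%d-%H%M%S")
  sub("([.][^.]+)$", paste0("-", stamp, "\\1"), path)
}

# Hypothetical exported function: a format-agnostic dependency
# that copies the output file to a timestamped path after rendering.
timestamp_copy <- function() {
  rmarkdown::output_format_dependency(
    name = "timestamp_copy",
    post_processor = function(metadata, input_file, output_file, ...) {
      file.copy(output_file, timestamped_path(output_file), overwrite = TRUE)
      # return the path of the original output file unchanged
      output_file
    }
  )
}
```

A user would then call timestamp_copy() in a chunk of any Rmd file, whatever its output format, and let the chunk print the result.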

The previous example merges the dependency into the output format through an implicit knitr::knit_print(). This is a convenient way to define a document-specific dependency. When packaging, however, it is not robust because there is no guarantee that the dependency is ever printed via knitr::knit_print(). In such a case, merge explicitly with knitr::knit_meta_add(), as in the following example, which implements cross references to sections.

---
title: cross reference of sections
output: html_document
---

```{cat, engine.opts=list(file = "example.lua")}
-- lua filter

local headers = {}

local function collect_header(header)
  headers["#" .. header.identifier] = header
end

local function link_header(link)
  if #link.content == 0 and headers[link.target] then
    link.content = headers[link.target].content
  end
  return link
end

return {
  { Header = collect_header },
  { Link = link_header },
}
```

```{r}
ref <- function(id) {
  dep <- rmarkdown::output_format_dependency(
    name = "example",
    pandoc = list(lua_filters = "example.lua")
  )

  # dep is merged to output_format without relying on knit_print
  knitr::knit_meta_add(list(dep))

  return(sprintf("[](#%s)", id))
}
```

# awesome section {#awesome} 

see `r ref('brilliant')`

# brilliant section {#brilliant}

see `r ref('awesome')`

Enjoy R Markdown!!

Call for sponsorships

I deeply appreciate my sponsors for supporting my contributions (https://github.com/sponsors/atusy).

If you like my contributions and decide to sponsor me, I would be more than happy! My contributions have been slow lately because I am using R less these days. In such a situation, your sponsorship really motivates me to keep contributing to the R community.