Reproducible {distill} posts with {renv} profiles

Diagram showing two squares representing blog posts, each inside separate squares representing {renv} profiles, both inside a larger rectangle representing the blog profile.

tl;dr

I think you can use the {renv} package to create separate reproducible environment profiles for each of your {distill} blog posts.

Profiled

Functionality comes and goes in R packages. How do you deal with that in the context of a blog built with R? What if you need to go back and change something in a post from four years ago?[1]

I built a demo {distill} blog to test whether the {renv} package might be a viable solution for reproducibility on a post-by-post basis.

{renv} is a package by Kevin Ushey that records your dependencies in a text ‘lockfile’. It typically works on the scale of a whole project, but since version 0.13.0 you can have multiple profiles within a given project.

I think this means that each post can have its own profile with its own distinct set of packages and package versions.

That means you can easily recreate a specific environment for a given post at a given time if you need to alter and re-render it in future.

Example

I’m presenting this here as a theory, really, but I’ve also made a demo blog to try it out. It seems to work.

There are two posts on the demo blog. They both use the {dplyr} package, but one depends on an old version (0.8.5) and one depends on the current version (1.0.8).

Using {renv} profiles means that these package versions don’t interfere with each other.

The post depending on the older {dplyr} version can’t access the across() function, but the post depending on the newer {dplyr} version can use across().

In other words, the environments associated with the profiles for each post are totally isolated from each other.
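To make the difference concrete, here is a minimal base-R sketch of the version check that separates the two posts. across() arrived in {dplyr} 1.0.0; the has_across() helper is a made-up name for illustration and isn't part of the demo blog.

```r
# Hypothetical helper: is a given {dplyr} version new enough for across()?
# across() was introduced in dplyr 1.0.0.
has_across <- function(version) {
  package_version(version) >= package_version("1.0.0")
}

has_across("0.8.5")  # the older post's profile: FALSE
has_across("1.0.8")  # the newer post's profile: TRUE
```

Inside a post, you could pass as.character(utils::packageVersion("dplyr")) to the helper to check whichever version the active profile provides.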

How to

Of course, you first need a blog. I used {distill}[2] for the demo, a package by JJ Allaire, Rich Iannone, Alison Presmanes Hill and Yihui Xie. You can follow the guidance from RStudio, but basically:

  1. Create your blog with distill::create_blog()
  2. Build it with rmarkdown::render_site() (or ‘Build Website’ from the Build pane of RStudio)
  3. Initiate a reproducible environment for the blog as a whole with renv::init()
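The three setup steps above might look something like this in code. It's only a sketch: the folder name and title are placeholders, and I've wrapped the calls in a hypothetical setup_blog() helper so that nothing runs until you call it.

```r
# Hypothetical helper wrapping the three setup steps.
# `dir` and `title` are placeholders for your own blog's folder and name.
setup_blog <- function(dir = "demo-blog", title = "Demo blog") {
  # 1. Scaffold the blog in a new folder
  distill::create_blog(dir = dir, title = title)

  # 2. Build the site from the blog folder
  rmarkdown::render_site(input = dir)

  # 3. Initialise {renv} for the blog as a whole (the default profile)
  renv::init(project = dir)
}
```

In practice you'd probably open the blog as an RStudio project and run the three calls one at a time, since renv::init() may restart your R session.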

And then a new-post workflow could look like this:

  1. Create a new post with distill::create_post()
  2. Activate a profile for the new post with renv::activate(), providing a unique name to the profile argument (I suggest the post’s folder name as seen in the blog’s _posts/ folder)
  3. Install the packages you need for the post with renv::install()
  4. Capture the dependencies in the profile’s lockfile with renv::snapshot()

In code, that might look a bit like this:

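# Create the post; {distill} prefixes the folder name with today's date
distill::create_post("new-post")

# Switch to a profile named after the post's folder in _posts/
renv::activate(profile = "YYYY-MM-DD-new-post")

# Install the post's packages into the profile's library
# (package names go in a single character vector)
renv::install(
  c("distill", "rmarkdown", "palmerpenguins", "dplyr")
)

# Record the installed versions in the profile's lockfile
renv::snapshot()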

For the demo blog, I called the two profiles ‘2022-03-14-dplyr-085’ and ‘2022-03-14-dplyr-108’, which you can see in the renv/profiles/ folder of the project repo.

These are named uniquely for the two separate folders in the _posts/ directory that contain each post’s files. This naming structure should make it easy to remember the profile associated with each post.
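As a small illustration of that naming convention, the profile name can be derived from the post's path, and from there the location of the profile's lockfile under renv/profiles/. Both helper names and the .Rmd file name below are made up for this sketch.

```r
# Profile name for a post: the name of the folder that holds its .Rmd file
post_profile <- function(post_rmd) {
  basename(dirname(post_rmd))
}

# Where {renv} keeps a named profile's lockfile, relative to the project root
profile_lockfile <- function(profile) {
  file.path("renv", "profiles", profile, "renv.lock")
}

# The .Rmd file name here is a made-up example
profile <- post_profile("_posts/2022-03-14-dplyr-085/dplyr-085.Rmd")
profile
#> [1] "2022-03-14-dplyr-085"
profile_lockfile(profile)
#> [1] "renv/profiles/2022-03-14-dplyr-085/renv.lock"
```

The value of profile is what you would pass to the profile argument of renv::activate().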

As I worked on the posts, I switched between the two profiles with renv::activate(), passing the relevant profile name to the profile argument.

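Note that passing NULL as the profile argument switches you back to the default profile associated with the project as a whole, i.e. the one created when you ran renv::init(). Newer {renv} releases may expect profile = "default" for this instead, so check ?renv::activate for your installed version.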

Yeah, but?

There are obvious pros and cons to this approach.

For example, maybe it’s a bit too dependent on the user: they have to remember to switch between the profiles, etc.

And I don’t think you can properly rebuild the site again with rmarkdown::render_site(), because this function will run based only on the currently active {renv} profile, rather than rendering each post in the context of its own specific profile.
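One possible workaround, which I haven't tried on the demo blog, is to render a single post in a fresh R process with the RENV_PROFILE environment variable set, which {renv} reads when the project's .Rprofile is loaded. The helper names below are hypothetical and the sketch assumes that each profile is named after its post's folder. {distill} posts are normally knitted one at a time in any case, so this fits the usual workflow.

```r
# Build the Rscript arguments for rendering one post (no side effects)
render_post_args <- function(post_rmd) {
  expr <- sprintf("rmarkdown::render(%s)", encodeString(post_rmd, quote = '"'))
  c("-e", shQuote(expr))
}

# Render one post in a fresh R process under its own {renv} profile.
# Run this from the blog's root folder so the project .Rprofile is picked up.
# Assumes the profile name matches the post's folder name.
render_post_with_profile <- function(post_rmd) {
  profile <- basename(dirname(post_rmd))
  system2(
    "Rscript",
    args = render_post_args(post_rmd),
    env = paste0("RENV_PROFILE=", profile)
  )
}
```

You would call it with the path to a post's .Rmd file, e.g. render_post_with_profile("_posts/2022-03-14-dplyr-085/dplyr-085.Rmd"), where the file name is again a made-up example.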

But ultimately isn’t it worthwhile to be able to rebuild a post in future if you need to change or update something? Maybe.

I’d be interested to hear other criticisms, especially before I try and use this approach for real.

Meanwhile, I know that Danielle Navarro has tackled this in a more thought-out and sophisticated way, and has created a work-in-progress package called {refinery} to help build a separate environment for each post in a {distill} blog.

Danielle’s blog post on the topic does a brilliant job of explaining the problem of blog reproducibility and the technical details behind it. I suggest you read it if you want to know more.


Session info
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.1.0 (2021-05-18)
##  os       macOS Big Sur 10.16         
##  system   x86_64, darwin17.0          
##  ui       X11                         
##  language (EN)                        
##  collate  en_GB.UTF-8                 
##  ctype    en_GB.UTF-8                 
##  tz       Europe/London               
##  date     2022-03-18                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date       lib source        
##  blogdown      1.4     2021-07-23 [1] CRAN (R 4.1.0)
##  bookdown      0.23    2021-08-13 [1] CRAN (R 4.1.0)
##  bslib         0.3.1   2021-10-06 [1] CRAN (R 4.1.0)
##  cli           3.2.0   2022-02-14 [1] CRAN (R 4.1.2)
##  digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.0)
##  evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)
##  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
##  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.0)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.1.0)
##  jsonlite      1.7.3   2022-01-17 [1] CRAN (R 4.1.2)
##  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.0)
##  magrittr      2.0.2   2022-01-26 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
##  rlang         1.0.2   2022-03-04 [1] CRAN (R 4.1.2)
##  rmarkdown     2.10    2021-08-06 [1] CRAN (R 4.1.0)
##  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
##  sass          0.4.0   2021-05-12 [1] CRAN (R 4.1.0)
##  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)
##  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.0)
##  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
##  withr         2.4.3   2021-11-30 [1] CRAN (R 4.1.0)
##  xfun          0.29    2021-12-14 [1] CRAN (R 4.1.0)
##  yaml          2.2.2   2022-01-25 [1] CRAN (R 4.1.2)
## 
## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library

  1. Yes, I’m thinking about this because this blog is nearly four years old and I’ve had some headaches trying to rebuild posts from that long ago.

  2. This site is built with {blogdown} rather than {distill}, so I’m using this post as a chance to learn a bit more about {distill}. It has also become quite popular in the R community, so it may be helpful for a wider readership if I use it in this demo.