Reproducible data science with WebAssembly and webR

George Stagg

Posit, PBC

Reproducibility in Data Science

A large-scale study on research code quality and execution

We find that 74% of R files failed to complete without error in the initial execution.

At the language level

It works on my machine ¯\_(ツ)_/¯

  • Hard-coded paths, setwd(), project organisation.

  • Source code management, git, GitHub.

  • Defensive programming, handling error conditions.

  • Organising software into modules or packages.

  • Documentation and tests.
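As a minimal sketch of the first and third points, the snippet below builds paths relative to a project root instead of calling setwd(), and checks for a missing file so the failure message is clear. The directory layout and the helper name read_input() are hypothetical, chosen only for illustration.

```r
# Hypothetical project root; a temporary directory stands in for a real project.
project_root <- file.path(tempdir(), "my-analysis")
dir.create(file.path(project_root, "data"), recursive = TRUE, showWarnings = FALSE)

# Write a small example file so the sketch is self-contained.
write.csv(data.frame(x = 1:3),
          file.path(project_root, "data", "input.csv"),
          row.names = FALSE)

# Hypothetical helper: read a file relative to the project root and
# stop with an informative message if it does not exist.
read_input <- function(name, root = project_root) {
  path <- file.path(root, "data", name)
  if (!file.exists(path)) {
    stop("Input file not found: ", path, call. = FALSE)
  }
  read.csv(path)
}

# Normal use: the path is resolved without changing the working directory.
input <- read_input("input.csv")

# Defensive use: capture the error message rather than letting the script abort.
result <- tryCatch(
  read_input("missing.csv"),
  error = function(e) conditionMessage(e)
)
```

Because every path is derived from project_root, the same script can run from any working directory on any machine.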

Hex logos: pkgdown, testthat, targets, roxygen2, box, usethis, shinytest2

A small effort here, even using automated tools, makes a big difference.

56% failed when code cleaning was applied, showing that many errors can be prevented with good coding practices.

Computing environments

Many different computational environments exist, all with the potential to affect your analysis.

Language and package management

  • Versions of interpreter software:
    R 3.6.3, 4.1.3, 4.4.1

  • Versions of packages

System and library management

  • System libraries: GSL, NLopt, BLAS/LAPACK
  • Operating system: Windows, macOS, Linux
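A rough sketch of how these layers can be recorded from within R is shown below, using only base R functions. The name env_record and the choice of packages are assumptions made for illustration; tools such as renv capture this information far more thoroughly.

```r
# Collect the parts of the computing environment listed above.
env_record <- list(
  r_version = R.version.string,          # interpreter version
  platform  = R.version$platform,        # hardware and OS triple
  os        = R.version$os,              # operating system
  blas      = extSoftVersion()[["BLAS"]],# BLAS library in use (may be empty)
  lapack    = La_version(),              # LAPACK version
  packages  = vapply(
    c("stats", "utils"),                 # illustrative package choice
    function(p) as.character(packageVersion(p)),
    character(1)
  )
)

str(env_record)
```

Saving a record like this alongside results makes it easier to spot which layer changed when an analysis stops reproducing.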

Tools: rig, pak, renv, Virtual Machines, Docker, Nix/Rix

  • Very slow to reproduce an environment in full, especially without caching!
  • Difficult to use, very steep learning curve.

Binary-level differences

The same software can give different binary output depending on the type of hardware
(e.g. ARM vs x86_64 vs RISC-V).
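One practical consequence is that floating-point results may differ in their last few bits between machines, so exact equality checks are fragile. The sketch below uses a classic rounding case on a single machine as a stand-in; it does not itself demonstrate a cross-hardware difference, and the variable names are illustrative only.

```r
# Two values that are mathematically equal but differ in their last bits.
a <- 0.1 + 0.2
b <- 0.3

# Exact comparison is sensitive to tiny representation differences.
exact_match <- a == b

# all.equal() compares within a numerical tolerance instead.
close_enough <- isTRUE(all.equal(a, b))

print(c(exact = exact_match, tolerant = close_enough))
```

Comparing with a tolerance keeps tests and checks stable when the underlying arithmetic differs slightly between platforms.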

WebAssembly

  • A portable binary code format
  • Enables high-performance applications on web pages
  • Near-native execution speed
  • Supported by most modern browsers
  • Interactive through JavaScript integration

WebAssembly also provides security benefits through containerisation and sandboxing.

R for WebAssembly: webR

The webR project is a version of the R interpreter built for WebAssembly.

Execute R code directly in a web browser, without a supporting R server. Alternatively, run an R process server-side using Node.js.

Available on GitHub and NPM as a JavaScript & TypeScript library.

Reproducibility by default

R Consortium Submission Working Group

Screenshot of the pilot shiny app submission using webR

🔗 Testing Containers and WebAssembly in Submissions to the FDA - pharmaverse.github.io

Traditional Shiny App

Shinylive App 🔗 https://shinylive.io/r/

Shinylive in Quarto

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam.

```{shinylive-r}
#| standalone: true
library(shiny)

# Create Shiny UI
ui <- [...]

# Create Shiny server function
server <- function(input, output, session) {
  [...]
}

# Build Shiny app
shinyApp(ui = ui, server = server)
```

Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit laborum.

Shinylive for R

```{shinylive-r}
#| standalone: true
#| viewerHeight: 700

library(shiny)
library(bslib)

theme <- bs_theme(font_scale = 1.5)

# Define UI for app that draws a histogram ----
ui <- page_sidebar(theme = theme,
  sidebar = sidebar(open = "open",
    numericInput("n", "Sample count", 50),
    checkboxInput("pause", "Pause", FALSE),
  ),
  plotOutput("plot", width=1100)
)

server <- function(input, output, session) {
  data <- reactive({
    input$resample
    if (!isTRUE(input$pause)) {
      invalidateLater(1000)
    }
    rnorm(input$n)
  })

  output$plot <- renderPlot({
    hist(data(),
      breaks = 30,
      xlim = c(-2, 2),
      ylim = c(0, 1),
      xlab = "value",
      freq = FALSE,
      main = ""
    )

    x <- seq(from = -2, to = 2, length.out = 500)
    y <- dnorm(x)
    lines(x, y, lwd=1.5)

    lwd <- 5
    abline(v=0, col="red", lwd=lwd, lty=2)
    abline(v=mean(data()), col="blue", lwd=lwd, lty=1)

    legend(legend = c("Normal", "Mean", "Sample mean"),
      col = c("black", "red", "blue"),
      lty = c(1, 2, 1),
      lwd = c(1, lwd, lwd),
      x = 1,
      y = 0.9
    )
  }, res=140)
}

# Create Shiny app ----
shinyApp(ui = ui, server = server)
```

Convert a Shiny app to Shinylive

Install the Shinylive R package:

```r
install.packages("shinylive")
```

Convert the app:

```r
shinylive::export("myapp", "site")
```

The exported site directory is a self-contained bundle, ready to transfer to another machine or to host on a static web service. Preview it locally:

```r
httpuv::runStaticServer("site")
```
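Putting the steps together, a hedged end-to-end sketch might look like the following. The app contents and the temporary directory names are hypothetical, and the export is guarded so it only runs interactively with shinylive installed, since exporting may need to download Shinylive assets.

```r
# Hypothetical app and output directories under a temporary location.
app_dir  <- file.path(tempdir(), "myapp")
site_dir <- file.path(tempdir(), "site")
dir.create(app_dir, showWarnings = FALSE)

# A minimal, illustrative Shiny app written to app.R.
app_code <- c(
  'library(shiny)',
  'ui <- fluidPage(sliderInput("n", "Sample count", 10, 100, 50), plotOutput("plot"))',
  'server <- function(input, output, session) {',
  '  output$plot <- renderPlot(hist(rnorm(input$n)))',
  '}',
  'shinyApp(ui = ui, server = server)'
)
writeLines(app_code, file.path(app_dir, "app.R"))

# Convert the app only when run interactively with shinylive available.
if (interactive() && requireNamespace("shinylive", quietly = TRUE)) {
  shinylive::export(app_dir, site_dir)
  # To preview the result: httpuv::runStaticServer(site_dir)
}
```

The exported directory contains only static files, so no R server is needed to host it.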

WebAssembly R packages

Binary R packages for Wasm are available from a CRAN-like CDN.

rwasm package hex logo
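Inside a webR session, these binaries can be installed with webr::install() from the webR support package. The sketch below assumes that R reports its operating system as "emscripten" when running under webR; that check, and the choice of jsonlite as the package, are assumptions made for illustration.

```r
# Assumption: under webR, R.version$os is "emscripten".
running_in_webr <- identical(R.version$os, "emscripten")

if (running_in_webr) {
  # Download and install a Wasm binary package from the default repository.
  webr::install("jsonlite")
  library(jsonlite)
} else {
  # Outside webR, packages are installed the usual way.
  message("Not running under webR; use install.packages() as usual.")
}
```

Shinylive apps do not normally need this step, since package binaries are bundled at export time.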

What if R packages change?

  • R packages are always updating and changing.

  • Wasm R package binaries will be frozen and bundled with an app automatically.

  • Apps will continue to work in the future, even as Wasm package repositories update.

Future work and current issues

  • Not all R packages work under WebAssembly.

  • Building custom R packages using GitHub Actions is still experimental.

  • There will always be good reasons to use a traditional Shiny deployment.

  • Browser security restrictions: limited networking, no raw socket access.

  • 😱 There are no secrets with a Shinylive app!

  • All code and data are sent to the client, so deploy accordingly.


webR demo website

https://webr.r-wasm.org/v0.4.0/

Shinylive examples

https://shinylive.io/r/

Documentation

https://docs.r-wasm.org/webr/v0.4.0/

https://github.com/posit-dev/shinylive

https://github.com/quarto-ext/shinylive