This vignette provides a quick introduction through three typical use cases. Other package vignettes contain more in-depth information on a variety of topics.
Use case: minimise execution times by performing long-running tasks concurrently in separate processes.
Multiple long computes (model fits etc.) can be performed in parallel on available computing cores.
Use mirai() to evaluate an expression asynchronously in a separate, clean R process.
The following mimics an expensive calculation that eventually returns a random value.
library(mirai)
args <- list(time = 2L, mean = 4)
m <- mirai(
  {
    Sys.sleep(time)
    rnorm(5L, mean)
  },
  time = args$time,
  mean = args$mean
)
The mirai expression is evaluated in another process and must therefore be self-contained: it cannot refer to variables that do not already exist there.
Above, the variables time and mean are passed as part of the mirai() call.
A ‘mirai’ object is returned immediately; creating a mirai never blocks the session.
Whilst the async operation is ongoing, attempting to access a mirai’s data yields an ‘unresolved’ logical NA.
m
#> < mirai [] >
m$data
#> 'unresolved' logi NA
To check whether a mirai remains unresolved (yet to complete):
unresolved(m)
#> [1] TRUE
To wait for and collect the return value, use collect_mirai() or equivalently the mirai’s [] method:
collect_mirai(m)
#> [1] 6.223694 2.189400 4.695696 4.807338 6.597992
m[]
#> [1] 6.223694 2.189400 4.695696 4.807338 6.597992
As a mirai represents an async operation, it is never necessary to wait for it; other code can continue to run.
Once it completes, the return value automatically becomes available at $data.
m
#> < mirai [$data] >
m$data
#> [1] 6.223694 2.189400 4.695696 4.807338 6.597992
For easy programmatic use of mirai(), ‘.expr’ accepts a pre-constructed language object, and ‘.args’ accepts a list of named arguments.
So, the following would be equivalent to the above:
expr <- quote({Sys.sleep(time); rnorm(5L, mean)})
m <- mirai(.expr = expr, .args = args)
m[]
#> [1] 4.703916 1.963779 4.930849 5.174166 3.128183
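As an illustration of performing several long computes concurrently, the following is a minimal sketch rather than part of the original example. The list ms and the placeholder computation x * 2 are assumptions made for illustration. It assumes no daemons have been set, so that each mirai() call runs in its own background process.

```r
library(mirai)

# Launch three independent tasks; each mirai() call returns immediately,
# so all three run at the same time in separate processes
ms <- lapply(1:3, function(i) {
  mirai(
    {
      Sys.sleep(1)   # stands in for a long-running computation
      x * 2
    },
    x = i
  )
})

# Collect the results; each call waits only if that mirai is still unresolved
res <- lapply(ms, collect_mirai)
```

Because the tasks overlap, the total wait should be closer to the duration of one task than to the sum of all three.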
Use case: ensure execution flow of the main process is not blocked.
High-frequency real-time data cannot be written to file/database synchronously without disrupting the execution flow.
Cache data in memory and use mirai() to perform periodic write operations concurrently in a separate process.
Below, ‘.args’ is used to pass environment(), which is the calling environment.
This provides a convenient method of passing in existing objects.
library(mirai)
x <- rnorm(1e6)
file <- tempfile()
m <- mirai(write.csv(x, file = file), .args = environment())
A ‘mirai’ object is returned immediately.
unresolved() may be used in control flow statements to perform actions which depend on resolution of the ‘mirai’, both before and after.
This means there is no need to actually wait (block) for a ‘mirai’ to resolve, as the example below demonstrates.
while (unresolved(m)) {
  cat("while unresolved\n")
  Sys.sleep(0.5)
}
#> while unresolved
#> while unresolved
cat("Write complete:", is.null(m$data))
#> Write complete: TRUE
Now actions which depend on the resolution may be processed, for example the next write.
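The periodic-write pattern described in this use case might be sketched as follows. This is an illustrative outline only: the helper write_async(), the five-tick loop and the simulated data are all hypothetical, and write.table() with append = TRUE is used here in place of write.csv() so that successive batches accumulate in one file.

```r
library(mirai)

file <- tempfile()

# Hypothetical helper: append one batch to `file` in a separate process
write_async <- function(batch, file) {
  mirai(
    write.table(batch, file = file, append = TRUE, col.names = FALSE),
    batch = batch,
    file = file
  )
}

cache <- numeric()
m <- NULL

for (tick in 1:5) {
  cache <- c(cache, rnorm(10))   # stands in for incoming real-time data
  # Start a new write only when no earlier write is still in flight
  if (is.null(m) || !unresolved(m)) {
    m <- write_async(cache, file)
    cache <- numeric()
  }
  Sys.sleep(0.2)
}

# Wait for the last write, then flush anything still cached
m[]
if (length(cache) > 0) {
  m <- write_async(cache, file)
  m[]
}
n_rows <- length(readLines(file))
```

The main loop never blocks on a write; data arriving while a write is in progress simply stays in the cache until the next opportunity.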
Use case: isolate code that can potentially fail in a separate process to ensure continued uptime.
As part of a data science / machine learning pipeline, iterations of model training may periodically fail for stochastic and uncontrollable reasons (e.g. buggy memory management on graphics cards).
Running each iteration in a ‘mirai’ isolates this potentially-problematic code such that it does not bring down the entire pipeline, even if it fails.
library(mirai)
run_iteration <- function(i) {
  # simulates a stochastic error rate
  if (runif(1) < 0.1) stop("random error\n", call. = FALSE)
  sprintf("iteration %d successful\n", i)
}
for (i in 1:10) {
  m <- mirai(run_iteration(i), environment())
  while (is_error_value(m[])) {
    cat(m$data)
    m <- mirai(run_iteration(i), environment())
  }
  cat(m$data)
}
#> iteration 1 successful
#> iteration 2 successful
#> iteration 3 successful
#> iteration 4 successful
#> Error: random error
#> iteration 5 successful
#> iteration 6 successful
#> iteration 7 successful
#> iteration 8 successful
#> iteration 9 successful
#> Error: random error
#> iteration 10 successful
Further, by testing the return value of each ‘mirai’ for errors, error-handling code is then able to automate recovery and re-attempts, as in the above example.
The end result is a resilient and fault-tolerant pipeline that minimises downtime by eliminating interruptions of long computes.
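The retry loop above re-attempts indefinitely. Where a cap on re-attempts is wanted, one possible variation is sketched below. The helper with_retries(), its max_attempts parameter and the flaky() task are hypothetical names introduced for illustration, not part of the package.

```r
library(mirai)

# Hypothetical helper: re-run `task` in a fresh mirai until it succeeds
# or `max_attempts` is reached; returns the last result either way
with_retries <- function(task, max_attempts = 3L) {
  for (attempt in seq_len(max_attempts)) {
    m <- mirai(task(), task = task)
    if (!is_error_value(m[])) return(m$data)
  }
  m$data   # still an error value after the final attempt
}

# Simulated task that fails roughly half the time
flaky <- function() {
  if (runif(1) < 0.5) stop("random error", call. = FALSE)
  "ok"
}

res <- with_retries(flaky, max_attempts = 5L)
```

The caller can then test res with is_error_value() to decide whether to continue or to escalate.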
If execution in a mirai fails, the error message is returned as a character string of class ‘miraiError’ and ‘errorValue’ to facilitate debugging.
is_mirai_error() may be used to test for mirai execution errors.
m1 <- mirai(stop("occurred with a custom message", call. = FALSE))
m1[]
#> 'miraiError' chr Error: occurred with a custom message
m2 <- mirai(mirai::mirai())
m2[]
#> 'miraiError' chr Error in mirai::mirai(): missing expression, perhaps wrap in {}?
is_mirai_error(m2$data)
#> [1] TRUE
is_error_value(m2$data)
#> [1] TRUE
A full stack trace of evaluation within the mirai is recorded and accessible at $stack.trace on the error object.
f <- function(x) if (x > 0) stop("positive")
m3 <- mirai({f(-1); f(1)}, f = f)
m3[]
#> 'miraiError' chr Error in f(1): positive
m3$data$stack.trace
#> [[1]]
#> stop("positive")
#>
#> [[2]]
#> f(1)
Elements of the original error condition are also accessible via $ on the error object.
For example, additional metadata recorded by rlang::abort() is preserved:
m4 <- mirai(rlang::abort("aborted", meta_uid = "UID001"))
m4[]
#> 'miraiError' chr Error: aborted
m4$data$meta_uid
#> [1] "UID001"
If a daemon instance is sent a user interrupt, the mirai will resolve to an object of class ‘miraiInterrupt’ and ‘errorValue’.
is_mirai_interrupt() may be used to test for such interrupts.
m4 <- mirai(rlang::interrupt()) # simulates a user interrupt
is_mirai_interrupt(m4[])
#> [1] TRUE
If execution of a mirai surpasses the timeout set via the ‘.timeout’ argument (in milliseconds), the mirai will resolve to an ‘errorValue’ of 5L (timed out). Amongst other things, this guards against mirai processes that hang and never return.
m5 <- mirai(nanonext::msleep(1000), .timeout = 500)
m5[]
#> 'errorValue' int 5 | Timed out
is_mirai_error(m5$data)
#> [1] FALSE
is_mirai_interrupt(m5$data)
#> [1] FALSE
is_error_value(m5$data)
#> [1] TRUE
is_error_value() tests for all mirai execution errors, user interrupts and timeouts.
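Since the three predicates overlap, the order in which they are tested matters when distinguishing the cases. The following is a minimal sketch of one way to do so; the helper describe_result() and its return labels are hypothetical and used only for illustration.

```r
library(mirai)

# Hypothetical helper: classify a resolved value, testing the specific
# predicates first and the catch-all is_error_value() last
describe_result <- function(x) {
  if (is_mirai_error(x)) {
    "execution error"
  } else if (is_mirai_interrupt(x)) {
    "interrupt"
  } else if (is_error_value(x)) {
    "other error value, e.g. timeout"
  } else {
    "success"
  }
}

r_err  <- describe_result(mirai(stop("boom"))[])
r_time <- describe_result(mirai(Sys.sleep(1), .timeout = 200)[])
r_ok   <- describe_result(mirai(1 + 1)[])
```

Testing is_error_value() first would catch every case and hide the distinction between errors, interrupts and timeouts.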