| Title: | Likelihood-Free Parameter Estimation using Neural Networks |
|---|---|
| Description: | An 'R' interface to the 'Julia' package 'NeuralEstimators.jl'. The package facilitates the user-friendly development of neural Bayes estimators, which are neural networks that map data to a point summary of the posterior distribution (Sainsbury-Dale et al., 2024, <doi:10.1080/00031305.2023.2249522>). These estimators are likelihood-free and amortised, in the sense that, once the neural networks are trained on simulated data, inference from observed data can be made in a fraction of the time required by conventional approaches. The package also supports amortised Bayesian or frequentist inference using neural networks that approximate the posterior or likelihood-to-evidence ratio (Zammit-Mangion et al., 2025, Sec. 3.2, 5.2, <doi:10.48550/arXiv.2404.12484>). The package accommodates any model for which simulation is feasible by allowing users to define models implicitly through simulated data. |
| Authors: | Matthew Sainsbury-Dale [aut, cre] |
| Maintainer: | Matthew Sainsbury-Dale <[email protected]> |
| License: | GPL (>= 2) |
| Version: | 0.2.1 |
| Built: | 2026-05-28 06:39:08 UTC |
| Source: | https://github.com/msainsburydale/neuralestimators |
assess a neural estimator
assess(estimator, parameters, Z, ...)assess(estimator, parameters, Z, ...)
estimator |
a neural estimator (or a list of neural estimators) |
parameters |
true parameters, stored as a |
Z |
data simulated conditionally on the |
... |
additional keyword arguments passed to the Julia version of |
a list of two data frames: runtimes contains the
total time taken for each estimator, while df is a long-form
data frame with columns:
"parameter"; the name of the parameter
"truth"; the true value of the parameter
"estimate"; the estimated value of the parameter
"k"; the index of the parameter vector in the test set
"j"; the index of the data set
risk(), rmse(), bias(), plotestimates(), and plotdistribution() for computing various empirical diagnostics and visualisations from an object returned by assess()
computes a Monte Carlo approximation of an estimator's bias
bias(assessment, ...)bias(assessment, ...)
assessment |
an object returned by |
... |
optional arguments inherited from |
a dataframe giving the estimated bias
Compute bootstrap estimates from a neural point estimator
bootstrap(estimator, Z, B = 400, blocks = NULL, use_gpu = TRUE)bootstrap(estimator, Z, B = 400, blocks = NULL, use_gpu = TRUE)
estimator |
a neural point estimator |
Z |
either a list of data sets simulated conditionally on the fitted parameters (parametric bootstrap); or a single observed data set containing independent replicates, which will be sampled with replacement |
B |
number of non-parametric bootstrap samples |
blocks |
integer vector specifying the blocks in non-parameteric bootstrap. For example, with 5 replicates, the first two corresponding to block 1 and the remaining three corresponding to block 2, |
use_gpu |
boolean indicating whether to use the GPU if it is available |
d × B matrix, where d is the dimension of the parameter vector
For data Z with missing (NA) entries, computes an augmented data set (U, W) where W encodes the missingness pattern as an indicator vector and U is the original data Z with missing entries replaced by a fixed constant c.
encodedata(Z, c = 0)encodedata(Z, c = 0)
Z |
data containing |
c |
fixed constant with which to replace |
Augmented data set (U, W). If Z is provided as a list, the return type will be a JuliaProxy object; these objects can be indexed in the usual manner using [[, or converted to an R object using juliaGet() (note however that juliaGet() can be slow for large data sets).
the Julia version of encodedata()
## Not run: library("NeuralEstimators") Z <- matrix(c(1, 2, NA, NA, 5, 6, 7, NA, 9), nrow = 3) encodedata(Z) encodedata(list(Z, Z)) ## End(Not run)## Not run: library("NeuralEstimators") Z <- matrix(c(1, 2, NA, NA, 5, 6, 7, NA, 9), nrow = 3) encodedata(Z) encodedata(list(Z, Z)) ## End(Not run)
Apply a neural Bayes estimator to data
estimate(estimator, Z, X = NULL, batchsize = 32, ...)estimate(estimator, Z, X = NULL, batchsize = 32, ...)
estimator |
a neural estimator that can be applied to data in a call of the form |
Z |
data in a format amenable to the neural-network architecture of |
X |
additional inputs to the neural network; if provided, the call will be of the form |
batchsize |
the batch size for applying |
... |
additional keyword arguments passed to the Julia version of |
a matrix of outputs resulting from applying estimator to Z (and possibly X)
sampleposterior() for making inference with neural posterior or likelihood-to-evidence-ratio estimators
Load a saved state of a neural estimator (e.g., optimised neural-network parameters). Useful for amortised inference, whereby a neural network is trained once and then used repeatedly to make inference with new data sets.
loadstate(estimator, filename)loadstate(estimator, filename)
estimator |
the neural estimator that we wish to load the state into |
filename |
file name (including path) of the neural-network state stored in a |
estimator updated with the saved state
Apply a neural ratio estimator to data
logratio(estimator, Z, grid, batchsize = 32, ...)logratio(estimator, Z, grid, batchsize = 32, ...)
estimator |
a neural ratio estimator |
Z |
data in a format amenable to the neural-network architecture of |
grid |
matrix of parameter values, where each column is a parameter configuration |
batchsize |
the batch size |
... |
additional keyword arguments passed to the Julia version of |
A matrix of log ratios with one row per data set and one column per grid point
sampleposterior() for making posterior inferences
Plot the empirical sampling distribution of an estimator.
plotdistribution( df, type = c("box", "density", "scatter"), parameter_labels = NULL, estimator_labels = ggplot2::waiver(), truth_colour = "black", truth_size = 8, truth_line_size = NULL, pairs = FALSE, upper_triangle_plots = NULL, legend = TRUE, return_list = FALSE, flip = FALSE )plotdistribution( df, type = c("box", "density", "scatter"), parameter_labels = NULL, estimator_labels = ggplot2::waiver(), truth_colour = "black", truth_size = 8, truth_line_size = NULL, pairs = FALSE, upper_triangle_plots = NULL, legend = TRUE, return_list = FALSE, flip = FALSE )
df |
a long form data frame containing fields |
type |
string indicating whether to plot kernel density estimates for each individual parameter ( |
parameter_labels |
a named vector containing parameter labels. |
estimator_labels |
a named vector containing estimator labels. |
truth_colour |
the colour used to denote the true parameter value. |
truth_size |
the size of the point used to denote the true parameter value (applicable only for |
truth_line_size |
the size of the cross-hairs used to denote the true parameter value. If |
pairs |
logical; should we combine the scatter plots into a single pairs plot (applicable only for |
upper_triangle_plots |
an optional list of plots to include in the uppertriangle of the pairs plot. |
legend |
Flag; should we include the legend (only applies when constructing a pairs plot) |
return_list |
Flag; should the parameters be split into a list? |
flip |
Flag; should the boxplots be "flipped" using |
a list of 'ggplot' objects or, if pairs = TRUE, a single 'ggplot'.
## Not run: # In the following, we have two estimators and, for each parameter, 50 estimates # from each estimator. estimators <- c("Estimator 1", "Estimator 2") estimator_labels <- c("Estimator 1" = expression(hat(theta)[1]("·")), "Estimator 2" = expression(hat(theta)[2]("·"))) # Single parameter: df <- data.frame( estimator = estimators, truth = 0, parameter = "mu", estimate = rnorm(2*50), replicate = rep(1:50, each = 2) ) parameter_labels <- c("mu" = expression(mu)) plotdistribution(df) plotdistribution(df, type = "density") plotdistribution(df, parameter_labels = parameter_labels, estimator_labels = estimator_labels) # Two parameters: df <- rbind(df, data.frame( estimator = estimators, truth = 1, parameter = "sigma", estimate = rgamma(2*50, shape = 1, rate = 1), replicate = rep(1:50, each = 2) )) parameter_labels <- c(parameter_labels, "sigma" = expression(sigma)) plotdistribution(df, parameter_labels = parameter_labels) plotdistribution(df, parameter_labels = parameter_labels, type = "density") plotdistribution(df, parameter_labels = parameter_labels, type = "scatter") # Three parameters: df <- rbind(df, data.frame( estimator = estimators, truth = 0.25, parameter = "alpha", estimate = 0.5 * runif(2*50), replicate = rep(1:50, each = 2) )) parameter_labels <- c(parameter_labels, "alpha" = expression(alpha)) plotdistribution(df, parameter_labels = parameter_labels) plotdistribution(df, parameter_labels = parameter_labels, type = "density") plotdistribution(df, parameter_labels = parameter_labels, type = "scatter") plotdistribution(df, parameter_labels = parameter_labels, type = "scatter", pairs = TRUE) # Pairs plot with user-specified plots in the upper triangle: upper_triangle_plots <- lapply(1:3, function(i) { x = rnorm(10) y = rnorm(10) shape = sample(c("Class 1", "Class 2"), 10, replace = TRUE) ggplot() + geom_point(aes(x = x, y = y, shape = shape)) + labs(shape = "") + theme_bw() }) plotdistribution( df, parameter_labels = parameter_labels, estimator_labels = estimator_labels, type = "scatter", pairs = TRUE, upper_triangle_plots = upper_triangle_plots ) ## End(Not run)## Not run: # In the following, we have two estimators and, for each parameter, 50 estimates # from each estimator. estimators <- c("Estimator 1", "Estimator 2") estimator_labels <- c("Estimator 1" = expression(hat(theta)[1]("·")), "Estimator 2" = expression(hat(theta)[2]("·"))) # Single parameter: df <- data.frame( estimator = estimators, truth = 0, parameter = "mu", estimate = rnorm(2*50), replicate = rep(1:50, each = 2) ) parameter_labels <- c("mu" = expression(mu)) plotdistribution(df) plotdistribution(df, type = "density") plotdistribution(df, parameter_labels = parameter_labels, estimator_labels = estimator_labels) # Two parameters: df <- rbind(df, data.frame( estimator = estimators, truth = 1, parameter = "sigma", estimate = rgamma(2*50, shape = 1, rate = 1), replicate = rep(1:50, each = 2) )) parameter_labels <- c(parameter_labels, "sigma" = expression(sigma)) plotdistribution(df, parameter_labels = parameter_labels) plotdistribution(df, parameter_labels = parameter_labels, type = "density") plotdistribution(df, parameter_labels = parameter_labels, type = "scatter") # Three parameters: df <- rbind(df, data.frame( estimator = estimators, truth = 0.25, parameter = "alpha", estimate = 0.5 * runif(2*50), replicate = rep(1:50, each = 2) )) parameter_labels <- c(parameter_labels, "alpha" = expression(alpha)) plotdistribution(df, parameter_labels = parameter_labels) plotdistribution(df, parameter_labels = parameter_labels, type = "density") plotdistribution(df, parameter_labels = parameter_labels, type = "scatter") plotdistribution(df, parameter_labels = parameter_labels, type = "scatter", pairs = TRUE) # Pairs plot with user-specified plots in the upper triangle: upper_triangle_plots <- lapply(1:3, function(i) { x = rnorm(10) y = rnorm(10) shape = sample(c("Class 1", "Class 2"), 10, replace = TRUE) ggplot() + geom_point(aes(x = x, y = y, shape = shape)) + labs(shape = "") + theme_bw() }) plotdistribution( df, parameter_labels = parameter_labels, estimator_labels = estimator_labels, type = "scatter", pairs = TRUE, upper_triangle_plots = upper_triangle_plots ) ## End(Not run)
Plot estimates vs. true values.
plotestimates( df, estimator_labels = ggplot2::waiver(), parameter_labels = NULL )plotestimates( df, estimator_labels = ggplot2::waiver(), parameter_labels = NULL )
df |
a long form data frame containing fields |
estimator_labels |
a named vector containing estimator labels. |
parameter_labels |
a named vector containing parameter labels. |
a 'ggplot' of the estimates for each parameter against the true value.
## Not run: K <- 50 df <- data.frame( estimator = c("Estimator 1", "Estimator 2"), parameter = rep(c("mu", "sigma"), each = K), truth = 1:(2*K), estimate = 1:(2*K) + rnorm(4*K) ) estimator_labels <- c("Estimator 1" = expression(hat(theta)[1]("·")), "Estimator 2" = expression(hat(theta)[2]("·"))) parameter_labels <- c("mu" = expression(mu), "sigma" = expression(sigma)) plotestimates(df, parameter_labels = parameter_labels, estimator_labels) ## End(Not run)## Not run: K <- 50 df <- data.frame( estimator = c("Estimator 1", "Estimator 2"), parameter = rep(c("mu", "sigma"), each = K), truth = 1:(2*K), estimate = 1:(2*K) + rnorm(4*K) ) estimator_labels <- c("Estimator 1" = expression(hat(theta)[1]("·")), "Estimator 2" = expression(hat(theta)[2]("·"))) parameter_labels <- c("mu" = expression(mu), "sigma" = expression(sigma)) plotestimates(df, parameter_labels = parameter_labels, estimator_labels) ## End(Not run)
computes a Monte Carlo approximation of an estimator's Bayes risk
risk( assessment, loss = function(x, y) abs(x - y), average_over_parameters = FALSE, average_over_sample_sizes = TRUE )risk( assessment, loss = function(x, y) abs(x - y), average_over_parameters = FALSE, average_over_sample_sizes = TRUE )
assessment |
an object returned by |
loss |
a binary operator defining the loss function (default absolute-error loss) |
average_over_parameters |
if |
average_over_sample_sizes |
if |
a dataframe giving an estimate of the Bayes risk
computes a Monte Carlo approximation of an estimator's root-mean-square error (RMSE)
rmse(assessment, ...)rmse(assessment, ...)
assessment |
an object returned by |
... |
optional arguments inherited from |
a dataframe giving the estimated RMSE
Samples from the approximate posterior distribution given data Z.
sampleposterior(estimator, Z, N = 1000, ...)sampleposterior(estimator, Z, N = 1000, ...)
estimator |
a neural posterior or likelihood-to-evidence-ratio estimator |
Z |
data in a format amenable to the neural-network architecture of |
N |
number of approximate posterior samples to draw |
... |
additional keyword arguments passed to the Julia version of |
d × N matrix of posterior samples, where d is the dimension of the parameter vector. If Z contains multiple independent data sets, a list of matrices will be returned
estimate() for making inference with neural Bayes estimators
save the state of a neural estimator
savestate(estimator, filename)savestate(estimator, filename)
estimator |
the neural estimator that we wish to save |
filename |
file in which to save the neural-network state as a |
No return value, called for side effects
Constructs a graph object for use in a graph neural network (GNN).
spatialgraph(S, Z, isotropic = TRUE, stationary = TRUE, ...)spatialgraph(S, Z, isotropic = TRUE, stationary = TRUE, ...)
S |
Spatial locations, provided as:
|
Z |
Spatial data, provided as:
|
isotropic |
Logical. If |
stationary |
Logical. If |
... |
Additional keyword arguments from the Julia function |
A GNNGraph (JuliaProxy object) or, if multiple data sets are provided, a vector of GNNGraph objects which can be indexed in the usual manner using [[ or converted to an R list using a combination of indexing and lapply.
## Not run: library("NeuralEstimators") # Number of replicates m <- 5 # Spatial locations fixed for all replicates n <- 100 S <- matrix(runif(n * 2), n, 2) Z <- matrix(runif(n * m), n, m) g <- spatialgraph(S, Z) # Spatial locations varying between replicates n <- sample(50:100, m, replace = TRUE) S <- lapply(n, function(ni) matrix(runif(ni * 2), ni, 2)) Z <- lapply(n, function(ni) runif(ni)) g <- spatialgraph(S, Z) # Multiple data sets: Spatial locations fixed for all replicates within a given data set K <- 15 # number of data sets n <- sample(50:100, K, replace = TRUE) # number of spatial locations can vary between data sets S <- lapply(1:K, function(k) matrix(runif(n[k] * 2), n[k], 2)) Z <- lapply(1:K, function(k) matrix(runif(n[k] * m), n[k], m)) g <- spatialgraph(S, Z) # Multiple data sets: Spatial locations varying between replicates within a given data set S <- lapply(1:K, function(k) { lapply(1:m, function(i) { ni <- sample(50:100, 1) # randomly generate the number of locations for each replicate matrix(runif(ni * 2), ni, 2) # generate the spatial locations }) }) Z <- lapply(1:K, function(k) { lapply(1:m, function(i) { n <- nrow(S[[k]][[i]]) runif(n) }) }) g <- spatialgraph(S, Z) ## End(Not run)## Not run: library("NeuralEstimators") # Number of replicates m <- 5 # Spatial locations fixed for all replicates n <- 100 S <- matrix(runif(n * 2), n, 2) Z <- matrix(runif(n * m), n, m) g <- spatialgraph(S, Z) # Spatial locations varying between replicates n <- sample(50:100, m, replace = TRUE) S <- lapply(n, function(ni) matrix(runif(ni * 2), ni, 2)) Z <- lapply(n, function(ni) runif(ni)) g <- spatialgraph(S, Z) # Multiple data sets: Spatial locations fixed for all replicates within a given data set K <- 15 # number of data sets n <- sample(50:100, K, replace = TRUE) # number of spatial locations can vary between data sets S <- lapply(1:K, function(k) matrix(runif(n[k] * 2), n[k], 2)) Z <- lapply(1:K, function(k) matrix(runif(n[k] * m), n[k], m)) g <- spatialgraph(S, Z) # Multiple data sets: Spatial locations varying between replicates within a given data set S <- lapply(1:K, function(k) { lapply(1:m, function(i) { ni <- sample(50:100, 1) # randomly generate the number of locations for each replicate matrix(runif(ni * 2), ni, 2) # generate the spatial locations }) }) Z <- lapply(1:K, function(k) { lapply(1:m, function(i) { n <- nrow(S[[k]][[i]]) runif(n) }) }) g <- spatialgraph(S, Z) ## End(Not run)
For k > 0, defines Julia code that defines the loss function,
which approximates the 0-1 loss as k tends to zero.
The resulting string is intended to be used in the function train, but can also be converted to a callable function using juliaEval.
tanhloss(k)tanhloss(k)
k |
Positive numeric value that controls the smoothness of the approximation. |
String defining the tanh loss function in Julia code.
The function caters for different variants of "on-the-fly" simulation.
Specifically, a sampler can be provided to continuously sample new
parameter vectors from the prior, and a simulator can be provided to
continuously simulate new data conditional on the parameters. If provided
with specific sets of parameters (theta_train and theta_val)
and/or data (Z_train and Z_val), they will be held fixed during
training.
Note that using R functions to perform "on-the-fly" simulation requires the user to have installed the Julia package RCall.
train( estimator, sampler = NULL, simulator = NULL, theta_train = NULL, theta_val = NULL, Z_train = NULL, Z_val = NULL, K = 10000, m = NULL, sampler_args = NULL, sampler_kwargs = NULL, simulator_args = NULL, simulator_kwargs = NULL, loss = "absolute-error", learning_rate = 5e-04, epochs = 100, batchsize = 32, savepath = NULL, stopping_epochs = 5, epochs_per_Z_refresh = 1, epochs_per_theta_refresh = 1, simulate_just_in_time = FALSE, use_gpu = TRUE, verbose = TRUE )train( estimator, sampler = NULL, simulator = NULL, theta_train = NULL, theta_val = NULL, Z_train = NULL, Z_val = NULL, K = 10000, m = NULL, sampler_args = NULL, sampler_kwargs = NULL, simulator_args = NULL, simulator_kwargs = NULL, loss = "absolute-error", learning_rate = 5e-04, epochs = 100, batchsize = 32, savepath = NULL, stopping_epochs = 5, epochs_per_Z_refresh = 1, epochs_per_theta_refresh = 1, simulate_just_in_time = FALSE, use_gpu = TRUE, verbose = TRUE )
estimator |
a neural estimator |
sampler |
a function that takes an integer |
simulator |
a function that takes a px |
theta_train |
a set of parameters used for updating the estimator using stochastic gradient descent |
theta_val |
a set of parameters used for monitoring the performance of the estimator during training |
Z_train |
a simulated data set used for updating the estimator using stochastic gradient descent |
Z_val |
a simulated data set used for monitoring the performance of the estimator during training |
K |
the number of parameter vectors sampled in the training set at each epoch; the size of the validation set is set to |
m |
deprecated; use |
sampler_args |
a list of positional arguments passed to the parameter sampler; if provided, |
sampler_kwargs |
a list of positional arguments passed to the parameter sampler; if provided, |
simulator_args |
a list of positional arguments passed to the data simulator; if provided, |
simulator_kwargs |
a list of positional arguments passed to the data simulator; if provided, |
loss |
the loss function: a string ('absolute-error' for mean-absolute-error loss or 'squared-error' for mean-squared-error loss), or a string of Julia code defining the loss function. For some classes of estimators (e.g., |
learning_rate |
the initial learning rate for the optimiser ADAM (default 5e-4) |
epochs |
the number of epochs to train the neural network. An epoch is one complete pass through the entire training data set when doing stochastic gradient descent. |
batchsize |
the batchsize to use when performing stochastic gradient descent, that is, the number of training samples processed between each update of the neural-network parameters. |
savepath |
path to save the trained estimator and other information; if null (default), nothing is saved. Otherwise, the neural-network parameters (i.e., the weights and biases) will be saved during training as |
stopping_epochs |
cease training if the risk doesn't improve in this number of epochs (default 5). |
epochs_per_Z_refresh |
integer indicating how often to refresh the training data |
epochs_per_theta_refresh |
integer indicating how often to refresh the training parameters; must be a multiple of |
simulate_just_in_time |
flag indicating whether we should simulate "just-in-time", in the sense that only a |
use_gpu |
a boolean indicating whether to use the GPU if one is available |
verbose |
a boolean indicating whether information, including empirical risk values and timings, should be printed to the console during training. |
a trained neural estimator or, if m is a vector, a list of trained neural estimators
the Julia version of train(), assess() for assessing an estimator post training, and estimate()/sampleposterior() for making inference with observed data
## Not run: # Construct a neural Bayes estimator for replicated univariate Gaussian # data with unknown mean and standard deviation. # Load R and Julia packages library("NeuralEstimators") library("JuliaConnectoR") juliaEval("using NeuralEstimators, Flux") # Define the neural-network architecture estimator <- juliaEval(' n = 1 # dimension of each replicate d = 2 # number of parameters in the model w = 32 # width of each hidden layer psi = Chain(Dense(n, w, relu), Dense(w, w, relu)) phi = Chain(Dense(w, w, relu), Dense(w, d)) deepset = DeepSet(psi, phi) estimator = PointEstimator(deepset) ') # Sampler from the prior sampler <- function(K) { mu <- rnorm(K) # Gaussian prior for the mean sigma <- rgamma(K, 1) # Gamma prior for the standard deviation theta <- matrix(c(mu, sigma), byrow = TRUE, ncol = K) return(theta) } # Data simulator simulator <- function(theta_set, m) { apply(theta_set, 2, function(theta) { t(rnorm(m, theta[1], theta[2])) }, simplify = FALSE) } # Train using fixed parameter and data sets theta_train <- sampler(10000) theta_val <- sampler(2000) m <- 30 # number of iid replicates Z_train <- simulator(theta_train, m) Z_val <- simulator(theta_val, m) estimator <- train(estimator, theta_train = theta_train, theta_val = theta_val, Z_train = Z_train, Z_val = Z_val) ##### Simulation on-the-fly using R functions #### juliaEval("using RCall") # requires the Julia package RCall estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m) ##### Simulation on-the-fly using Julia functions #### # Defining the sampler and simulator in Julia can improve computational # efficiency by avoiding the overhead of communicating between R and Julia. juliaEval("using Distributions") # Parameter sampler sampler <- juliaEval(" function sampler(K) mu = rand(Normal(0, 1), K) sigma = rand(Gamma(1), K) theta = hcat(mu, sigma)' return theta end") # Data simulator simulator <- juliaEval(" function simulator(theta_matrix, m) Z = [rand(Normal(theta[1], theta[2]), 1, m) for theta in eachcol(theta_matrix)] return Z end") # Train estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m) ## End(Not run)## Not run: # Construct a neural Bayes estimator for replicated univariate Gaussian # data with unknown mean and standard deviation. # Load R and Julia packages library("NeuralEstimators") library("JuliaConnectoR") juliaEval("using NeuralEstimators, Flux") # Define the neural-network architecture estimator <- juliaEval(' n = 1 # dimension of each replicate d = 2 # number of parameters in the model w = 32 # width of each hidden layer psi = Chain(Dense(n, w, relu), Dense(w, w, relu)) phi = Chain(Dense(w, w, relu), Dense(w, d)) deepset = DeepSet(psi, phi) estimator = PointEstimator(deepset) ') # Sampler from the prior sampler <- function(K) { mu <- rnorm(K) # Gaussian prior for the mean sigma <- rgamma(K, 1) # Gamma prior for the standard deviation theta <- matrix(c(mu, sigma), byrow = TRUE, ncol = K) return(theta) } # Data simulator simulator <- function(theta_set, m) { apply(theta_set, 2, function(theta) { t(rnorm(m, theta[1], theta[2])) }, simplify = FALSE) } # Train using fixed parameter and data sets theta_train <- sampler(10000) theta_val <- sampler(2000) m <- 30 # number of iid replicates Z_train <- simulator(theta_train, m) Z_val <- simulator(theta_val, m) estimator <- train(estimator, theta_train = theta_train, theta_val = theta_val, Z_train = Z_train, Z_val = Z_val) ##### Simulation on-the-fly using R functions #### juliaEval("using RCall") # requires the Julia package RCall estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m) ##### Simulation on-the-fly using Julia functions #### # Defining the sampler and simulator in Julia can improve computational # efficiency by avoiding the overhead of communicating between R and Julia. juliaEval("using Distributions") # Parameter sampler sampler <- juliaEval(" function sampler(K) mu = rand(Normal(0, 1), K) sigma = rand(Gamma(1), K) theta = hcat(mu, sigma)' return theta end") # Data simulator simulator <- juliaEval(" function simulator(theta_matrix, m) Z = [rand(Normal(theta[1], theta[2]), 1, m) for theta in eachcol(theta_matrix)] return Z end") # Train estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m) ## End(Not run)