Priors

TL;DR: If you're unsure which prior to use, just stick to the default integrated Wiener process prior IWP!

Background

We model the ODE solution $y(t)$ with a Gauss–Markov prior. More precisely, let

\[\begin{aligned} Y(t) = \left[ Y^{(0)}(t), Y^{(1)}(t), \dots Y^{(q)}(t) \right], \end{aligned}\]

be the solution to the SDE

\[\begin{aligned} \text{d} Y^{(i)}(t) &= Y^{(i+1)}(t) \ \text{d}t, \qquad i = 0, \dots, q-1, \\ \text{d} Y^{(q)}(t) &= \textcolor{#389826}{A} Y(t) \ \text{d}t + \textcolor{#4063D8}{\Gamma} \ \text{d}W(t), \\ Y(0) &\sim \textcolor{purple}{ \mathcal{N} \left( \mu_0, \Sigma_0 \right) }. \end{aligned}\]

Then $Y^{(i)}(t)$ models the $i$-th derivative of $y(t)$. In this section, we consider choices relating to the drift matrix $\textcolor{#389826}{A}$. If you're more interested in the diffusion $\textcolor{#4063D8}{\Gamma}$ check out the Diffusion models and calibration section, and for info on the initial distribution $\textcolor{purple}{ \mathcal{N} \left( \mu_0, \Sigma_0 \right) }$ check out the Initialization section.

Info

If you're unsure which prior to use, just stick to the integrated Wiener process prior IWP! This is also the default choice for all solvers. The other priors are rather experimental / niche at the time of writing.

Prior choices

ProbNumDiffEq.jl currently supports three classes of priors for the solvers:

  • IWP: $q$-times integrated Wiener processes
  • IOUP: $q$-times integrated Ornstein–Uhlenbeck processes
  • Matern: Matérn processes

Let's look at each of them in turn and visualize some examples.

Integrated Wiener process (IWP)

ProbNumDiffEq.IWPType
IWP([dim::Integer=1,] num_derivatives::Integer)

$q$-times integrated Wiener process.

This is the recommended prior! The IWP is the most common prior choice in the probabilistic ODE solver literature (see the references) and is the default choice for the solvers in ProbNumDiffEq.jl. It is also the prior that has the most efficient implementation.

The IWP can be created without specifying the dimension of the Wiener process, in which case it will be one-dimensional. The ODE solver then assumes that each dimension of the ODE solution should be modeled with the same prior. This is typically the preferred usage.

In math

The IWP is a Gauss–Markov process, which we model with a state representation

\[\begin{aligned} Y(t) = \left[ Y^{(0)}(t), Y^{(1)}(t), \dots Y^{(q)}(t) \right], \end{aligned}\]

defined as the solution of the stochastic differential equation

\[\begin{aligned} \text{d} Y^{(i)}(t) &= Y^{(i+1)}(t) \ \text{d}t, \qquad i = 0, \dots, q-1 \\ \text{d} Y^{(q)}(t) &= \Gamma \ \text{d}W(t). \end{aligned}\]

Then, $Y^{(0)}(t)$ is the $q$-times integrated Wiener process (IWP) and $Y^{(i)}(t)$ is the $i$-th derivative of the IWP, for $i = 1, \dots, q$.

Examples

julia> solve(prob, EK1(prior=IWP(2)))
source

Here is how the IWP looks for varying smoothness parameters $q$:

using ProbNumDiffEq, Plots
plotrange = range(0, 10, length=250)
plot(
    plot(IWP(1), plotrange; title="q=1"),
    plot(IWP(2), plotrange; title="q=2"),
    plot(IWP(3), plotrange; title="q=3"),
    plot(IWP(4), plotrange; title="q=4");
    ylims=(-20,20),
)
Example block output

In the context of ODE solvers, the smoothness parameter $q$ influences the convergence rate of the solver, and so it is typically chose similarly to the order of a Runge–Kutta method: lower order for low accuracy, higher order for high accuracy.

Integrated Ornstein–Uhlenbeck process (IOUP)

The $q$-times integrated Ornstein–Uhlenbeck process prior IOUP is a generalization of the IWP prior, where the drift matrix $\textcolor{#389826}{A}$ is not zero:

ProbNumDiffEq.IOUPType
IOUP([dim::Integer=1,]
     num_derivatives::Integer,
     rate_parameter::Union{Number,AbstractVector,AbstractMatrix})

Integrated Ornstein–Uhlenbeck process.

This prior is mostly used in the context of Probabilistic Exponential Integrators to include the linear part of a semi-linear ODE in the prior, so it is used in the ExpEK and the RosenbrockExpEK.

In math

The IOUP is a Gauss–Markov process, which we model with a state representation

\[\begin{aligned} Y(t) = \left[ Y^{(0)}(t), Y^{(1)}(t), \dots Y^{(q)}(t) \right], \end{aligned}\]

defined as the solution of the stochastic differential equation

\[\begin{aligned} \text{d} Y^{(i)}(t) &= Y^{(i+1)}(t) \ \text{d}t, \qquad i = 0, \dots, q-1 \\ \text{d} Y^{(q)}(t) &= L Y^{(q)}(t) \ \text{d}t + \Gamma \ \text{d}W(t), \end{aligned}\]

where $L$ is called the rate_parameter. Then, $Y^{(0)}(t)$ is the $q$-times integrated Ornstein–Uhlenbeck process (IOUP) and $Y^{(i)}(t)$ is the $i$-th derivative of the IOUP, for $i = 1, \dots, q$.

Examples

julia> solve(prob, EK1(prior=IOUP(2, -1)))
source

Here is how the IOUP looks for varying rate parameters:

using ProbNumDiffEq, Plots
plotrange = range(0, 10, length=250)
plot(
    plot(IOUP(1, -1), plotrange; title="q=1,L=-1", ylims=(-20,20)),
    plot(IOUP(1, 1), plotrange; title="q=1,L=1", ylims=(-20,20)),
    plot(IOUP(4, -1), plotrange; title="q=4,L=-1", ylims=(-50,50)),
    plot(IOUP(4, 1), plotrange; title="q=4,L=1", ylims=(-50,50));
)
Example block output

In the context of Probabilistic Exponential Integrators, the rate parameter is often chosen according to the given ODE. Here is an example for a damped oscillator:

plot(IOUP(dim=2, num_derivatives=1, rate_parameter=[-0.2 -2π; 2π -0.2]),
     plotrange; plot_title="damped oscillator prior")
Example block output

Matérn process (Matern)

ProbNumDiffEq.MaternType
Matern([dim::Integer=1,]
       num_derivatives::Integer,
       lengthscale::Number)

Matern process.

The class of Matern processes is well-known in the Gaussian process literature, and they also have a corresponding SDE representation similarly to the IWP and the IOUP. See also [6] for more details.

In math

A Matern process is a Gauss–Markov process, which we model with a state representation

\[\begin{aligned} Y(t) = \left[ Y^{(0)}(t), Y^{(1)}(t), \dots Y^{(q)}(t) \right], \end{aligned}\]

defined as the solution of the stochastic differential equation

\[\begin{aligned} \text{d} Y^{(i)}(t) &= Y^{(i+1)}(t) \ \text{d}t, \qquad i = 0, \dots, q-1 \\ \text{d} Y^{(q)}(t) &= - \sum_{j=0}^q \left( \begin{pmatrix} q+1 \\ j \end{pmatrix} \left( \frac{\sqrt{2q - 1}}{l} \right)^{q-j} Y^{(j)}(t) \right) \ \text{d}t + \Gamma \ \text{d}W(t). \end{aligned}\]

where $l$ is called the lengthscale parameter. Then, $Y^{(0)}(t)$ is a Matern process and $Y^{(i)}(t)$ is the $i$-th derivative of this process, for $i = 1, \dots, q$.

Examples

julia> solve(prob, EK1(prior=Matern(2, 1)))
source

Here is how the Matern looks for varying smoothness parameters $q$:

using ProbNumDiffEq, Plots
plotrange = range(0, 10, length=250)
plot(
    plot(Matern(1, 1), plotrange; title="q=1"),
    plot(Matern(2, 1), plotrange; title="q=2"),
    plot(Matern(3, 1), plotrange; title="q=3"),
    plot(Matern(4, 1), plotrange; title="q=4");
)
Example block output

and for varying length scales $\ell$:

plot(
    plot(Matern(2, 5), plotrange; title="l=5"),
    plot(Matern(2, 2), plotrange; title="l=2"),
    plot(Matern(2, 1), plotrange; title="l=1"),
    plot(Matern(2, 0.5), plotrange; title="l=0.5"),
)
Example block output

API

ProbNumDiffEq.AbstractGaussMarkovProcessType
AbstractGaussMarkovProcess{elType}

Abstract type for Gauss-Markov processes.

Gauss-Markov processes are solutions to linear time-invariant stochastic differential equations (SDEs). Here we assume SDEs of the form

\[\begin{aligned} dX_t &= F X_t dt + L dW_t \\ X_0 &= \mathcal{N} \left( X_0; \mu_0, \Sigma_0 \right) \end{aligned}\]

where $X_t$ is the state, $W_t$ is a Wiener process, and $F$ and $L$ are matrices.

Currently, ProbNumDiffEq.jl makes many assumptions about the structure of the SDEs that it can solve. In particular, it assumes that the state vector $X_t$ contains a range of dervatives, and that the Wiener process only enters the highest one. It also assumes a certain ordering of dimensions and derivatives. This is not a limitation of the underlying mathematics, but rather a limitation of the current implementation. In the future, we hope to remove these limitations.

Warning

We currently strongly recommended to not implement your own Gauss-Markov process by subtyping this type! The interface is not yet stable, and the implementation is not yet sufficiently documented. Proceed at your own risk.

source
ProbNumDiffEq.LTISDEType
LTISDE(F::AbstractMatrix, L::AbstractMatrix)

Linear time-invariant stochastic differential equation.

A LTI-SDE is a stochastic differential equation of the form

\[dX_t = F X_t dt + L dW_t\]

where $X_t$ is the state, $W_t$ is a Wiener process, and $F$ and $L$ are matrices. This LTISDE object holds the matrices $F$ and $L$. It also provides some functionality to discretize the SDE via a matrix-fraction decomposition. See: discretize(::LTISDE, ::Real).

source
ProbNumDiffEq.dimFunction
dim(p::AbstractGaussMarkovProcess)

Return the dimension of the process.

This is not the dimension of the "state" that is used to efficiently model the prior process as a state-space model, but it is the dimension of the process itself that we aim to model.

See AbstractGaussMarkovProcess for more details on Gauss-Markov processes in ProbNumDiffEq.

source
ProbNumDiffEq.to_sdeFunction
to_sde(p::AbstractGaussMarkovProcess)

Convert the prior to the corresponding SDE.

Gauss-Markov processes are solutions to linear time-invariant stochastic differential equations (SDEs) of the form

\[\begin{aligned} dX_t &= F X_t dt + L dW_t \\ X_0 &= \mathcal{N} \left( X_0; \mu_0, \Sigma_0 \right) \end{aligned}\]

where $X_t$ is the state, $W_t$ is a Wiener process, and $F$ and $L$ are matrices. This function returns the corresponding SDE, i.e. the matrices $F$ and $L$, as a LTISDE.

source
ProbNumDiffEq.discretizeFunction
discretize(p::AbstractGaussMarkovProcess, step_size::Real)

Compute the transition matrices of the process for a given step size.

source
discretize(p::LTISDE, step_size::Real)

Compute the transition matrices of the SDE solution for a given step size.

source
ProbNumDiffEq.initial_distributionFunction
initial_distribution(p::AbstractGaussMarkovProcess)

Return the initial distribution of the process.

Currently this is always a Gaussian distribution with zero mean and unit variance, unless explicitly overwitten (e.g. for Matern processes to have the stationary distribution). This implementation is likely to change in the future to allow for more flexibility.

source

Convenience functions to analyze and visualize priors

ProbNumDiffEq.marginalizeFunction
marginalize(process::AbstractGaussMarkovProcess, times)

Compute the marginal distributions of the process at the given time points.

This function computes the marginal distributions of the process at the given times. It does so by discretizing the process with the given step sizes (using ProbNumDiffEq.discretize), and then computing the marginal distributions of the resulting Gaussian distributions.

See also: sample.

source