Note

Go to the end to download the full example code or to run this example in your browser via Binder

Diagnostics for MCMC methods

This notebook illustrates the use of a few diagnostics for MCMC. The example consists in learning the parameters of a regression model via Bayesian inference.

Import the necessary libraries.

from UQpy.sampling import MetropolisHastings
from UQpy.distributions import Normal, JointIndependent
import numpy as np
import matplotlib.pyplot as plt

Example setup

First generate data then define the target log pdf = log likelihood \(p(data \vert x)\).

param_true = [1., 2.]
domain = np.linspace(0, 10, 50)
data_clean = param_true[0] * domain + param_true[1] * domain ** 2
rstate = np.random.RandomState(123)
data_noisy = data_clean + rstate.randn(*data_clean.shape)

from scipy.stats import norm


def log_target(x, data, x_domain):
    log_target_value = np.zeros(x.shape[0])
    for i, xx in enumerate(x):
        h_xx = xx[0] * x_domain + xx[1] * x_domain ** 2
        log_target_value[i] = np.sum([norm.logpdf(hxi - datai) for hxi, datai in zip(h_xx, data)])
    return log_target_value


proposal = JointIndependent([Normal(scale=0.1), Normal(scale=0.05)])

sampler = MetropolisHastings(nsamples=500, dimension=2, log_pdf_target=log_target,
                             burn_length=10, jump=10, n_chains=1,
                             args_target=(data_noisy, domain), proposal=proposal)

print(sampler.samples.shape)
samples = sampler.samples

plt.plot(samples[:, 0], samples[:, 1], 'o', alpha=0.5)
plt.plot(1., 2., marker='x', color='orange')
plt.show()

Look at acceptance rate of the chain(s)

print(sampler.acceptance_rate)

This acceptance rate is quite low, showing that the proposal should probably be refined !

Graphics

fig, ax = plt.subplots(nrows=2, ncols=3, figsize=(15, 7))
for j in range(samples.shape[1]):
    ax[j, 0].plot(np.arange(samples.shape[0]), samples[:, j])
    ax[j, 0].set_title('chain - parameter # {}'.format(j + 1))
    ax[j, 1].plot(np.arange(samples.shape[0]), np.cumsum(samples[:, j]) / np.arange(samples.shape[0]))
    ax[j, 1].set_title('parameter convergence')
    ax[j, 2].acorr(samples[:, j] - np.mean(samples[:, j]), maxlags=40, normed=True)
    ax[j, 2].set_title('correlation between samples')
plt.show()

Compute effective sample size (ESS)

In MCMC, the \(ESS\) has been used to derive termination rules, based on the quality of the estimation. In brief, the simulation stops when the computational uncertainty on a chosen quantity (for instance, the expected value of RV \(x\)) is small compared to its posterior uncertainty [1]. Mathematically, this allows computation of the \(ESS\) and a minimum value \(ESS_{min}\) (qualitatively, an acceptable approximation is reached if \(ESS > ESS_{min}\)). These quantities can be computed by looking at each marginal density (is \(x\) is a multivariate random variable), or by looking at the joint. The reader is referred to [1, 2] for more details.

[1] A practical sequential stopping rule for high-dimensional MCMC and its application to spatial-temporal Bayesian models, Gong and Flegal, 2014.

[2] Multivariate Output Analysis for Markov Chain Monte Carlo, Vats and Flegal, 2015

def compute_marginal_ESS(samples, eps_ESS=0.05, alpha_ESS=0.05):
    # Computation of ESS and min ESS, look at each marginal distribution separately
    nsamples, dim = samples.shape

    bn = np.ceil(nsamples ** (1 / 2))  # nb of samples per bin
    an = int(np.ceil(nsamples / bn))  # nb of bins
    idx = np.array_split(np.arange(nsamples), an)

    means_subdivisions = np.empty((an, samples.shape[1]))
    for i, idx_i in enumerate(idx):
        x_sub = samples[idx_i, :]
        means_subdivisions[i, :] = np.mean(x_sub, axis=0)
    Omega = np.cov(samples.T)
    Sigma = np.cov(means_subdivisions.T)

    marginal_ESS = np.empty((dim,))
    min_marginal_ESS = np.empty((dim,))
    for j in range(dim):
        marginal_ESS[j] = nsamples * Omega[j, j] / Sigma[j, j]
        min_marginal_ESS[j] = 4 * norm.ppf(alpha_ESS / 2) ** 2 / eps_ESS ** 2

    return marginal_ESS, min_marginal_ESS


def compute_joint_ESS(samples, eps_ESS=0.05, alpha_ESS=0.05):
    # Computation of ESS and min ESS, look at the joint distribution separately
    nsamples, dim = samples.shape

    bn = np.ceil(nsamples ** (1 / 2))  # nb of samples per bin
    an = int(np.ceil(nsamples / bn))  # nb of bins
    idx = np.array_split(np.arange(nsamples), an)

    means_subdivisions = np.empty((an, samples.shape[1]))
    for i, idx_i in enumerate(idx):
        x_sub = samples[idx_i, :]
        means_subdivisions[i, :] = np.mean(x_sub, axis=0)
    Omega = np.cov(samples.T)
    Sigma = np.cov(means_subdivisions.T)

    from scipy.stats import chi2
    from scipy.special import gamma
    joint_ESS = nsamples * np.linalg.det(Omega) ** (1 / dim) / np.linalg.det(Sigma) ** (1 / dim)
    chi2_value = chi2.ppf(1 - alpha_ESS, df=dim)
    min_joint_ESS = 2 ** (2 / dim) * np.pi / (dim * gamma(dim / 2)) ** (
            2 / dim) * chi2_value / eps_ESS ** 2

    return joint_ESS, min_joint_ESS


marginal_ESS, min_marginal_ESS = compute_marginal_ESS(samples, eps_ESS=0.05, alpha_ESS=0.05)

print('Univariate Effective Sample Size in each dimension:')
for j in range(2):
    print('Dimension {}: ESS = {}, minimum ESS recommended = {}'.
          format(j + 1, marginal_ESS[j], min_marginal_ESS[j]))

    joint_ESS, min_joint_ESS = compute_joint_ESS(samples, eps_ESS=0.05, alpha_ESS=0.05)

    print('\nMultivariate Effective Sample Size:')
    print('Multivariate ESS = {}, minimum ESS recommended = {}'.format(joint_ESS, min_joint_ESS))

Results above show that the chains should probably be run for a longer amount of time in order to obtain reliable results in terms of the expected value of the parameters to be learnt.

Total running time of the script: ( 0 minutes 0.000 seconds)

Gallery generated by Sphinx-Gallery