Fitting a Line
In this notebook, we reproduce the classic “Fitting a Model to Data” example from emcee (https://emcee.readthedocs.io/en/latest/tutorials/line/), fitting a line to data with uncertainties using both emcee and approxposterior. This notebook demonstrates how approxposterior can be used to perform accurate Bayesian inference of model parameters given data with uncertainties.
[1]:
import numpy as np
import matplotlib.pyplot as plt
import emcee
from scipy.optimize import minimize
import corner
import george
from approxposterior import approx, gpUtils as gpu
Data
First, we generate data according to \(y_i = m x_i + b + \epsilon_i\), where the \(\epsilon_i\) are independent Gaussian errors for each measurement \(i\). Then we’ll plot the data to see what it looks like, with the grey line showing the true, underlying model.
[2]:
# Set seed for reproducibility.
seed = 42
np.random.seed(seed)
# Choose the "true" parameters.
mTrue = -0.9594
bTrue = 4.294
# Generate some synthetic data from the model.
N = 50
x = np.sort(10*np.random.rand(N))
obserr = 0.5 # Amplitude of noise term
obs = mTrue * x + bTrue # True model
obs += obserr * np.random.randn(N) # Add some random noise
# Now plot it to see what the data looks like
fig, ax = plt.subplots(figsize=(8,8))
ax.errorbar(x, obs, yerr=obserr, fmt=".k", capsize=0)
x0 = np.linspace(0, 10, 500)
ax.plot(x0, mTrue*x0+bTrue, "k", alpha=0.3, lw=3)
ax.set_xlim(0, 10)
ax.set_xlabel("x")
ax.set_ylabel("obs");
Inference
Now we want to infer posterior probability distributions for our linear model’s parameters, \(\theta\), i.e., the slope and intercept, given the data and uncertainties, \(D\), via Bayes’ Theorem: \(p(\theta | D) \propto l(D | \theta)p(\theta)\), where \(l(D|\theta)\) is the likelihood of the data for a given set of model parameters and \(p(\theta)\) is our assumed prior probability of a given \(\theta\). We sample the posterior distribution using the emcee MCMC code. See the emcee example (https://emcee.readthedocs.io/en/latest/tutorials/line/) for more details of this procedure.
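For independent Gaussian errors with known amplitude \(\sigma\) (obserr above), the log-likelihood implemented in the next cell is, up to an additive constant that does not affect the inference, \(\ln l(D|\theta) = -\frac{1}{2}\sum_i \frac{(y_i - m x_i - b)^2}{\sigma^2}\).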
[3]:
# Define the loglikelihood function
def logLikelihood(theta, x, obs, obserr):
    # Model parameters
    theta = np.array(theta)
    m, b = theta

    # Model predictions given parameters
    model = m * x + b

    # Likelihood of data given model parameters
    return -0.5*np.sum((obs - model)**2/obserr**2)
[4]:
# Define the logprior function
def logPrior(theta):
    # Model parameters
    theta = np.array(theta)
    m, b = theta

    # Probability of model parameters: flat prior within the bounds
    if -5.0 < m < 0.5 and 0.0 < b < 10.0:
        return 0.0
    return -np.inf
[5]:
# Define the logprobability function: ln[l(D|theta) * p(theta)]
# Note: use this for emcee, not approxposterior!
def logProbability(theta, x, obs, obserr):
    lp = logPrior(theta)
    if not np.isfinite(lp):
        return -np.inf
    return lp + logLikelihood(theta, x, obs, obserr)
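As a quick sanity check, evaluating this function at the true parameters gives a finite value, while a slope outside the prior bounds returns -inf:
# Sanity check: finite log-probability inside the prior, -inf outside
print(logProbability([mTrue, bTrue], x, obs, obserr))  # finite
print(logProbability([2.0, 4.0], x, obs, obserr))      # -inf since m = 2.0 > 0.5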
Now that we’ve set up the required functions, we initialize our MCMC sampler in emcee, pick an initial state for the walkers, and run the MCMC!
[6]:
# Randomly initialize walkers
p0 = np.random.randn(32, 2)
nwalkers, ndim = p0.shape
# Set up MCMC sample object - give it the logprobability function
sampler = emcee.EnsembleSampler(nwalkers, ndim, logProbability, args=(x, obs, obserr))
# Run the MCMC for 5000 iterations
sampler.run_mcmc(p0, 5000);
/opt/anaconda3/lib/python3.7/site-packages/emcee/moves/red_blue.py:99: RuntimeWarning: invalid value encountered in double_scalars
lnpdiff = f + nlp - state.log_prob[j]
The MCMC is complete, so let’s examine the joint and marginal posterior probability distributions it derived for the model parameters.
[7]:
fig = corner.corner(sampler.flatchain,
                    quantiles=[0.16, 0.5, 0.84],
                    truths=[mTrue, bTrue],
                    labels=["m", "b"], range=([-1.1, -0.8], [3.7, 4.8]),
                    plot_contours=True, show_titles=True);
Looks good!
Inference with ``approxposterior``
Now let’s see if we can derive similar constraints using approxposterior.

First, approxposterior requires a function that samples model parameters from the prior distributions.
[8]:
def sampleFunction(n):
    """
    Sample model parameters (m, b) from their prior distributions.

    Parameters
    ----------
    n : int
        Number of samples

    Returns
    -------
    sample : floats
        n x 2 array of floats sampled from the prior
    """

    # Sample model parameters from their flat prior distributions
    m = np.random.uniform(low=-5, high=0.5, size=(n))
    b = np.random.uniform(low=0, high=10, size=(n))

    return np.array([m, b]).T
# end function
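For a quick check of the output shape, a call like sampleFunction(5) returns a (5, 2) array with one (m, b) pair per row:
# Draw a few samples from the prior and check the shape
thetaTest = sampleFunction(5)
print(thetaTest.shape)  # (5, 2)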
Define the approxposterior parameters.
[9]:
# Define algorithm parameters
m0 = 20 # Initial size of training set
m = 10 # Number of new points to find each iteration
nmax = 5 # Maximum number of iterations
bounds = [(-5,0.5),(0.0,10.0)] # Prior bounds
algorithm = "bape" # Use the Kandasamy et al. (2017) formalism
# emcee MCMC parameters: Use the same MCMC parameters as the emcee-only analysis
samplerKwargs = {"nwalkers" : 32} # emcee.EnsembleSampler parameters
mcmcKwargs = {"iterations" : 5000} # emcee.EnsembleSampler.run_mcmc parameters
# Data and uncertainties that we use to condition our model
args = (x, obs, obserr)
Here we create the initial training set by running the true forward model \(m_0\) times. approxposterior learns on this training set and, each iteration, runs the forward model \(m\) additional times in regions of parameter space that will most improve its own predictive performance, iteratively improving the posterior distribution estimate.
[10]:
# Create a training set to condition the GP
# Randomly sample initial conditions from the prior
theta = np.array(sampleFunction(m0))
# Evaluate forward model to compute log likelihood + lnprior for each theta
y = list()
for ii in range(len(theta)):
y.append(logLikelihood(theta[ii], *args) + logPrior(theta[ii]))
y = np.array(y)
# We'll create the initial GP using approxposterior's built-in default
# initialization. This default typically works well in many applications.
gp = gpu.defaultGP(theta, y)
Now initialize the ApproxPosterior object and we’re ready to go!
[11]:
ap = approx.ApproxPosterior(theta=theta,                # Initial model parameters for inputs
                            y=y,                        # Logprobability of each input
                            gp=gp,                      # Initialized Gaussian Process
                            lnprior=logPrior,           # logprior function
                            lnlike=logLikelihood,       # loglikelihood function
                            priorSample=sampleFunction, # Prior sample function
                            algorithm=algorithm,        # bape, agp, or alternate
                            bounds=bounds)              # Parameter bounds
Run approxposterior! Note that we set cache to False so approxposterior only saves the most recent sampler and MCMC chain instead of saving each full MCMC chain to a local HDF5 file (see the emcee v3 documentation for more details on how emcee caches data).
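For reference, emcee’s own HDF5 caching in v3 looks roughly like the sketch below; the backend object, the cachedSampler name, and the filename "chain.h5" are purely illustrative and not part of this analysis:
# Sketch (illustrative only): persist an emcee chain to disk with the HDF5 backend
backend = emcee.backends.HDFBackend("chain.h5")  # requires h5py; filename is arbitrary
backend.reset(nwalkers, ndim)                    # clear any existing chain in the file
cachedSampler = emcee.EnsembleSampler(nwalkers, ndim, logProbability,
                                      args=(x, obs, obserr), backend=backend)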
[12]:
# Run!
ap.run(m=m, nmax=nmax, estBurnin=True, mcmcKwargs=mcmcKwargs, cache=False,
       samplerKwargs=samplerKwargs, verbose=False, args=args, onlyLastMCMC=True,
       seed=seed)
As before, let’s plot the joint and marginal posterior probability distributions to see how approxposterior did. In addition, as red points, we’ll plot where approxposterior decided to evaluate the forward model to improve its own performance. As you’ll see below, approxposterior preferentially runs the forward model in regions of high posterior probability density - it doesn’t waste time on low-likelihood regions of parameter space!
[13]:
samples = ap.sampler.get_chain(discard=ap.iburns[-1], flat=True, thin=ap.ithins[-1])
fig = corner.corner(samples, quantiles=[0.16, 0.5, 0.84], truths=[mTrue, bTrue],
                    labels=["m", "b"], show_titles=True, scale_hist=True,
                    plot_contours=True, range=([-1.1, -0.8], [3.7, 4.8]));
# Plot where forward model was evaluated
fig.axes[2].scatter(ap.theta[:,0], ap.theta[:,1], s=8, color="red", zorder=20);
The approxposterior posterior is close to the one from the full emcee MCMC!
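As a quick quantitative check, we can compare the marginal posterior percentiles from the emcee-only chain and the approxposterior chain (using sampler and samples from the cells above):
# Compare the 16th/50th/84th percentiles of m and b from both analyses
for idx, name in enumerate(["m", "b"]):
    emceeQuants = np.percentile(sampler.flatchain[:, idx], [16, 50, 84])
    apQuants = np.percentile(samples[:, idx], [16, 50, 84])
    print(name, "emcee:", emceeQuants, "approxposterior:", apQuants)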