{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "view-in-github" }, "source": [ "\"Open" ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "id": "XSCBJ_S4XvDe" }, "outputs": [], "source": [ "%%capture\n", "if 'google.colab' in str(get_ipython()):\n", " # Install packages\n", " %pip install harmonic emcee getdist" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "id": "1OtVkBf7OAIF" }, "outputs": [], "source": [ "%%capture\n", "# Import modules\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import harmonic as hm\n", "import emcee\n", "import jax.numpy as jnp\n", "import jax\n", "import scipy.special as sp\n", "\n", "%config InlineBackend.figure_format = 'retina' # for nicer plots" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 1. Basic usage" ] }, { "cell_type": "markdown", "metadata": { "id": "3bTCRZLeXULH" }, "source": [ "### Define Bayesian posterior function\n", "---\n", "\n", "First we define the log-posterior function of interest.\n", "\n", "As a working example for this basic tutorial we consider the log-likelihood of a standard $n$-dimensional Gaussian\n", "\n", "$$\n", "f(x) = -\\frac{1}{2}x^{T}\\Sigma^{-1}x,\n", "$$\n", "\n", "where for simplicity we have taken the mean $\\mu=0$ and dropped scaling factors, and assume a box prior over a finite symmetric interval of width $a$ in each dimension. 
Under such conditions the log-posterior is given by" ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "id": "8PgO6f4VQSpD" }, "outputs": [], "source": [ "def ln_prior_uniform(x, prior_bounds):\n", " \"\"\"\n", " Compute log_e prior in the given bounds, returning -inf outside the bounds.\n", "\n", " Args:\n", " x: Position at which to evaluate prior.\n", " prior_bounds: Array of shape (d, 2) specifying lower and upper bounds for each dimension.\n", " Returns:\n", " double or array: Scalar of log prior density of x if x is 1D, else array of log prior densities.\n", " \"\"\"\n", " assert prior_bounds.ndim == 2 and prior_bounds.shape[1] == 2, (\n", " \"prior_bounds must be of shape (n, 2)\"\n", " )\n", " lower, upper = prior_bounds[:, 0], prior_bounds[:, 1]\n", " inside = jnp.all((x >= lower) & (x <= upper), axis=-1)\n", " log_volume = jnp.log(jnp.prod(upper - lower))\n", " ln_prior = jnp.where(inside, -log_volume, -jnp.inf)\n", " return ln_prior\n", "\n", "\n", "def ln_likelihood_gaussian(x, inv_cov):\n", " \"\"\"\n", " Compute log_e of n dimensional multivariate Gaussian likelihood.\n", " Args:\n", " x: Position(s) at which to evaluate likelihood.\n", " inv_cov: Inverse covariance matrix of the Gaussian.\n", " Returns:\n", " double or array: Value or array of log likelihoods at x.\n", " \"\"\"\n", " exponent = -0.5 * jnp.einsum(\"...i,ij,...j\", x, inv_cov, x)\n", " return exponent\n", "\n", "\n", "def ln_posterior(x, inv_cov, prior_bounds):\n", " \"\"\"\n", " Compute log_e of posterior of n dimensional multivariate Gaussian.\n", " Args:\n", " x: Position(s) at which to evaluate posterior.\n", " inv_cov: Inverse covariance matrix of the Gaussian likelihood.\n", " prior_bounds: Array of shape (d, 2) specifying lower and upper bounds for each dimension.\n", " Returns:\n", " double or array: Value or array of log posteriors at x.\n", " \"\"\"\n", " ln_prior = ln_prior_uniform(x, prior_bounds)\n", " ln_likelihood = ln_likelihood_gaussian(x, inv_cov)\n", " ln_posterior = ln_prior + ln_likelihood\n", " return ln_posterior" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's enable double precision to 
avoid rounding errors!" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [], "source": [ "jax.config.update(\"jax_enable_x64\", True)" ] }, { "cell_type": "markdown", "metadata": { "id": "VAvoFPV8ZJ0Q" }, "source": [ "### Compute samples using `emcee`\n", "---\n", "\n", "We then sample from the posterior using an MCMC algorithm. While any MCMC approach can be used, here we sample using the `emcee` package.\n", "\n", "First we will need to define and initialise some variables." ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "id": "us1kBuWlQZTy" }, "outputs": [], "source": [ "# Define parameters for emcee sampling\n", "ndim = 5 # Number of dimensions\n", "nchains = 30 # Total number of chains to compute\n", "samples_per_chain = 5000 # Number of samples per chain\n", "nburn = 2000 # Number of samples to discard as burn-in\n", "\n", "# Construct a trivial inverse covariance (identity matrix) and prior bounds\n", "inv_cov = np.zeros((ndim, ndim))\n", "diag_cov = np.ones(ndim)\n", "np.fill_diagonal(inv_cov, diag_cov)\n", "prior_bounds = np.array([[-10, 10]] * ndim)" ] }, { "cell_type": "markdown", "metadata": { "id": "_HBc2vhRZh8o" }, "source": [ "Now we need to run the sampler."
] }, { "cell_type": "code", "execution_count": 89, "metadata": { "id": "Eel3bSORQZW0" }, "outputs": [], "source": [ "# Initialise random seed\n", "np.random.seed(1)\n", "\n", "# Set initial random position and state\n", "pos = np.random.rand(ndim * nchains).reshape((nchains, ndim))\n", "rstate = np.random.get_state()\n", "\n", "# Instantiate and execute sampler\n", "sampler = emcee.EnsembleSampler(nchains, ndim, ln_posterior, args=[inv_cov, prior_bounds], vectorize=True)\n", "(pos, prob, state) = sampler.run_mcmc(pos, samples_per_chain, rstate0=rstate, progress=True)\n", "\n", "# Collect samples into contiguous numpy arrays (discarding burn-in)\n", "samples = np.ascontiguousarray(sampler.chain[:,nburn:,:])\n", "lnprob = np.ascontiguousarray(sampler.lnprobability[:,nburn:])" ] }, { "cell_type": "markdown", "metadata": { "id": "hP6v_b7YZnEo" }, "source": [ "## Compute evidence using `harmonic`\n", "---\n", "\n", "`harmonic` requires only posterior samples. There are no constraints on the type of sampling algorithm used.\n", "\n", "Once we have posterior samples to hand, they can be post-processed using `harmonic` to compute the Bayesian evidence." ] }, { "cell_type": "markdown", "metadata": { "id": "6f2YXRKxZ-IR" }, "source": [ "### Collating samples using the `harmonic.Chains` class\n", "\n", "We first configure the chains into a `harmonic`-friendly shape, which we do as follows." ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "id": "eSxxNW1KQZZc" }, "outputs": [], "source": [ "# Instantiate harmonic's Chains class\n", "chains = hm.Chains(ndim)\n", "chains.add_chains_3d(samples, lnprob)" ] }, { "cell_type": "markdown", "metadata": { "id": "UsD1ymV2aK3u" }, "source": [ "Since we will subsequently learn the target distribution $\\varphi$, we split the samples into training and inference sets."
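,
    "\n",
    "As a reminder of what these samples feed: the learned harmonic mean estimator uses the trained flow $\\varphi$ as an importance target and, on the inference set, computes\n",
    "\n",
    "$$\n",
    "\\hat{\\rho} = \\frac{1}{N} \\sum_{i=1}^{N} \\frac{\\varphi(x_i)}{\\mathcal{L}(x_i)\\,\\pi(x_i)},\n",
    "$$\n",
    "\n",
    "which estimates the *inverse* evidence $1/z$.\n"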
] }, { "cell_type": "code", "execution_count": 91, "metadata": { "id": "efPNgW8qQZcW" }, "outputs": [], "source": [ "# Split the chains into the ones which will be used to train the machine \n", "# learning model and for inference\n", "chains_train, chains_infer = hm.utils.split_data(chains, training_proportion=0.5)" ] }, { "cell_type": "markdown", "metadata": { "id": "Z0UhPgtWaUpb" }, "source": [ "### Train the machine learning model\n", "\n", "We simply select the model we wish to adopt and fit the model." ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "id": "WrB47hA3QZfM" }, "outputs": [], "source": [ "# Select RQSpline Model\n", "temperature = 0.8\n", "model = hm.model.RQSplineModel(ndim, standardize=True, temperature=temperature)\n", "epochs_num = 15\n", "# Train model\n", "model.fit(chains_train.samples, epochs=epochs_num, verbose=True, batch_size=256)" ] }, { "cell_type": "markdown", "metadata": { "id": "-Nh_lQnzbGGv" }, "source": [ "### Posterior triangle plot\n", "\n", "Let's also plot slices of the posterior using these samples to see what we're working with! It is important for the concentrated flow (here we set temperature equal to 0.8) to be contained within the posterior. If this is not the case, the evidence estimate will not be accurate. If the flow is not contained within the posterior, try reducing the temperature or learning the posterior better by training for longer or changing the model architecture or hyperparameters." 
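,
    "\n",
    "As a rough intuition for the temperature (a sketch, assuming the tempering acts like scaling a Gaussian base variance by $T$), samples concentrate by a factor $\\sqrt{T}$ per dimension, so $T = 0.8$ tempers the flow only mildly:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "T = 0.8  # temperature used above\n",
    "shrink = np.sqrt(T)  # per-dimension concentration factor under this assumption\n",
    "print(round(shrink, 3))\n",
    "```\n"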
] }, { "cell_type": "code", "execution_count": 93, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 297 }, "id": "AUOB38wrQxEu", "outputId": "1b1bc316-b9b2-4ae6-fff0-e059bf65e849" }, "outputs": [], "source": [ "samples = samples.reshape((-1, ndim))\n", "samp_num = samples.shape[0]\n", "flow_samples = model.sample(samp_num)\n", "hm.utils.plot_getdist_compare(samples, flow_samples)" ] }, { "cell_type": "markdown", "metadata": { "id": "es7_BU7_ao4q" }, "source": [ "### Compute the Bayesian evidence\n", "\n", "Finally we simply compute the learned harmonic mean estimator as follows." ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "id": "yP3AjUrCQZhh" }, "outputs": [], "source": [ "# Instantiate harmonic's evidence class\n", "ev = hm.Evidence(chains_infer.nchains, model)\n", "\n", "# Pass the evidence class the inference chains and compute the evidence!\n", "ev.add_chains(chains_infer)\n", "ln_inv_evidence = ev.ln_evidence_inv\n", "err_ln_inv_evidence = ev.compute_ln_inv_evidence_errors()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We typically always work in log space to avoid issues with over/underflow especially since evidence values can get quite extreme!" ] }, { "cell_type": "markdown", "metadata": { "id": "7i4yV2Yyat0x" }, "source": [ "## Results\n", "---\n", "\n", "Let's check the evidence value computed." 
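,
    "\n",
    "For a Gaussian likelihood with a box prior much wider than the likelihood, the evidence is approximately\n",
    "\n",
    "$$\n",
    "z \\approx \\frac{(2\\pi)^{d/2}\\,|\\Sigma|^{1/2}}{\\prod_{i=1}^{d}(b_i - a_i)},\n",
    "$$\n",
    "\n",
    "which the analytic helper below evaluates in log space.\n"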
] }, { "cell_type": "markdown", "metadata": { "id": "_37pQ-HWa1Cf" }, "source": [ "### Analytic value\n", "\n", "As this is a standard n-dimensional Gaussian the evidence is analytically tractable and given by" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "id": "YMQ9m747TAOT" }, "outputs": [], "source": [ "def ln_evidence_analytic_gaussian_box(inv_cov, prior_bounds):\n", " \"\"\"\n", " Computes the analytic log evidence for a box prior, assuming that the prior\n", " is much wider than the Gaussian likelihood.\n", "\n", " Args:\n", " inv_cov: Inverse covariance matrix of the Gaussian likelihood.\n", " prior_bounds: Array of shape (d, 2) specifying lower and upper bounds for each dimension.\n", " Returns:\n", " double: Analytic log evidence.\n", " \"\"\"\n", " d = len(prior_bounds)\n", " ln_evidence = 0.5 * (d * np.log(2* jnp.pi) - jnp.linalg.slogdet(inv_cov)[1])\n", " for bound in prior_bounds:\n", " ln_evidence -= jnp.log(bound[1] - bound[0])\n", " return ln_evidence\n", "\n", "ln_inv_evidence_analytic = -ln_evidence_analytic_gaussian_box(inv_cov, prior_bounds)" ] }, { "cell_type": "markdown", "metadata": { "id": "0q64MMxea-oo" }, "source": [ "Let's compare the value computed by `harmonic` and analytically" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "AbPq2cP6QZoR", "outputId": "5605db26-d6ec-4fb6-8c52-ab068d065e11" }, "outputs": [], "source": [ "print('ln inverse evidence (harmonic) = {} +/- {}'.format(ln_inv_evidence, err_ln_inv_evidence))\n", "print('ln inverse evidence (analytic) = {}'.format(ln_inv_evidence_analytic))\n", "print('nsigma = {}'.format(np.abs(ln_inv_evidence - ln_inv_evidence_analytic) / err_ln_inv_evidence))" ] }, { "cell_type": "markdown", "metadata": { "id": "slSx0i6mbExE" }, "source": [ "As expected, the evidence computed by `harmonic` is close to that computed analytically." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 2. 
Diagnostics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Apart from inspecting the flow visually, we should check the quality of the estimate using diagnostic metrics provided by `harmonic`. This is an important indication of whether the error estimate of the evidence is precise.\n", "\n", "\n", "### 1. Kurtosis\n", "By the central limit theorem, for a large number of samples the distribution of inverse evidence estimated for each chain $j$, $\\hat{\\rho_{j}}$ approaches a Gaussian, with kurtosis $\\kappa = 3$. If the estimated kurtosis $\\hat{\\kappa} \\gg 3$ it would indicate that the sampled distribution of $\\hat{\\rho_{j}}$ has long tails, suggesting further samples need to be drawn, or a better contained flow should be used.\n", "\n", "### 2. Variance of Variance Check\n", "This diagnostic compares the ratio of the variance of the variance, to the variance itself (of the inverse evidence). By the central limit theorem, the ratio should be approximately:\n", "\n", "$$\\sqrt{\\frac{2}{n_{\\text{eff}} - 1}}$$\n", "\n", "where $n_{\\text{eff}}$ is the effective sample size. If the ratio is significantly larger than this value, it indicates that the variance estimate is not precise, and more samples are needed.\n", "\n", "We keep track of these measures when computing the evidence, and they are then stored as attributes of the Evidence class.\n", "\n", "Let's compute the diagnostics:" ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [], "source": [ "print(\"kurtosis = {}\".format(ev.kurtosis), \" Aim for ~3.\")\n", "check = np.exp(0.5 * ev.ln_evidence_inv_var_var - ev.ln_evidence_inv_var)\n", "\n", "print(\"sqrt(evidence_inv_var_var) / evidence_inv_var = {}\".format(check))\n", "print(\" Aim for sqrt( 2/(n_eff-1) ) = {}\".format(np.sqrt(2.0 / (ev.n_eff - 1))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Diagnostics are looking good for our Gaussian! 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 3. Polynomial model comparison\n", "\n", "Let's now consider Bayesian model comparison of polynomial models fitted to synthetic data.\n", "\n", "We want to determine the degree that best describes the data without including unnecessary complexity." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set random seed for reproducibility\n", "np.random.seed(42)\n", "\n", "\n", "# Helper: evaluate polynomial given coefficients in increasing power order\n", "def model_predict(params, x):\n", " \"\"\"Evaluate polynomial with coefficients `params = [a0, a1, ...]`.\n", "\n", " Returns y = a0 + a1*x + a2*x^2 + ...\n", "\n", " `np.polyval` expects highest-to-lowest order, so we flip.\n", " \"\"\"\n", " y = np.polyval(np.flip(params), x)\n", " return y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate synthetic data\n", "\n", "Generate a small synthetic dataset by selecting a polynomial order and\n", "adding Gaussian observation noise.\n", "\n", "The true model can be set to ``linear``, ``quadratic``, ``cubic``, or ``quartic``, and the true parameters can be adjusted below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Choose the underlying truth model\n", "truth_model = \"cubic\" # options: 'linear', 'quadratic', 'cubic', 'quartic'\n", "\n", "# Observation settings\n", "noise_std = 2.0 # Standard deviation of the Gaussian noise\n", "num_data_points = 20\n", "x_data = np.linspace(-3, 3, num_data_points)\n", "\n", "# True polynomial coefficients (a_k corresponds to x^k)\n", "a0_true = -2.0 # constant term\n", "a1_true = -1.0 # linear term\n", "a2_true = 0.2 # quadratic term\n", "a3_true = 0.5 # cubic term\n", "a4_true = 0.0 # quartic term\n", "\n", "# Coefficient lists for each candidate truth model\n", "truth_coeffs = {\n", " \"linear\": [a0_true, a1_true],\n", " \"quadratic\": [a0_true, a1_true, a2_true],\n", " \"cubic\": [a0_true, a1_true, a2_true, a3_true],\n", " \"quartic\": [a0_true, a1_true, a2_true, a3_true, a4_true],\n", "}\n", "\n", "if truth_model not in truth_coeffs:\n", " raise ValueError(\n", " \"truth_model must be one of 'linear', 'quadratic', 'cubic', 'quartic'\"\n", " )\n", "\n", "y_true_noiseless = model_predict(truth_coeffs[truth_model], x_data)\n", "\n", "# Add Gaussian noise to y values\n", "y_data = y_true_noiseless + np.random.normal(\n", " loc=0.0, scale=noise_std, size=num_data_points\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot the data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_plot = np.linspace(x_data.min(), x_data.max(), 500)\n", "# Plot the true (noiseless) curve for whichever truth_model was selected\n", "truth_curve = lambda x: model_predict(truth_coeffs[truth_model], x)\n", "\n", "plt.errorbar(x_data, y_data, yerr=noise_std, fmt=\".k\", capsize=0)\n", "plt.plot(x_plot, truth_curve(x_plot), \"r--\", label=\"true (noiseless)\")\n", "plt.legend()\n", "plt.grid()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Log-likelihood\n", "\n", "The model prediction is the polynomial\n", "\n", " $$\\hat{y}(x;a)=\\sum_{k=0}^n a_k x^k$$\n", "\n", "and, assuming independent Gaussian measurement noise with known standard deviation $\\sigma$, the log-likelihood is\n", "\n", " $$\\ln P(y\\mid x,a) = -\\tfrac{1}{2}\\sum_i\\left(\\frac{(y_i-\\hat{y}(x_i;a))^2}{\\sigma^2} + \\ln(2\\pi\\sigma^2)\\right).$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def log_likelihood(params, x, y, noise_std):\n", " model_prediction = model_predict(params, x)\n", " sigma2 = noise_std**2\n", " diff = y - model_prediction\n", " log_likelihood_val = -0.5 * np.sum(diff**2 / sigma2 + np.log(2 * np.pi * sigma2))\n", " return log_likelihood_val" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## Prior\n", "\n", "We use independent uniform priors for each polynomial coefficient.\n", "\n", "$$\n", "\\ln\\pi(\\theta)=\\begin{cases}\n", "-\\sum_{i=0}^{D-1}\\ln(b_i-a_i), & \\text{if } a_i\\le\\theta_i\\le b_i\\ \\forall i,\\\\\n", "-\\infty, & \\text{otherwise.}\n", "\\end{cases}\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def log_prior(params, bounds):\n", " \"\"\"Log-density of independent uniform priors.\n", "\n", " Args:\n", " params (array-like): parameter vector of shape (D,).\n", " bounds (sequence): sequence of (low, high) pairs length D.\n", "\n", " Returns:\n", " float: log prior (finite) when all params within bounds, otherwise -inf.\n", " \"\"\"\n", " params = np.asarray(params)\n", " bnds = np.asarray(bounds, dtype=float)\n", "\n", " lows = bnds[:, 0]\n", " highs = bnds[:, 1]\n", "\n", " # Enforce the bounds: outside the box the prior density is zero\n", " if np.any(params < lows) or np.any(params > highs):\n", " return -np.inf\n", "\n", " widths = highs - lows\n", " return -np.sum(np.log(widths))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model posteriors and prior bounds\n", "\n", "We define a log-posterior function for each polynomial model (linear, quadratic, cubic, quartic).\n", "\n", "Each posterior is the sum of the uniform log-prior and the Gaussian log-likelihood.\n", "\n", "We also set the boundaries of the uniform prior. Since in this case we don't have any strong prior information about the polynomial coefficients, we should also explore different possibilities and see if the conclusions change."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prior_bound = 4.0\n", "param_bounds_linear = [(-prior_bound, prior_bound)] * 2\n", "param_bounds_quadratic = [(-prior_bound, prior_bound)] * 3\n", "param_bounds_cubic = [(-prior_bound, prior_bound)] * 4\n", "param_bounds_quartic = [(-prior_bound, prior_bound)] * 5\n", "\n", "\n", "# Model A: Linear (order 1)\n", "def log_posterior_linear(params):\n", " lp = log_prior(params, param_bounds_linear)\n", " if not np.isfinite(lp):\n", " return -np.inf\n", " return lp + log_likelihood(params, x_data, y_data, noise_std)\n", "\n", "\n", "# Model B: Quadratic (order 2)\n", "def log_posterior_quadratic(params):\n", " lp = log_prior(params, param_bounds_quadratic)\n", " if not np.isfinite(lp):\n", " return -np.inf\n", " return lp + log_likelihood(params, x_data, y_data, noise_std)\n", "\n", "\n", "# Model C: Cubic (order 3) - the default truth_model\n", "def log_posterior_cubic(params):\n", " lp = log_prior(params, param_bounds_cubic)\n", " if not np.isfinite(lp):\n", " return -np.inf\n", " return lp + log_likelihood(params, x_data, y_data, noise_std)\n", "\n", "\n", "# Model D: Quartic (order 4)\n", "def log_posterior_quartic(params):\n", " lp = log_prior(params, param_bounds_quartic)\n", " if not np.isfinite(lp):\n", " return -np.inf\n", " return lp + log_likelihood(params, x_data, y_data, noise_std)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## MCMC sampling with emcee\n", "\n", "Again, we use `emcee` to obtain posterior samples."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"\\n--- Running MCMC with emcee ---\")\n", "\n", "# Define common emcee sampling parameters\n", "nwalkers = 32 # Number of MCMC walkers\n", "nsteps = 1000 # Number of steps per walker\n", "nburn = 500 # Number of burn-in steps to discard\n", "\n", "\n", "# Helper function to run emcee and return samples\n", "def run_mcmc_sampler(log_posterior_func, ndim, initial_guess, model_name):\n", " \"\"\"\n", " Runs the emcee sampler for a given log-posterior function.\n", "\n", " Args:\n", " log_posterior_func (callable): The log-posterior function.\n", " ndim (int): The number of dimensions (parameters) for the model.\n", " initial_guess (np.ndarray): An initial guess for the parameters.\n", " model_name (str): Name of the model for printing progress.\n", "\n", " Returns:\n", " np.ndarray: 3D array of posterior samples after burn-in (nwalkers, nsteps-nburn, ndim).\n", " np.ndarray: 2D array of log-probabilities after burn-in (nwalkers, nsteps-nburn).\n", " \"\"\"\n", " print(f\" Sampling for {model_name} model ({ndim} parameters)...\")\n", "\n", " # Initialize walkers in a small ball around the initial guess\n", " pos = initial_guess + 1e-4 * np.random.randn(nwalkers, ndim)\n", "\n", " # Instantiate the sampler\n", " sampler = emcee.EnsembleSampler(nwalkers, ndim, log_posterior_func)\n", "\n", " # Run the MCMC\n", " sampler.run_mcmc(pos, nsteps, progress=True)\n", "\n", " # Get the chain of samples and log-probabilities, discarding burn-in\n", " samples = np.ascontiguousarray(sampler.chain[:, nburn:, :])\n", " log_prob = np.ascontiguousarray(sampler.lnprobability[:, nburn:])\n", "\n", " print(\n", " f\" Finished sampling for {model_name}. 
Collected {samples.shape[0] * samples.shape[1]} samples.\"\n", " )\n", " return samples, log_prob\n", "\n", "\n", "# Initial guesses for parameters (can be based on true values or rough estimates)\n", "# For linear: [a_0, a_1]\n", "initial_guess_linear = np.array([a0_true, a1_true])\n", "# For quadratic: [a_0, a_1, a_2]\n", "initial_guess_quadratic = np.array([a0_true, a1_true, a2_true])\n", "# For cubic: [a_0, a_1, a_2, a_3]\n", "initial_guess_cubic = np.array([a0_true, a1_true, a2_true, a3_true])\n", "# For quartic: [a_0, a_1, a_2, a_3, a_4]\n", "initial_guess_quartic = np.array(\n", " [a0_true, a1_true, a2_true, a3_true, a4_true if a4_true != 0 else 0.01]\n", ")\n", "\n", "# Run MCMC for each model\n", "samples_linear, log_prob_linear = run_mcmc_sampler(\n", " log_posterior_linear,\n", " ndim=2,\n", " initial_guess=initial_guess_linear,\n", " model_name=\"linear\",\n", ")\n", "samples_quadratic, log_prob_quadratic = run_mcmc_sampler(\n", " log_posterior_quadratic,\n", " ndim=3,\n", " initial_guess=initial_guess_quadratic,\n", " model_name=\"quadratic\",\n", ")\n", "samples_cubic, log_prob_cubic = run_mcmc_sampler(\n", " log_posterior_cubic,\n", " ndim=4,\n", " initial_guess=initial_guess_cubic,\n", " model_name=\"cubic\",\n", ")\n", "samples_quartic, log_prob_quartic = run_mcmc_sampler(\n", " log_posterior_quartic,\n", " ndim=5,\n", " initial_guess=initial_guess_quartic,\n", " model_name=\"quartic\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evidence estimation with harmonic" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"\\n--- Computing log-evidence with harmonic ---\")\n", "\n", "# Define common harmonic parameters\n", "training_proportion = 0.5 # Proportion of samples to use for training the flow model\n", "temperature = 0.8 # Temperature parameter for the RQSplineModel\n", "\n", "\n", "# Helper function to compute log-evidence for a given model\n", "def 
compute_log_evidence(samples, log_prob, ndim, model_name):\n", " \"\"\"Compute ln(Z) and an approximate error using harmonic's normalizing flow.\n", "\n", " This helper trains a flow on a split of the provided posterior samples and\n", " uses harmonic's Evidence object to obtain an estimate of the inverse\n", " evidence. The function returns ln(Z) and a symmetric error estimate.\n", " \"\"\"\n", " print(f\" Computing log-evidence for {model_name} model...\")\n", "\n", " chains = hm.Chains(ndim)\n", " chains.add_chains_3d(samples, log_prob)\n", "\n", " chains_train, chains_infer = hm.utils.split_data(\n", " chains, training_proportion=training_proportion\n", " )\n", "\n", " model = hm.model.RQSplineModel(ndim, standardize=True, temperature=temperature)\n", " epochs_num = 10\n", " print(f\" Training RQSplineModel for {model_name} with {epochs_num} epochs...\")\n", " model.fit(chains_train.samples, epochs=epochs_num, verbose=True)\n", "\n", " ev = hm.Evidence(chains_infer.nchains, model)\n", " ev.add_chains(chains_infer)\n", "\n", " ln_evidence_inv = ev.ln_evidence_inv\n", " err_tuple = ev.compute_ln_inv_evidence_errors()\n", " err_ln_evidence_inv = (abs(err_tuple[0]) + abs(err_tuple[1])) / 2\n", "\n", " print(\"kurtosis = {}\".format(ev.kurtosis), \" Aim for ~3.\")\n", " check = np.exp(0.5 * ev.ln_evidence_inv_var_var - ev.ln_evidence_inv_var)\n", "\n", " print(\"sqrt(evidence_inv_var_var) / evidence_inv_var = {}\".format(check))\n", " print(\" Aim for sqrt( 2/(n_eff-1) ) = {}\".format(np.sqrt(2.0 / (ev.n_eff - 1))))\n", " print()\n", " \n", " return -ln_evidence_inv, err_ln_evidence_inv\n", "\n", "\n", "# Compute log-evidence for each model\n", "ln_Z_linear, err_ln_Z_linear = compute_log_evidence(\n", " samples=samples_linear,\n", " log_prob=log_prob_linear,\n", " ndim=2,\n", " model_name=\"linear (order 1)\",\n", ")\n", "ln_Z_quadratic, err_ln_Z_quadratic = compute_log_evidence(\n", " samples=samples_quadratic,\n", " log_prob=log_prob_quadratic,\n", " ndim=3,\n", " 
model_name=\"quadratic (order 2)\",\n", ")\n", "ln_Z_cubic, err_ln_Z_cubic = compute_log_evidence(\n", " samples=samples_cubic, log_prob=log_prob_cubic, ndim=4, model_name=\"cubic (order 3)\"\n", ")\n", "ln_Z_quartic, err_ln_Z_quartic = compute_log_evidence(\n", " samples=samples_quartic,\n", " log_prob=log_prob_quartic,\n", " ndim=5,\n", " model_name=\"quartic (order 4)\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evidence table\n", "\n", "Compare log-evidence (ln Z) and estimated errors across models." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"\\n--- Log-evidence comparison ---\\n\")\n", "print(\"| model | log-evidence (ln Z) | error (ln Z) |\")\n", "print(\"|------------|----------------------|--------------|\")\n", "\n", "row_fmt = \"| {model:<10} | {lnz:>20.3f} | {err:>12.3f} |\"\n", "print(row_fmt.format(model=\"linear\", lnz=ln_Z_linear, err=err_ln_Z_linear))\n", "print(row_fmt.format(model=\"quadratic\", lnz=ln_Z_quadratic, err=err_ln_Z_quadratic))\n", "print(row_fmt.format(model=\"cubic\", lnz=ln_Z_cubic, err=err_ln_Z_cubic))\n", "print(row_fmt.format(model=\"quartic\", lnz=ln_Z_quartic, err=err_ln_Z_quartic))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# --- Helper plotting functions ---\n", "\n", "def plot_posterior_predictives(\n", " samples_dict, x_data, y_data, x_plot, truth_curve, noise_std\n", "):\n", " \"\"\"Plot posterior predictive draws for each model in a 2x2 grid.\n", "\n", " Args:\n", " samples_dict (dict): mapping name->flattened samples (n_samples, ndim)\n", " x_data, y_data: observed data\n", " x_plot: dense x for smooth curves\n", " truth_curve: callable(x) -> y (true noiseless curve)\n", " noise_std: observational noise std for errorbars\n", " \"\"\"\n", " panel_order = [\"linear\", \"quadratic\", \"cubic\", \"quartic\"]\n", " display = {\n", " \"linear\": \"linear (order 1)\",\n", " 
\"quadratic\": \"quadratic (order 2)\",\n", " \"cubic\": \"cubic (order 3)\",\n", " \"quartic\": \"quartic (order 4)\",\n", " }\n", " colors = {\"linear\": \"C0\", \"quadratic\": \"C1\", \"cubic\": \"C2\", \"quartic\": \"C3\"}\n", "\n", " fig, axes = plt.subplots(2, 2, figsize=(12, 10), sharex=True, sharey=True)\n", " axes = axes.flatten()\n", "\n", " for ax, name in zip(axes, panel_order):\n", " flat = samples_dict.get(name)\n", " ax.errorbar(x_data, y_data, yerr=noise_std, fmt=\".k\", capsize=0)\n", " ax.plot(x_plot, truth_curve(x_plot), \"r--\", label=\"true (noiseless)\")\n", "\n", " if flat is not None and flat.shape[0] > 0:\n", " n_draws = min(25, flat.shape[0])\n", " idx = np.random.choice(flat.shape[0], size=n_draws, replace=False)\n", " for params in flat[idx]:\n", " ax.plot(\n", " x_plot,\n", " model_predict(params, x_plot),\n", " color=colors[name],\n", " alpha=0.18,\n", " )\n", " meanp = np.mean(flat, axis=0)\n", " ax.plot(\n", " x_plot,\n", " model_predict(meanp, x_plot),\n", " color=colors[name],\n", " lw=2,\n", " label=\"posterior mean\",\n", " )\n", "\n", " ax.set_title(display[name])\n", " ax.grid(True)\n", "\n", " fig.suptitle(\"posterior predictive samples (25 draws per model)\", fontsize=18)\n", " plt.tight_layout(rect=(0, 0.03, 1, 0.95))\n", " return fig\n", "\n", "\n", "def plot_model_probabilities(ln_Zs, labels):\n", " \"\"\"Plot normalized model probabilities from ln Z values.\"\"\"\n", " ln_Zs = np.asarray(ln_Zs)\n", " Zs = np.exp(ln_Zs - np.max(ln_Zs))\n", " probs = Zs / np.sum(Zs)\n", "\n", " fig, ax = plt.subplots(figsize=(8, 4))\n", " x = np.arange(len(labels))\n", " ax.bar(x, probs, color=[\"C0\", \"C1\", \"C2\", \"C3\"], capsize=6)\n", " ax.set_xticks(x)\n", " ax.set_xticklabels(labels, ha=\"right\")\n", " ax.set_yticks(np.linspace(0, 1, 6))\n", " ax.set_ylabel(\"Model probability\")\n", " ax.set_ylim(0, 1.1)\n", " for i, p in enumerate(probs):\n", " ax.text(i, p + 0.02, f\"{p:.2f}\", ha=\"center\")\n", " plt.tight_layout()\n", " 
return fig" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Helper cell to prepare data for plotting\n", "flat_linear = samples_linear.reshape(-1, samples_linear.shape[2])\n", "flat_quadratic = samples_quadratic.reshape(-1, samples_quadratic.shape[2])\n", "flat_cubic = samples_cubic.reshape(-1, samples_cubic.shape[2])\n", "flat_quartic = samples_quartic.reshape(-1, samples_quartic.shape[2])\n", "\n", "samples_for_plot = {\n", " \"linear\": flat_linear,\n", " \"quadratic\": flat_quadratic,\n", " \"cubic\": flat_cubic,\n", " \"quartic\": flat_quartic,\n", "}\n", "\n", "# Dense x for plotting\n", "x_plot = np.linspace(x_data.min(), x_data.max(), 500)\n", "\n", "\n", "def _truth_curve(x):\n", " \"\"\"True noiseless curve for whichever truth_model was selected.\"\"\"\n", " coeffs_by_model = {\n", " \"linear\": [a0_true, a1_true],\n", " \"quadratic\": [a0_true, a1_true, a2_true],\n", " \"cubic\": [a0_true, a1_true, a2_true, a3_true],\n", " \"quartic\": [a0_true, a1_true, a2_true, a3_true, a4_true],\n", " }\n", " return model_predict(coeffs_by_model[truth_model], x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot the posterior predictives" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig1 = plot_posterior_predictives(\n", " samples_dict=samples_for_plot,\n", " x_data=x_data,\n", " y_data=y_data,\n", " x_plot=x_plot,\n", " truth_curve=_truth_curve,\n", " noise_std=noise_std,\n", ")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot the model probabilities" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig2 = plot_model_probabilities(\n", " ln_Zs=[ln_Z_linear, ln_Z_quadratic, ln_Z_cubic, ln_Z_quartic],\n", " labels=[\n", " \"linear (order 1)\",\n", " \"quadratic (order 2)\",\n", " \"cubic (order 3)\",\n", " \"quartic (order 4)\",\n", " ],\n", ")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Can you see if the conclusions change with different prior bounds that you could have reasonably picked?"
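,
    "\n",
    "A quick way to reason about this without rerunning anything (a sketch, valid when the likelihood is well contained inside the prior box): widening each of a model's $d$ uniform bounds by a factor $c$ multiplies the prior volume by $c^d$, so it subtracts roughly $d \\ln c$ from that model's $\\ln Z$:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "c = 2.0  # e.g. widen prior_bound from 4 to 8\n",
    "for name, d in [('linear', 2), ('quadratic', 3), ('cubic', 4), ('quartic', 5)]:\n",
    "    print(name, 'ln Z shifts by about', -d * np.log(c))\n",
    "```\n",
    "\n",
    "Higher-order models pay the larger penalty, so wider priors push the Occam factor further against more complex models.\n"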
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 4. Real World Example - Radiata Pines\n", "\n", "Let's now consider a real world example using everyone's favorite New Zealand tree!\n", "\n", "## The Radiata Pines Dataset\n", "\n", "Radiata pine (*Pinus radiata*) is New Zealand's most important commercial timber species. These fast-growing trees are a cornerstone of New Zealand's forestry industry. \n", "\n", "While you're here for MaxEnt 2025 in Auckland, you might go on a hike and spot a radiata pine. You might even wonder: \"Is density or resin-adjusted density a better predictor of compression strength parallel to the grain?\" Well, today's your lucky day - we're going to use Bayesian model selection to answer exactly that question!\n", "\n", "The dataset consists of measurements of the maximum compression strength parallel to the grain $y_i$, density $x_i$ and resin-adjusted density $z_i$, for specimen $i \\in \\{1, \\ldots, n\\}$. Two Gaussian linear models are compared, one with density and one with resin-adjusted density as variables:\n", "\\begin{align*}\n", "\\text{Model } M_1 &: \\quad y_i = \\alpha + \\beta(x_i - \\bar{x}) + \\epsilon_i, & \\epsilon_i &\\sim \\text{N}(0, \\tau^{-1}); \\\\\n", "\\text{Model } M_2 &: \\quad y_i = \\gamma + \\delta(z_i - \\bar{z}) + \\eta_i, & \\eta_i &\\sim \\text{N}(0, \\lambda^{-1}),\n", "\\end{align*}\n", "where $\\bar{x}$, $\\bar{z}$ denote the mean values of $x_i$ and $z_i$ respectively, and $\\tau$ and $\\lambda$ denote the precision of the noise for the respective models. For both models, Gaussian priors with means $\\mu_\\alpha = 3000$ and $\\mu_\\beta = 185$, and precision scales $r_0 = 0.06$ and $s_0 = 6$ are chosen. A gamma prior is assumed for the noise precision with shape $a_0 = 3$ and rate $b_0 = 2 \\times 300^2$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's first get the dataset." 
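] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we do, here is a quick sanity check on the scale of the Gamma prior on the noise precision $\\tau$ (a sketch that just restates the prior constants above). The implied noise standard deviation $1/\\sqrt{\\tau}$ at the prior mean is roughly $245$, a sensible scale for strengths of order a few thousand:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [],
"source": [ "from scipy.stats import gamma\n",
"\n",
"# Gamma prior on the noise precision tau: shape a_0 = 3, rate b_0 = 2 * 300**2.\n",
"# scipy parameterizes the gamma distribution by shape and scale = 1 / rate.\n",
"tau_prior = gamma(a=3.0, scale=1.0 / (2.0 * 300**2))\n",
"print(\"prior mean of tau:\", tau_prior.mean())  # a_0 / b_0\n",
"print(\"prior std of tau:\", tau_prior.std())  # sqrt(a_0) / b_0\n",
"print(\"implied noise scale 1/sqrt(E[tau]):\", 1.0 / np.sqrt(tau_prior.mean()))"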
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url = \"https://raw.githubusercontent.com/astro-informatics/harmonic/refs/heads/main/examples/data/radiata_pine.dat\"\n", "\n", "radiata_data = np.loadtxt(url)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's define the necessary densities." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ===========================================================================\n", "# Set-up Priors\n", "# ===========================================================================\n", "# Define prior variables\n", "mu_0 = np.array([[3000.0], [185.0]])\n", "r_0 = 0.06\n", "s_0 = 6.0\n", "a_0 = 3.0\n", "b_0 = 2.0 * 300**2\n", "\n", "# Imports data file\n", "id = radiata_data[:, 0]\n", "y = radiata_data[:, 1]\n", "x = radiata_data[:, 2]\n", "z = radiata_data[:, 3]\n", "n = len(x)\n", "\n", "# Ensure column vectors\n", "y = y.reshape(n, 1)\n", "x = x.reshape(n, 1)\n", "z = z.reshape(n, 1)\n", "\n", "# Remove means from covariates.\n", "x = x - np.mean(x)\n", "z = z - np.mean(z)\n", "\n", "def ln_likelihood_pine(y, x, n, alpha, beta, tau):\n", " \"\"\"Compute log_e of Radiata Pine likelihood.\n", "\n", " Args:\n", "\n", " y: Compression strength along grain.\n", "\n", " x: Predictor (density or density adjusted for resin content).\n", "\n", " alpha: Model bias term.\n", "\n", " beta: Model linear term.\n", "\n", " tau: Prior precision factor.\n", "\n", " Returns:\n", "\n", " double: Value of log_e likelihood at specified point in parameter\n", " space.\n", "\n", " \"\"\"\n", "\n", " ln_like = 0.5 * n * np.log(tau)\n", " ln_like -= 0.5 * n * np.log(2.0 * np.pi)\n", "\n", " s = np.sum((y - alpha - beta * x) ** 2)\n", "\n", " ln_like -= 0.5 * tau * s\n", "\n", " return ln_like\n", "\n", "\n", "def ln_prior_alpha(alpha, tau, mu_0, r_0):\n", " \"\"\"Compute log_e of alpha / beta prior (Normal prior).\n", "\n", " Args:\n", 
"\n", " alpha: Model term (bias or linear term).\n", "\n", " tau: Prior precision factor.\n", "\n", " mu_0: Prior mean.\n", "\n", " r_0: Prior precision constant factor.\n", "\n", " Returns:\n", "\n", " double: Value of log_e prior at specified point in parameter space.\n", "\n", " \"\"\"\n", " ln_pr_alpha = 0.5 * np.log(tau)\n", " ln_pr_alpha += 0.5 * np.log(r_0)\n", " ln_pr_alpha -= 0.5 * np.log(2.0 * np.pi)\n", " ln_pr_alpha -= 0.5 * tau * r_0 * (alpha - mu_0) ** 2\n", "\n", " return ln_pr_alpha\n", "\n", "\n", "def ln_prior_tau(tau, a_0, b_0):\n", " \"\"\"Compute log_e of tau prior (Gamma prior).\n", "\n", " Args:\n", "\n", " tau: Prior precision factor.\n", "\n", " a_0: Gamma prior shape parameter.\n", "\n", " b_0: Gamma prior rate parameter.\n", "\n", " Returns:\n", "\n", " double: Value of log_e tau prior at specified point in parameter\n", " space.\n", "\n", " \"\"\"\n", "\n", " if tau < 0:\n", " return -np.inf\n", "\n", " ln_pr_tau = a_0 * np.log(b_0)\n", " ln_pr_tau += (a_0 - 1.0) * np.log(tau)\n", " ln_pr_tau -= b_0 * tau\n", " ln_pr_tau -= sp.gammaln(a_0)\n", "\n", " return ln_pr_tau\n", "\n", "\n", "def ln_prior_separated(alpha, beta, tau, mu_0, r_0, s_0, a_0, b_0):\n", " \"\"\"Compute log_e of prior (combining individual prior functions).\n", "\n", " Args:\n", "\n", " alpha: Model bias term.\n", "\n", " beta: Model linear term.\n", "\n", " tau: Prior precision factor.\n", "\n", " mu_0: Prior means.\n", "\n", " r_0: Prior precision constant factor for bias term.\n", "\n", " s_0: Prior precision constant factor for linear term.\n", "\n", " a_0: Gamma prior shape parameter.\n", "\n", " b_0: Gamma prior rate parameter.\n", "\n", " Returns:\n", "\n", " double: Value of log_e prior at specified point in parameter space.\n", "\n", " \"\"\"\n", " ln_pr = ln_prior_alpha(alpha, tau, mu_0[0, 0], r_0)\n", " ln_pr += ln_prior_alpha(beta, tau, mu_0[1, 0], s_0)\n", " ln_pr += ln_prior_tau(tau, a_0, b_0)\n", "\n", " return ln_pr\n", "\n", "\n", "def 
ln_prior_combined(alpha, beta, tau, mu_0, r_0, s_0, a_0, b_0):\n",
"    \"\"\"Compute log_e of combined prior (jointly computing total prior).\n",
"\n",
"    Args:\n",
"\n",
"        alpha: Model bias term.\n",
"\n",
"        beta: Model linear term.\n",
"\n",
"        tau: Prior precision factor.\n",
"\n",
"        mu_0: Prior means.\n",
"\n",
"        r_0: Prior precision constant factor for bias term.\n",
"\n",
"        s_0: Prior precision constant factor for linear term.\n",
"\n",
"        a_0: Gamma prior shape parameter.\n",
"\n",
"        b_0: Gamma prior rate parameter.\n",
"\n",
"    Returns:\n",
"\n",
"        double: Value of log_e prior at specified point in parameter space.\n",
"\n",
"    \"\"\"\n",
"    if tau <= 0:\n",
"        return -np.inf\n",
"\n",
"    ln_pr = a_0 * np.log(b_0)\n",
"    ln_pr += a_0 * np.log(tau)\n",
"    ln_pr -= b_0 * tau\n",
"    ln_pr -= np.log(2.0 * np.pi)\n",
"    ln_pr -= sp.gammaln(a_0)\n",
"    ln_pr += 0.5 * np.log(r_0)\n",
"    ln_pr += 0.5 * np.log(s_0)\n",
"    ln_pr -= (\n",
"        0.5 * tau * (r_0 * (alpha - mu_0[0, 0]) ** 2 + s_0 * (beta - mu_0[1, 0]) ** 2)\n",
"    )\n",
"\n",
"    return ln_pr\n",
"\n",
"\n",
"def ln_prior_pine(alpha, beta, tau, mu_0, r_0, s_0, a_0, b_0):\n",
"    \"\"\"Compute log_e of the prior.\n",
"\n",
"    Can be used to easily switch between prior implementations (e.g.\n",
"    ln_prior_separated or ln_prior_combined). There should be (and is) no\n",
"    difference (both are implemented just as an additional consistency check).\n",
"\n",
"    Args:\n",
"\n",
"        alpha: Model bias term.\n",
"\n",
"        beta: Model linear term.\n",
"\n",
"        tau: Prior precision factor.\n",
"\n",
"        mu_0: Prior means.\n",
"\n",
"        r_0: Prior precision constant factor for bias term.\n",
"\n",
"        s_0: Prior precision constant factor for linear term.\n",
"\n",
"        a_0: Gamma prior shape parameter.\n",
"\n",
"        b_0: Gamma prior rate parameter.\n",
"\n",
"    Returns:\n",
"\n",
"        double: Value of log_e prior at specified point in parameter space.\n",
"\n",
"    \"\"\"\n",
"\n",
"    return ln_prior_combined(alpha, beta, tau, mu_0, r_0, s_0, a_0, b_0)\n",
"\n",
"\n",
"def ln_posterior_pine(theta, y, x, n, mu_0, r_0, s_0, a_0, b_0):\n",
"    \"\"\"Compute log_e of posterior.\n",
"\n",
"    Args:\n",
"\n",
"        theta: Position (alpha, beta, tau) at which to evaluate posterior.\n",
"\n",
"        y: Compression strength along grain.\n",
"\n",
"        x: Predictor (density or density adjusted for resin content).\n",
"\n",
"        n: Number of specimens.\n",
"\n",
"        mu_0: Prior means.\n",
"\n",
"        r_0: Prior precision constant factor for bias term.\n",
"\n",
"        s_0: Prior precision constant factor for linear term.\n",
"\n",
"        a_0: Gamma prior shape parameter.\n",
"\n",
"        b_0: Gamma prior rate parameter.\n",
"\n",
"    Returns:\n",
"\n",
"        double: Value of log_e posterior at specified theta (alpha, beta,\n",
"            tau) point.\n",
"\n",
"    \"\"\"\n",
"\n",
"    alpha, beta, tau = theta\n",
"\n",
"    ln_pr = ln_prior_pine(alpha, beta, tau, mu_0, r_0, s_0, a_0, b_0)\n",
"\n",
"    if not np.isfinite(ln_pr):\n",
"        return -np.inf\n",
"\n",
"    ln_L = ln_likelihood_pine(y, x, n, alpha, beta, tau)\n",
"\n",
"    return ln_L + ln_pr\n",
"\n",
"\n",
"def ln_evidence_analytic_pine(x, y, n, mu_0, r_0, s_0, a_0, b_0):\n",
"    \"\"\"Compute log_e of analytic evidence.\n",
"\n",
"    Args:\n",
"\n",
"        x: Predictor (density or density adjusted for resin content).\n",
"\n",
"        y: Compression strength along 
grain.\n", "\n", " n: Number of specimens.\n", "\n", " mu_0: Prior means.\n", "\n", " r_0: Prior precision constant factor for bias term.\n", "\n", " s_0: Prior precision constant factor for linear term.\n", "\n", " a_0: Gamma prior shape parameter.\n", "\n", " b_0: Gamma prior rate parameter.\n", "\n", " Returns:\n", "\n", " double: Value of log_e of analytic evidence for model.\n", "\n", " \"\"\"\n", "\n", " Q_0 = np.diag([r_0, s_0])\n", " X = np.c_[np.ones((n, 1)), x]\n", " M = X.T.dot(X) + Q_0\n", " nu_0 = np.linalg.inv(M).dot(X.T.dot(y) + Q_0.dot(mu_0))\n", "\n", " quad_terms = y.T.dot(y) + mu_0.T.dot(Q_0).dot(mu_0) - nu_0.T.dot(M).dot(nu_0)\n", "\n", " ln_evidence = -0.5 * n * np.log(np.pi)\n", " ln_evidence += a_0 * np.log(2.0 * b_0)\n", " ln_evidence += sp.gammaln(0.5 * n + a_0) - sp.gammaln(a_0)\n", " ln_evidence += 0.5 * np.log(np.linalg.det(Q_0)) - 0.5 * np.log(np.linalg.det(M))\n", "\n", " ln_evidence += -(0.5 * n + a_0) * np.log(quad_terms + 2.0 * b_0)\n", "\n", " return ln_evidence" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get our samples from the posterior." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_radiata_samples(model_1=True, nchains=60, samples_per_chain=1500, nburn=1000):\n", " \"\"\"\n", " Run MCMC sampling for radiata pine regression models.\n", " \n", " Args:\n", " model_1: If True, use density (x) as predictor. 
\n", " If False, use resin-adjusted density (z) as predictor.\n", " nchains: Number of MCMC chains to run.\n", " samples_per_chain: Total number of samples per chain (including burn-in).\n", " nburn: Number of burn-in samples to discard.\n", " \n", " Returns:\n", " samples: Posterior samples array of shape (nchains, nsamples, ndim)\n", " lnprob: Log probability values of shape (nchains, nsamples)\n", " \"\"\"\n", "\n", " # Set up and run sampler.\n", " tau_prior_mean = a_0 / b_0\n", " tau_prior_std = np.sqrt(a_0) / b_0\n", "\n", " # ===========================================================================\n", " # Compute random positions to draw from for emcee sampler.\n", " # ===========================================================================\n", " # Initial positions for each chain drawn from priors\n", " pos_alpha = mu_0[0, 0] + 1.0 / np.sqrt(tau_prior_mean * r_0) * np.random.randn(\n", " nchains\n", " )\n", " pos_beta = mu_0[1, 0] + 1.0 / np.sqrt(tau_prior_mean * s_0) * np.random.randn(\n", " nchains\n", " )\n", " pos_tau = tau_prior_mean + tau_prior_std * (\n", " np.random.rand(nchains) - 0.5\n", " ) # avoid negative tau\n", "\n", " # Concatenate positions\n", " pos = np.c_[pos_alpha, pos_beta, pos_tau]\n", "\n", " # ===========================================================================\n", " # Run Emcee to recover posterior samples\n", " # ===========================================================================\n", " # Select appropriate predictor variable\n", " ndim = 3 # alpha, beta, tau\n", " if model_1:\n", " args = (y, x, n, mu_0, r_0, s_0, a_0, b_0)\n", " else:\n", " args = (y, z, n, mu_0, r_0, s_0, a_0, b_0)\n", " \n", " sampler = emcee.EnsembleSampler(nchains, ndim, ln_posterior_pine, args=args)\n", " rstate = np.random.get_state()\n", " sampler.run_mcmc(pos, samples_per_chain, rstate0=rstate)\n", " \n", " # Extract samples after burn-in\n", " samples = np.ascontiguousarray(sampler.chain[:, nburn:, :])\n", " lnprob = 
np.ascontiguousarray(sampler.lnprobability[:, nburn:])\n", " \n", " return samples, lnprob" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Train the flow..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Which model we use\n", "model_1 = True\n", "# initialise random seed\n", "np.random.seed(26)\n", "\n", "samples_pine, lnprob_pine = get_radiata_samples(model_1=model_1)\n", "print(\"Pine samples shape:\", samples_pine.shape)\n", "print(\"Pine lnprob shape:\", lnprob_pine.shape)\n", "chains_pine = hm.Chains(3)\n", "chains_pine.add_chains_3d(samples_pine, lnprob_pine)\n", "chains_train_pine, chains_infer_pine = hm.utils.split_data(\n", " chains_pine, training_proportion=0.5\n", ")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Spline parameters\n", "temperature = 0.99\n", "epochs_num = 5\n", "n_layers = 10\n", "n_bins = 8\n", "hidden_size = [64, 64]\n", "spline_range = (-10.0, 10.0)\n", "standardize = True\n", "learning_rate = 0.1\n", "model_pine = hm.model.RQSplineModel(\n", " 3,\n", " n_layers=n_layers,\n", " n_bins=n_bins,\n", " hidden_size=hidden_size,\n", " spline_range=spline_range,\n", " standardize=standardize,\n", " temperature=temperature,\n", " learning_rate=learning_rate,\n", " )\n", "model_pine.fit(chains_train_pine.samples, epochs=epochs_num, verbose=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Instantiate harmonic's evidence class\n", "ev_pine = hm.Evidence(chains_infer_pine.nchains, model_pine)\n", "\n", "# Pass the evidence class the inference chains and compute the evidence!\n", "ev_pine.add_chains(chains_infer_pine)\n", "ln_inv_evidence_pine = ev_pine.ln_evidence_inv\n", "err_ln_inv_evidence_pine = ev_pine.compute_ln_inv_evidence_errors()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at our flow!" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_samp = chains_train_pine.samples.shape[0]\n", "samps_compressed_pine = np.array(model_pine.sample(num_samp))\n", "\n", "hm.utils.plot_getdist_compare(chains_train_pine.samples, samps_compressed_pine)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And check the diagnostics..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('ln inverse evidence (harmonic) = {} +/- {}'.format(ln_inv_evidence_pine, err_ln_inv_evidence_pine))\n", "print(\"kurtosis = {}\".format(ev_pine.kurtosis))\n", "check = np.exp(0.5 * ev_pine.ln_evidence_inv_var_var - ev_pine.ln_evidence_inv_var)\n", "print(\"sqrt(evidence_inv_var_var) / evidence_inv_var = {}\".format(check))\n", "print(\"Aim for sqrt( 2/(n_eff-1) ) = {}\".format(np.sqrt(2.0 / (ev_pine.n_eff - 1))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Does this look good? Hmmm... Can you fix it?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if model_1:\n", " print(\"Using Model 1 (density as predictor)\")\n", " ln_evidence_analytic_model1 = ln_evidence_analytic_pine(\n", " x, y, n, mu_0, r_0, s_0, a_0, b_0\n", " )\n", " print(\n", " \"ln_evidence_analytic_model1 = {}\".format(ln_evidence_analytic_model1[0][0])\n", " )\n", "else:\n", " print(\"Using Model 2 (resin-adjusted density as predictor)\")\n", " ln_evidence_analytic_model2 = ln_evidence_analytic_pine(\n", " z, y, n, mu_0, r_0, s_0, a_0, b_0\n", " )\n", " print(\n", " \"ln_evidence_analytic_model2 = {}\".format(ln_evidence_analytic_model2[0][0])\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Can you calculate the Bayes factor and decide which model is better?" 
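] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a hint: the Bayes factor between the models is $\\text{BF}_{12} = Z_1 / Z_2$, so $\\ln \\text{BF}_{12} = \\ln Z_1 - \\ln Z_2$. Since harmonic returns the log *inverse* evidence, $\\ln \\text{BF}_{12} = \\ln Z_2^{-1} - \\ln Z_1^{-1}$. A sketch with placeholder numbers (the log inverse evidences below are illustrative, not computed results - substitute the values you obtain for each model):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [],
"source": [ "# Placeholder log inverse evidences for the two models (illustrative only;\n",
"# substitute the values harmonic computes for M1 and M2).\n",
"ln_inv_evidence_m1 = 310.0\n",
"ln_inv_evidence_m2 = 302.0\n",
"\n",
"# ln BF_12 = ln Z_1 - ln Z_2 = ln(1/Z_2) - ln(1/Z_1)\n",
"ln_bf_12 = ln_inv_evidence_m2 - ln_inv_evidence_m1\n",
"print(\"ln BF_12 =\", ln_bf_12)\n",
"print(\"Evidence favours\", \"M1 (density)\" if ln_bf_12 > 0 else \"M2 (resin-adjusted density)\")"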
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#YOUR CODE HERE" ] } ], "metadata": { "colab": { "authorship_tag": "ABX9TyOuisDf5/8oQjWLVUhGuTZA", "collapsed_sections": [], "include_colab_link": true, "name": "basic_usage.ipynb", "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.2" } }, "nbformat": 4, "nbformat_minor": 0 }