I am using the pymc3 package for Bayesian modelling in Python. Below is my code:
import numpy as np
import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    obs = pm.Normal("obs", mu=mu, sigma=1, observed=np.random.randn(100))

model.logp({"mu": 0})
The logp method above gives a result of array(-149.24174903).
Could you please help me understand what this number refers to? Is it the log-likelihood? I also checked with the code below, but could not match the number:
import scipy.stats
import numpy as np
np.log(scipy.stats.norm(0, 1).pdf(0))  # -0.9189385332046727
CodePudding user response:
The logp method gives you the unnormalized log posterior, i.e. the log of the numerator of Bayes' rule. Recall that the posterior is proportional to the product of the prior and the likelihood, so the log posterior equals the sum of the log prior and the log likelihood, up to an additive constant. In other words, to reproduce the output of logp you also have to account for the likelihood, not only the prior. You can check it like this:
import pymc3 as pm
import scipy.stats
import numpy as np

# declare the observed data up front so we can reuse it below
data = np.random.randn(100)

# the parameter value at which to evaluate the unnormalized log posterior
fixed_mu = 0

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    obs = pm.Normal("obs", mu=mu, sigma=1, observed=data)

# unnormalized log posterior as given by pymc3
pm_log_posterior = model.logp({"mu": fixed_mu})

# log prior as given by scipy
np_log_prior = scipy.stats.norm(0, 1).logpdf(fixed_mu)

# log likelihood as given by scipy
np_log_likelihood = scipy.stats.norm(fixed_mu, 1).logpdf(data).sum()

# the unnormalized log posterior is their sum
np_log_posterior = np_log_likelihood + np_log_prior
Now np_log_posterior and pm_log_posterior should be the same.
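The same decomposition can be checked without pymc3 at all. Below is a minimal sketch (assuming a seeded dataset, so the numbers are reproducible on re-runs) that computes the unnormalized log posterior once via scipy and once by writing out the Normal log-density formula by hand; the two agree to floating-point precision:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.standard_normal(100)
fixed_mu = 0.0

# log prior: density of mu under Normal(0, 1)
log_prior = stats.norm(0, 1).logpdf(fixed_mu)

# log likelihood: sum over the 100 observations under Normal(fixed_mu, 1)
log_likelihood = stats.norm(fixed_mu, 1).logpdf(data).sum()

# unnormalized log posterior: log prior + log likelihood
log_posterior = log_prior + log_likelihood

# the same quantity written out from the Normal log-density,
# log N(x | m, 1) = -0.5*log(2*pi) - 0.5*(x - m)**2
manual = (-0.5 * np.log(2 * np.pi) - 0.5 * fixed_mu**2) \
    + np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (data - fixed_mu)**2)

assert np.isclose(log_posterior, manual)
```

The exact value differs from -149.24174903 because the 100 data points are drawn afresh; what matters is that both computations of the sum agree.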