Jika Anda berniat melakukan banyak statistik Bayesian, Anda akan merasa terbantu untuk mempelajari bahasa BUGS / JAGS, yang dapat diakses dalam R melalui paket R2OpenBUGS atau R2WinBUGS.
Namun, demi contoh cepat yang tidak memerlukan pemahaman sintaks BUGS, Anda bisa menggunakan paket "bayesm" yang memiliki fungsi runiregGibbs untuk pengambilan sampel dari distribusi posterior. Berikut adalah contoh dengan data yang mirip dengan yang Anda jelaskan .....
library(bayesm)
podwt <- structure(list(wt = c(1.76, 1.45, 1.03, 1.53, 2.34, 1.96, 1.79, 1.21, 0.49, 0.85, 1, 1.54, 1.01, 0.75, 2.11, 0.92), treat = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("I", "U"), class = "factor"), mus = c(4.15, 2.76, 1.77, 3.11, 4.65, 3.46, 3.75, 2.04, 1.25, 2.39, 2.54, 3.41, 1.27, 1.26, 3.87, 1.01)), .Names = c("wt", "treat", "mus"), row.names = c(NA, -16L), class = "data.frame")
# response
y1 <- podwt$wt
# First run a one-way anova
# Create the design matrix - need to insert a column of 1s
x1 <- cbind(matrix(1,nrow(podwt),1),podwt$treat)
# data for the Bayesian analysis
dt1 <- list(y=y1,X=x1)
# runiregGibbs uses a normal prior for the regression coefficients and
# an inverse chi-squared prior for va
# mean of the normal prior. We have 2 estimates - 1 intercept
# and 1 regression coefficient
betabar1 <- c(0,0)
# Pecision matrix for the normal prior. Again we have 2
A1 <- 0.01 * diag(2)
# note this is a very diffuse prior
# degrees of freedom for the inverse chi-square prior
n1 <- 3
# scale parameter for the inverse chi-square prior
ssq1 <- var(y1)
Prior1 <- list(betabar=betabar1, A=A1, nu=n1, ssq=ssq1)
# number of iterations of the Gibbs sampler
iter <- 10000
# thinning/slicing parameter. 1 means we keep all all values
slice <- 1
MCMC <- list(R=iter, keep=slice)
sim1 <- runiregGibbs(dt1, Prior1, MCMC)
plot(sim1$betadraw)
plot(sim1$sigmasqdraw)
summary(sim1$betadraw)
summary(sim1$sigmasqdraw)
# compare with maximum likelihood estimates:
fitpodwt <- lm(wt~treat, data=podwt)
summary(fitpodwt)
anova(fitpodwt)
# now for ordinary linear regression
x2 <- cbind(matrix(1,nrow(podwt),1),podwt$mus)
dt2 <- list(y=y1,X=x2)
sim2 <- runiregGibbs(dt1, Prior1, MCMC)
summary(sim1$betadraw)
summary(sim1$sigmasqdraw)
plot(sim$betadraw)
plot(sim$sigmasqdraw)
# compare with maximum likelihood estimates:
summary(lm(podwt$wt~mus,data=podwt))
# now with both variables
x3 <- cbind(matrix(1,nrow(podwt),1),podwt$treat,podwt$mus)
dt3 <- list(y=y1,X=x3)
# now we have an additional estimate so modify the prior accordingly
betabar1 <- c(0,0,0)
A1 <- 0.01 * diag(3)
Prior1 <- list(betabar=betabar1, A=A1, nu=n1, ssq=ssq1)
sim3 <- runiregGibbs(dt3, Prior1, MCMC)
plot(sim3$betadraw)
plot(sim3$sigmasqdraw)
summary(sim3$betadraw)
summary(sim3$sigmasqdraw)
# compare with maximum likelihood estimates:
summary(lm(podwt$wt~treat+mus,data=podwt))
Ekstrak dari output adalah:
Anova:
Bayesian:
Summary of Posterior Marginal Distributions
Moments
mean std dev num se rel eff sam size
1 2.18 0.40 0.0042 0.99 9000
2 -0.55 0.25 0.0025 0.87 9000
Quantiles
2.5% 5% 50% 95% 97.5%
1 1.4 1.51 2.18 2.83 2.976
2 -1.1 -0.97 -0.55 -0.13 -0.041
lm ():
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.6338 0.1651 9.895 1.06e-07 ***
treatU -0.5500 0.2335 -2.355 0.0336 *
Regresi linier sederhana:
Bayesian:
Summary of Posterior Marginal Distributions
Moments
mean std dev num se rel eff sam size
1 0.23 0.208 0.00222 1.0 4500
2 0.42 0.072 0.00082 1.2 4500
Quantiles
2.5% 5% 50% 95% 97.5%
1 -0.18 -0.10 0.23 0.56 0.63
2 0.28 0.31 0.42 0.54 0.56
lm ():
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.23330 0.14272 1.635 0.124
mus 0.42181 0.04931 8.554 6.23e-07 ***
2 model kovariat:
Bayesian:
Summary of Posterior Marginal Distributions
Moments
mean std dev num se rel eff sam size
1 0.48 0.437 0.00520 1.3 4500
2 -0.12 0.184 0.00221 1.3 4500
3 0.40 0.083 0.00094 1.2 4500
Quantiles
2.5% 5% 50% 95% 97.5%
1 -0.41 -0.24 0.48 1.18 1.35
2 -0.48 -0.42 -0.12 0.18 0.25
3 0.23 0.26 0.40 0.53 0.56
lm ():
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.36242 0.19794 1.831 0.0901 .
treatU -0.11995 0.12688 -0.945 0.3617
mus 0.39590 0.05658 6.997 9.39e-06 ***
dari mana kita dapat melihat bahwa hasilnya sebanding secara luas, seperti yang diharapkan dengan model-model sederhana dan prior diffuse. Tentu saja ada baiknya juga memeriksa plot diagnostik MCMC - kepadatan posterior, plot jejak, korelasi otomatis - yang saya juga berikan kode di atas yang mana (plot tidak ditampilkan).