Issues of multiplicity in testing are increasingly being encountered in a wide range
of disciplines, as the growing complexity of data allows for consideration of a multitude of possible hypotheses (e.g., does gene xyz affect condition abc). Failure to
properly adjust for multiplicities is being blamed for the apparently increasing lack
of reproducibility in science. The main purpose of this presentation is to review
the different types of multiplicities that are encountered, and to discuss the general approaches to dealing with them that are being adopted by Bayesians. Issues
that I found surprising will be highlighted, such as the fact that empirical Bayesian
approaches to multiplicity adjustment can be seriously flawed.
Applied statisticians often have to deal with modelling discrete-valued response variables,
in particular binary, multinomial or count data, in terms of covariates. These
models typically take the form of (dynamic) generalized linear models involving
latent variables, like mixed effect models, or state space models. Parameter estimation
for these types of models is known to be computationally demanding, and
sophisticated numerical techniques such as importance sampling or
Metropolis-Hastings algorithms have been applied.
This talk will review a simple method for the MCMC estimation of such models
based on auxiliary mixture sampling. Auxiliary mixture sampling allows straightforward
estimation for rather general parameter-driven models for discrete-valued
data like random effect models, mixture models or state space models.
Furthermore, it will be demonstrated that auxiliary mixture sampling is particularly
useful for implementing model space MCMC methods for such models. Applications
to variable selection for logistic regression models, covariance selection
for non-Gaussian random effects models and model specification for a Poisson state
space model will be discussed.
A Theory of (Un)congeniality (between Bayesians and
Frequentists?) Xiao-Li Meng
A grand challenge in producing public-use databases is that the models/assumptions
used to “clean up” the raw data cannot possibly be compatible with all subsequent
models or assumptions adopted by the users of the database. This challenge requires
us to rethink the usual “My model” vs. “God’s Model” paradigm, because there is
a third model: the one adopted by the “data cleaner.” A concrete example of this
arises in the context of using multiple imputation to “fix the holes” in raw data.
Multiple imputation, in general, is best done by the data collector via posterior prediction,
which properly takes into account the uncertainty in predicting the missing
values. Yet many users of the resulting data do not even consider a likelihood,
let alone Bayesian modeling, but rather employ a design-based complete-data procedure.
This talk first reviews the concept of congeniality (Meng, 1994, Statistical
Science) for embedding the user’s procedure into a Bayesian model and hence making
it possible to study the incompatibility (i.e., uncongeniality) between the Bayesian
imputation model and the frequentist analysis procedure. We then present a newly
established theoretical framework for quantifying the impact of uncongeniality on
the resulting multiple imputation inference: an offspring of the uncongenial but
necessary marriage between Bayesian (via the imputer’s model) and frequentist (via
the user’s analysis procedure) machineries. (This is joint work with Xianchao Xie
of Harvard University.)
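For reference, the standard combining rules for multiple-imputation inference (Rubin's rules) can be sketched as below. This is only a generic illustration of how a user pools completed-data analyses, not the theoretical framework of the talk; the function name `combine_imputations` and the numbers are hypothetical.

```python
from statistics import mean, variance

def combine_imputations(estimates, variances):
    """Combine m completed-data analyses with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: the corresponding complete-data variance estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m >= 2 matching estimates and variances")
    q_bar = mean(estimates)          # pooled point estimate
    w_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (divisor m - 1)
    total = w_bar + (1 + 1 / m) * b  # total variance
    return q_bar, total

# Hypothetical numbers, for illustration only.
q_bar, total = combine_imputations([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to the missing values; how well the total variance behaves when the imputer's and the user's models disagree is the question the abstract raises.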
Elicitation Tony O’Hagan
This presentation will cover some recent research in the area of eliciting probability
distributions. Although subjective prior distributions lie at the heart of Bayesian
statistics, eliciting them has often received far less attention than it deserves. The
first part of the talk will address the process of elicitation, and will illustrate a
package called SHELF that is designed to help with conducting an elicitation session
in a sound and transparent way. SHELF deals only with eliciting distributions for
single variables, and the second part of the talk will look at the issues around
multivariate elicitation.
Bayesian inference for diffusions Gareth Roberts
This talk will survey recent advances in the area of Bayesian inference for diffusions, focusing mainly on the case of discretely observed data. Methodology relies
heavily on MCMC, and one focus will be on the imputation of unobserved diffusion
bridges running between consecutive observations. The use of “exact” (i.e., free from
discretisation error) methods and “perfect exact” methods will also be described.
Theory of MCMC: What is it Good For Jeffrey Rosenthal
MCMC algorithms are widely used in Bayesian analysis, but the Markov chain
theory that underlies their validity and effectiveness is often ignored. Is this absence
of theory a good thing? A great tragedy? Somewhere in between?
This talk will discuss ways in which theoretical considerations can genuinely improve the application of MCMC algorithms to statistical inference problems. We
will consider such topics as: ergodicity via phi-irreducibility; geometric ergodicity;
central limit theorems; efficiency orderings; optimal scaling; and adaptive MCMC.
The ideas will be illustrated with simple examples, including live Java simulations.
The emphasis throughout will be on ways in which applied users of MCMC might
– and might not – benefit from understanding the theory.
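As a small illustration of why proposal scaling matters, the sketch below runs a random-walk Metropolis sampler on a standard normal target with three proposal scales and records the acceptance rate. The target, the scales and the function name `rwm_acceptance_rate` are assumptions made for this example and are not taken from the talk.

```python
import math
import random

def rwm_acceptance_rate(scale, n_steps=20000, seed=0):
    """Random-walk Metropolis on a standard normal target.

    Returns the fraction of proposals accepted when the Gaussian
    proposal has standard deviation `scale`.
    """
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, scale)
        # Log target ratio for N(0, 1): -(y^2 - x^2) / 2
        log_ratio = 0.5 * (x * x - y * y)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
    return accepted / n_steps

# Tiny, moderate and very large proposal scales (illustrative choices).
rates = {s: rwm_acceptance_rate(s) for s in (0.1, 2.4, 50.0)}
```

Very small steps are accepted almost always but explore slowly, while very large steps are rarely accepted; optimal-scaling theory studies the trade-off between the two.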
Invited Speaker Abstracts
A Bayesian Semi-Parametric Survival Model with
Longitudinal Markers Kim-Anh Do, Peter Mueller and Song Zhang
We consider inference for data from a clinical trial of treatments for metastatic
prostate cancer. Patients joined the trial with diverse prior treatment histories.
The resulting heterogeneous patient population gives rise to challenging statistical
inference problems when trying to predict time to progression on different treatment
arms. Inference is further complicated by the need to include a longitudinal marker as
a covariate. We develop a semi-parametric model for joint inference on longitudinal
data and an event time, with the possibility that some patients are cured. The event
time distribution is based on a non-parametric Pólya tree prior. For the longitudinal
data we assume a mixed effects model. Incorporating a regression on covariates
in a non-parametric event time model in general, and for a Pólya tree model in
particular, is a challenging problem. We exploit the fact that the covariate itself
is a random variable. We achieve an implementation of the desired regression by
factoring the joint model into a marginal model for the event time and a regression
of the longitudinal outcomes on the event time, i.e., we implicitly model the desired
regression by modeling the reverse conditional distribution.
An adaptive Monte Carlo approach for statistical models with intractable normalizing constants Yves Atchade
This talk will report on a general adaptive MCMC strategy to handle statistical
models with intractable normalizing constants. The method can be thought of as an extension of the MCMC-MLE approach (Geyer and Thompson 1992) to handle similar models by maximum likelihood.
I will give some applications in image segmentation, social network inference and statistical protein design.
Hierarchical Bayesian Functional Data Analysis and its
applications Veera Baladandayuthapani
In many scientific applications, one observes responses measured over a grid of discrete values, termed functional observations. In this talk I will discuss a couple
of novel functional data analytic methods motivated by real oncology experiments.
In the first case, the responses are inherently functional and are hierarchically correlated. In the second case, it is of interest to model the effect of a functional covariate
on a scalar response.
Divergence Based Priors for Objective Bayesian Model
Selections M.J. Bayarri, G. García-Donato
One of the main difficulties for objective Bayesian model selection is that usual
objective priors cannot be used, since they are improper. They can be used for
parameters occurring in all of the models and having the same meaning across
models, but otherwise proper priors have to be used. This is true in particular for
the ‘extra’ parameter(s) in the alternative hypothesis.
There are several methods to derive objective but proper priors for model selection.
One of the most popular is Jeffreys’ proposal, later generalized by Zellner and Siow,
for the normal linear model context. Less well known is that Jeffreys pointed to some
ideas for extending this prior to general scenarios, perhaps because they were not really
pursued, and Jeffreys’ first attempt was not very successful.
In this talk we follow Jeffreys’ hint and develop (objective) proper prior distributions for hypothesis testing and model selection based on measures of divergence
between the competing models; we call them divergence based (DB) priors. DB
priors have a simple form and are shown to have desirable properties, like information
(finite sample) consistency; often, they are similar to other existing proposals like
the intrinsic priors; moreover, in normal linear models scenarios, they exactly reproduce Jeffreys-Zellner-Siow priors. Most importantly, in challenging scenarios such
as irregular models and mixture models, the DB priors are well defined and very
reasonable, while alternative proposals are not. We derive approximations to the DB
priors as well as MCMC and asymptotic expressions for the associated Bayes factors,
which also reveal interesting connections with other proposals (like the unit information priors). The paper is available at
http://ftp.stat.duke.edu/WorkingPapers/06-23.pdf
Analysis of Marked Point Patterns with Spatial and
Non-spatial Covariate Information Bradley P. Carlin, Shengde Liang and Alan Gelfand
The analysis of spatial point process data has historically been plagued by computational difficulties. Likelihoods feature intractable integrals that require approximation. This problem is exacerbated when such models are incorporated in a fully
hierarchical framework, since this nests the integrals within a Markov chain Monte
Carlo (MCMC) algorithm. We extend customary spatial point pattern analysis
in the context of a log-Gaussian Cox process model to accommodate spatially referenced covariates, individual-level risk factors, and individual-level covariates of
interest that mark the process. We also use multivariate process realizations to capture dependence among the intensity surfaces across the marks. We illustrate using
a collection of breast cancer case locations collected over the mostly rural northern
part of the state of Minnesota that are marked by their treatment selection, mastectomy or breast conserving surgery (“lumpectomy”), which is less disfiguring but
requires 6 weeks of follow-up radiation therapy. The key substantive covariate (driving distance to the nearest radiation treatment facility) is spatially referenced, but
other important covariates (notably age and stage) are not. Our approach facilitates
mapping and boundary analysis (“wombling”) of the marginal log-relative intensity
surfaces for the two treatment options, and resolves the issue of whether women who
face long driving distances are significantly more likely to opt for mastectomy while
still accounting for all sources of spatial and nonspatial variability in the data.
Bayesian Evaluation of Multiple-regime Nonlinear
Volatility Models Cathy W. S. Chen, Richard H. Gerlach and Ann M.H. Lin
A multiple-regime nonlinear volatility model with a fat-tailed error distribution is
discussed. Bayesian estimation and inference are considered for this model, as well
as Bayesian posterior model comparison among competing volatility models with
different numbers of regimes. An adaptive MCMC sampling scheme is designed
whose output achieves these goals. Our modeling framework provides a parsimonious
representation of well-known stylized features of financial time series and facilitates
statistical inference in the presence of high or explosive persistence and conditional
heteroskedasticity. We focus on the three-regime case: the main feature of the
model is allowing an explosive volatility regime and capturing mean and volatility
asymmetries in financial markets. We illustrate our findings via simulation and an
empirical study of eight international oil gas index markets. Most markets strongly
support the three-regime model over its competitors.
Importance Sampling of Word Patterns in DNA and
Protein Sequences Louis H. Y. Chen
The use of Monte Carlo evaluation to compute p-values of pattern counting test
statistics is especially attractive when an asymptotic theory is absent or when the
search sequence or the word pattern is too short for an asymptotic formula to be accurate. The drawback of applying Monte Carlo simulation directly is its inefficiency
when p-values are small, which is precisely the situation of interest. We provide a
general importance sampling algorithm for efficient Monte Carlo evaluation of small
p-values of pattern counting test statistics and apply it to word patterns of biological interest, in particular, palindromes and inverted repeats, patterns arising from
position specific weight matrices, as well as co-occurrences of pairs of motifs. We
also show that our importance sampling technique satisfies a log efficiency criterion.
This is joint work with Hock Peng Chan and Nancy Ruonan Zhang.
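The inefficiency of direct Monte Carlo for small p-values, and the gain from importance sampling, can be seen in a toy setting. The sketch below estimates a standard normal tail probability using a shifted normal proposal; it is a generic textbook example, not the word-pattern algorithm of the talk, and the function names and the threshold 5.0 are assumptions.

```python
import math
import random

def tail_prob_is(threshold, n=50000, seed=1):
    """Estimate P(Z > threshold), Z ~ N(0, 1), by importance sampling.

    Draws come from the shifted proposal N(threshold, 1); each draw that
    lands in the tail is weighted by the density ratio
    phi(x) / phi(x - threshold).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # phi(x) / phi(x - c) = exp(-c * x + c^2 / 2)
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n

def tail_prob_exact(threshold):
    """Exact normal tail probability via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))

estimate = tail_prob_is(5.0)
exact = tail_prob_exact(5.0)
```

With the same number of draws, direct simulation from N(0, 1) would almost never produce a value above 5, so its estimate would typically be exactly zero; the shifted proposal places about half of its draws in the tail.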
How Bayes is changing environmental science:
Application to climate change and the biodiversity paradox Jim Clark
The combined advantages of graphical modeling and Bayesian inference are transforming environmental science. Ecological processes are highly scale- and setting-
dependent and only indirectly related to observations. Ecological data are remarkably heterogeneous, ranging from photosynthetic rate measurements on leaves to
remote sensing of landscapes. By modern standards, the problems are not overwhelmingly large; they are frustratingly complex. I illustrate some of the potential
to address complex relationships, together with new issues that emerge, when modern tools are applied to these dynamic, highly connected systems. The application
concerns a long-standing effort to understand controls on forest diversity (described a
half century ago as a paradox) in the context of contemporary rapid climate change.
Theoretical models tell us that species must differ in specific ways in order to coexist as stable ecological communities. These differences must involve tradeoffs among
species to ensure that the best competitors do not drive all others to extinction.
Yet many coexisting species do not appear to possess such differences. The lack of
observable tradeoffs presents a paradox when taken in light of the fact that species
do indeed coexist in nature. A key challenge involves identifying the important differences among species that allow each to persist in the face of competition from
many others. In this talk I discuss why inconsistent assumptions of theoretical and
statistical models lead to the paradox. I suggest that coexistence is best understood
in terms of population heterogeneity, an old idea, but one that has not been properly
interpreted from data or theory. First, species differences occur along many axes and
are missed by traditional data modeling approaches. Second, theoretical (stochastic) models contain unrecognized species differences, which are simply hidden from
view. I show how a proper treatment of heterogeneity resolves the paradox. Species
differences responsible for coexistence are high-dimensional, and can be captured
in data and in models by admitting an appropriate structure for unknowns. Much
of the unexplained variation in data results from differences among individuals. It
occurs across a large number of axes, most of which will be hard to capture in simple experiments and observational data sets. By providing a consistent treatment
of information from many scales and complex, interacting processes, modern Bayes
allows a more integrated view. The example involves parameterization of the joint
distribution of demographic rates (fecundity, dispersal, growth, mortality) for all of
the trees in selected forests as coupled, non-linear state space models that accommodate the interactions among individuals and with their local environments. We are
fitting the forest and the trees. Prediction of biodiversity response to climate change
involves mixing over all sources of heterogeneity, allowing us to better anticipate
vulnerabilities. Specific issues include ways to organize an analysis that can involve
updating at several stages, how to weight information deriving from multiple data
sources, and efficient computation when conditional relationships are complex and
spatio-temporal.
Bayesian nonparametric variable selection and
hypothesis testing David Dunson
In a broad variety of applications, interest focuses on assessing the relationship between one or more response variables, Y, and predictors, X. Our interest focuses
on Bayesian methods for selecting predictors having any effect on the conditional
response distribution of Y without incorporating parametric assumptions, either in
the mean or the residual distribution. We first describe a variety of nonparametric
priors for the conditional distribution, including dependent Dirichlet processes and
kernel stick-breaking processes. These priors are then adapted to allow uncertainty
in the subset of the predictors to be included in the model, while also allowing
hypothesis testing. To accommodate one-sided hypothesis testing and estimation
problems in the nonparametric paradigm, we also describe methods for incorporating stochastic ordering constraints with continuous and/or categorical predictors.
Theoretical properties are discussed briefly and efficient MCMC algorithms are developed for posterior computation. The methods are illustrated using epidemiologic
applications.
Clustering and Variable Selection using Fisher
distributions Yanan Fan
High dimensional data such as those found in data mining and microarray gene
expression experiments are often inherently directional. We present a novel approach
to model-based clustering of high dimensional data via the use of a mixture of Fisher
distributions. The proposed method carries out simultaneous variable selection and
clustering. The resulting clustering depends on the amount of correlation between
observations given the selected variables. A Bayesian approach is adopted, where
the determination of the number of clusters, cluster allocation and variable selection
is carried out simultaneously via the use of trans-dimensional Markov chain Monte
Carlo.
Dynamic Nonlinear Quantile estimation with the
Asymmetric Laplace Distribution Richard Gerlach and Cathy W.S.Chen
Value-at-Risk (VaR) forecasting is required by all financial institutions via the Basel
II Capital Accord. The VaR is simply proportional to a quantile in the relevant forecast distribution. Engle and Manganelli (2004) proposed to use quantile regression
to model the VaR directly, introducing the Conditional Autoregressive Value at Risk
(CAViaR) model. Recent work shows that such dynamic quantile estimation is a
special case of maximising the asymmetric or skewed Laplace distribution likelihood.
The question then arises as to the feasibility of extending this result to likelihood,
and then Bayesian, inference; a question which has been partially answered in cross-sectional regression models by Yu and Moyeed (2001) and Geraci and Bottai (2007).
We extend this work by designing an adaptive MCMC sampling scheme, combining
random walk and independent Metropolis-Hastings methods, to provide and assess
parameter estimates and inference for CAViaR-type models, exploiting the skewed
Laplace pseudo-likelihood framework. Further, we extend the CAViaR framework
to include more flexible nonlinear models, e.g. to better capture asymmetry in financial markets. Simulations show promising results. Finally, we apply our
methods to a range of international stock market indices and provide a thorough
comparison with modern symmetric and nonlinear GARCH-type models, as well as
the popular RiskMetrics method, in terms of forecasting the VaR, and quantiles in
general, dynamically.
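The link between quantile estimation and the asymmetric (skewed) Laplace likelihood can be checked numerically in the simplest, static case. The sketch below compares the minimiser of the quantile-regression check loss with the maximiser of an asymmetric Laplace likelihood for a single location parameter. The data are made up, the scale is held fixed, and the function names are hypothetical; it does not implement a CAViaR model.

```python
import math

def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}) used in quantile regression."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def ald_neg_loglik(mu, data, tau, sigma=1.0):
    """Negative log-likelihood of an asymmetric Laplace model.

    Density: tau * (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma)).
    The scale sigma is held fixed here (an assumption of this sketch).
    """
    n = len(data)
    const = -n * math.log(tau * (1.0 - tau) / sigma)
    return const + sum(check_loss((y - mu) / sigma, tau) for y in data)

# Made-up data, for illustration only.
data = [0.3, -1.2, 2.5, 0.8, -0.4, 1.9, 0.1, -2.2, 1.1, 0.6]
tau = 0.25

# Both objectives are piecewise linear and convex in mu, so it is
# enough to search over the observed values.
by_check_loss = min(data, key=lambda m: sum(check_loss(y - m, tau) for y in data))
by_likelihood = min(data, key=lambda m: ald_neg_loglik(m, data, tau))
```

Because the negative log-likelihood differs from the summed check loss only by a constant when the scale is fixed, the two searches select the same value, here the third smallest observation.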
Models and Inference for Musical Structure Analysis Simon Godsill, Taylan Cemgil, Paul Peeling and Nick Whiteley
In this paper we describe recent advances in computer understanding of musical
audio signals. The objectives of the work are to extract high-level and meaningful
information from musical audio recordings in the form of such things as musical
pitch, timbre, timing and instrument identity. These tasks are of use in themselves,
but they also feed into other related tasks such as automated remixing, source
separation and score-based alignment. This is a highly complex class of problems,
and one which can currently only be performed accurately by trained musicians.
In our research we propose Bayesian hierarchical models which represent (at the
highest level) the musical score and at the lowest level generative models for the
measurement of raw audio data, as captured and digitised through one or more
microphones. The models attempt to capture the dynamics of the musical score in
a generic and musically meaningful fashion, favouring both likely transitions over
time and likely groupings of notes at a given time. These models are connected to
lower level signal models, based on either expensive time-domain oscillator models or
cheaper point process and Gaussian models in the time-frequency plane. Inference is
of course a complex task and we discuss adaptations of MCMC, variational methods
and sequential Monte Carlo work. For links to our recent work in this area see www-sigproc.eng.cam.ac.uk/~sjg and www-sigproc.eng.cam.ac.uk/~atc27.
Explorations in and with sparse latent factor analysis Peter J. Green and Sylvia Richardson
Classical latent factor analysis seeks to discover patterns of dependence in multivariate data that allow dimension reduction through the representation of the observed
variables as linear combinations of a smaller number of unobserved 'factors'. We are interested in finding sparse representations, in which there are many zero coefficients among the linear coefficients, in the interests of parsimony, interpretability, and statistical stability; we use a Bayesian hierarchical modelling approach.
Specifically, we examine the situation where there are two or more groups of variables, neither low in dimension, and the main interest is in discovering sparse representations of the dependence between them. We develop a strategy that structures
the patterns of dependence, and explores the model space allowing the numbers of common and specific factors to vary.
We are motivated by a study relating profiling of metabolites with transcript and
enzyme activity, and illustrate the statistical and computational performance of our methodology, and its sensitivity to prior assumptions, on both these data and a
variety of simulated set-ups.
Multilevel adaptive sampling for inverse problems Dave Higdon, Dave Moulton and Todd Graves
Over the past few decades, efficient and robust multilevel solvers have been developed for a variety of applications which range from medical tomography to flow
in porous media. Recent success of these multilevel solvers is due to the development of general multiscale concepts such as operator-induced variational coarsening.
This approach implicitly treats the multiscale aspects of the fine-scale model in its
generation of successively coarser representations.
Clearly such solvers can be used as a “black box” (within an MCMC scheme, for
example) for inferring unknown parameters or initial conditions in inverse problems.
While computational efficiency has been the primary motivation for the development
of such multilevel solvers, it is hard to resist the temptation of prying into these
solvers so that the coarser representations can be used to help guide the posterior
sampling. In this talk we explore sequential and Markov chain Monte Carlo methods
for exploiting the implicit coarsened representations within a multilevel solver to
speed up posterior sampling.
Pooling information across matrix decompositions Peter Hoff
One approach to summarizing relational and other matrix-valued data is with low-
rank matrix approximations. For example, the variation among the entries of a
symmetric $n \times n$ data matrix $Y$ is often expressed with the eigenvalue decomposition
model $Y \sim U \Lambda U^T + E$, where $U$ is an $n \times k$ orthonormal matrix and $\Lambda$ is a diagonal
matrix. In this work we consider pooling information across multiple such data
matrices $Y^{(1)}, \ldots, Y^{(p)}$ for situations in which the common cells across matrices
$\{y^{(1)}_{i,j}, \ldots, y^{(p)}_{i,j}\}$ represent repeated or multivariate measurements under a common
set of conditions $\{i, j\}$. This is accomplished by estimating the parameters in a
model for the variability among the orthonormal eigenvector matrices $U^{(1)}, \ldots, U^{(p)}$
of the p data matrices. The model is based on a variation of the matrix Langevin
distribution, for which estimation is accomplished primarily with Gibbs sampling.
The methodology is applied to the analysis of multivariate relational data where
$\{y^{(1)}_{i,j}, \ldots, y^{(p)}_{i,j}\}$ represent multiple dyadic relations between nodes $i$ and $j$.
Inferring regions of copy number variation (CNVs) in
human DNA from SNP genotyping data using objective Bayesian signal processing methods Chris Holmes and Chris Yau
Recent discoveries suggest that regions of copy number variation (CNVs) in the human genome are much more widespread than previously thought. A CNV is defined as a segment of DNA > 1 kb that is present at a variable copy number in comparison
to a reference genome. It is believed that up to 10% of the human genome may be
copy number variable (contributing to around 10% of genetical transcription variation) and copy number polymorphisms have been linked to a number of diseases. In recent work we have developed an objective Bayesian hidden Markov model to detect regions of copy number variation from genome-wide single nucleotide polymorphism (SNP) genotyping data (of around 500,000 SNPs). In our model the hidden states
refer to unobserved copy number variants at a locus (SNP) and the transitions between states capture the persistence within CNV states across chromosomal regions.
In certain samples, such as from tumour biopsies, tissue heterogeneity introduces
additional complications requiring a mixture deconvolution. Predictions from the
model have been experimentally validated on a number of samples. We report the
findings from a number of large studies including 1500 samples from the 1958 UK
birth cohort and a genome-wide association study of childhood malaria risk in an
African population. This is joint work with the Wellcome Trust Centre for Human
Genetics.
Flexible Multivariate Regression Density Estimation Robert Kohn
We develop flexible multivariate regression density estimators and use them for
modeling multivariate continuous and discrete data.
A Flexible Approach to Parametric Inference in
Nonlinear Time Series Models Gary Koop and Simon Potter
Many structural break and regime-switching models have been used with macroeconomic and financial data. In this paper, we develop an extremely flexible parametric
model which can accommodate virtually any of these specifications - and does so
in a simple way which allows for straightforward Bayesian inference. The basic
idea underlying our model is that it adds two simple concepts to a standard state
space framework. These ideas are ordering and distance. By ordering the data in
various ways, we can accommodate a wide variety of nonlinear time series models, including those with regime-switching and structural breaks. By allowing the
state equation variances to depend on the distance between observations, the parameters can evolve in a wide variety of ways, allowing for everything from models
exhibiting abrupt change (e.g. threshold autoregressive models or standard structural break models) to those which allow for a gradual evolution of parameters (e.g.
smooth transition autoregressive models or time varying parameter models). We
show how our model will (approximately) nest virtually every popular model in the
regime-switching and structural break literatures. Bayesian econometric methods
for inference in this model are developed. Because we stay within a state space
framework, these methods are relatively straightforward, drawing on the existing
literature. We use artificial data to show the advantages of our approach, before
providing two empirical illustrations involving the modeling of real GDP growth.
Treed Gaussian Processes and Pattern Search
Optimization Herbie Lee and Matt Taddy
Asynchronous Parallel Pattern Search is a derivative-free numerical optimization
algorithm. We show how to combine pattern search with Treed Gaussian Processes
(TGP) to produce a more robust method for maximization or minimization by incorporating
information about our uncertainty. The TGP emulator can also provide
sensitivity analysis and a probabilistic analysis of convergence. Our approach is particularly
useful for physical experiments or complex computer modeling problems,
where each new datapoint or function evaluation may be expensive to obtain. We
demonstrate results on both synthetic and real data.
NonParametric Measurement of A Time-Varying
Volatility Risk Premium: A Bayesian Particle Filtering Approach Nan Qu, Gael M. Martin and Catherine S. Forbes
Non-parametric measures of spot and option-implied volatility are used to extract
real-time estimates of a time-varying volatility risk premium. The inferential method
is Bayesian, with particle filtering used to cater for the non-linearities in the state
space specification.
Fully Non-parametric Bayesian Ensemble Modelling Hugh A. Chipman, Edward I. George and Robert E. McCulloch
Suppose we would like to learn the relationship between y and a high dimensional
vector x based on a limited number of observations. In "BART: Bayesian Additive
Regression Trees" (2006), Chipman, George and McCulloch develop a fully Bayesian
approach for discovering and drawing inference about an unknown function ƒ based
only on assuming y = ƒ(x)+ε with iid normal errors. In the spirit of "ensemble models", BART approximates ƒ by a sum of many simple regression tree models, each
of which is kept small with a strong regularization prior. In terms of out-of-sample
prediction, BART’s performance compares favorably with competing methods. Posterior evaluation by a well-mixing MCMC algorithm allows for the natural Bayesian
quantification of uncertainty about ƒ. Further, the modular nature of BART facilitates its embedding within larger hierarchical models (for example, see Zhang, Shih
and Mueller 2006).
In this work, we further extend the flexibility of the BART approach by relaxing
the simple iid normal error specification and replacing it with a Dirichlet process
model for the errors. Various specification and prior choices are explored. The costs
as well as the benefits of this more flexible approach are illustrated.
Adaptive Multiple Importance Sampling: AMIS Algorithm Antonietta Mira
The strength of AMIS resides in its completely adaptive and multi-purpose nature:
no tuning parameter is needed and the same algorithm is proved to perform well on
very diverse high-dimensional target distributions (from banana shaped to mixtures
with very well separated modes and tunnels).
The algorithm has both a temporal, T, and a population, N, dimension and consists
of 3 steps: initialization, adaptation and clustering. The AMIS estimator is obtained
by recycling the N × T particles generated in all 3 steps, with the corresponding
importance weights. What drives AMIS, and ensures unbiasedness, is
importance-sampling-type reasoning.
Variance reduction is achieved by Rao-Blackwellisation and by a novel way of combining
different importance distributions by a deterministic mixture and an actualization
process performed on the weights. As a byproduct of these processes, all
particles are on the same “weighting scale” and can be easily and efficiently combined
to get the final AMIS estimator.
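The deterministic-mixture weighting mentioned above can be sketched as follows. This is a minimal illustration of the general idea only, not the AMIS algorithm itself: the bimodal target, the three fixed normal proposals and the function names are all assumptions made for the example, and no adaptation or Rao-Blackwellisation is performed.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def target_pdf(x):
    """Hypothetical target: equal mixture of N(-3, 1) and N(3, 1)."""
    return 0.5 * normal_pdf(x, -3.0, 1.0) + 0.5 * normal_pdf(x, 3.0, 1.0)

def deterministic_mixture_estimate(means, sds, n_per_proposal, f, rng):
    """Pool draws from several proposals and weight each by target / mixture."""
    draws = np.concatenate(
        [rng.normal(m, s, size=n_per_proposal) for m, s in zip(means, sds)]
    )
    # Every draw is weighted against the equal-weight mixture of ALL proposals,
    # not only the proposal that generated it, so all weights share one scale.
    mixture = np.mean([normal_pdf(draws, m, s) for m, s in zip(means, sds)], axis=0)
    weights = target_pdf(draws) / mixture
    return np.mean(weights * f(draws)), weights

rng = np.random.default_rng(0)
estimate, weights = deterministic_mixture_estimate(
    means=[-3.0, 0.0, 3.0], sds=[2.0, 4.0, 2.0], n_per_proposal=20000,
    f=lambda x: x ** 2, rng=rng,
)
```

Under this assumed target the second moment is 10, so `estimate` should land close to that value, and the weights should average close to 1.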
In the first step of the algorithm a good scale parameter for the initial importance distribution
is found using the effective sample size of the importance weights. Global
adaptation of the mean and variance of the importance distribution to the corresponding
target parameters (2nd step) is combined with local adaptation achieved
via a Rao-Blackwellised clustering algorithm (3rd step).
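As a rough illustration of using the effective sample size (ESS) to pick an initial scale, the sketch below compares a few candidate scales for a zero-mean normal proposal. The grid of candidate scales, the normal target and the helper names are assumptions for the example; the abstract does not say how the search is carried out.

```python
import numpy as np

def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum w^2, computed stably from log weights."""
    w = np.exp(log_weights - np.max(log_weights))
    return w.sum() ** 2 / np.sum(w ** 2)

def log_target(x):
    """Hypothetical unnormalised target: N(0, 5^2)."""
    return -0.5 * (x / 5.0) ** 2

def log_proposal(x, scale):
    """Log density of N(0, scale^2)."""
    return -0.5 * (x / scale) ** 2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)

rng = np.random.default_rng(1)
candidate_scales = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]  # assumed grid
ess_by_scale = {}
for scale in candidate_scales:
    x = rng.normal(0.0, scale, size=5000)
    ess_by_scale[scale] = effective_sample_size(log_target(x) - log_proposal(x, scale))

# Keep the scale whose importance weights are the most even.
best_scale = max(ess_by_scale, key=ess_by_scale.get)
```

When the proposal scale matches the target scale the weights are constant and the ESS equals the number of draws; scales that are too small or too large give a lower ESS.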
In the talk the AMIS algorithm will be presented, together with examples of its
application and performance evaluations/comparisons.
This is joint work with Christian Robert and Jean-Michel Marin.
Old and new auxiliary variable methods for
Metropolis-Hastings algorithms for distributions with intractable normalizing constants Jesper Møller and Robert Reeves
Suppose that we want to simulate from the posterior density for the parameter θ
given data y, with prior p(θ) and likelihood fθ(y) = hθ(y)/cθ, where the normalizing
constant cθ is intractable. Thus the posterior density
p(θ|y) ∝ p(θ)hθ(y)/cθ
is not computable. In an ordinary Metropolis-Hastings algorithm for drawing samples from the posterior distribution the acceptance probability depends on the “unknown” ratio of normalizing constants cθ/cθ'. Most methods to date have used various approximations to estimate or eliminate such ratios of normalizing constants.
In Møller et al. (Biometrika, 2006, pages 451-458) we present a new Metropolis-
Hastings algorithm for drawing samples from the posterior distribution without approximation. It is called the auxiliary variable method, since we extend the posterior
distribution by introducing a certain auxiliary variable so that the acceptance probability can be computed. The auxiliary variable method is a nice application example
of perfect simulation algorithms, and it has e.g. been used for Bayesian analysis of
Gibbs models (Markov random fields and Markov point processes). Moreover, the
auxiliary variable method has more recently been modified and extended to more
efficient MCMC algorithms.
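The cancellation of the normalizing constants can be written out as below. The notation beyond the abstract is assumed for illustration: x denotes the auxiliary variable (defined on the same space as y), f(x | θ, y) its chosen conditional density, and q(θ' | θ) the proposal for θ; the auxiliary proposal x' is taken to be an exact draw from h_θ'(·)/c_θ', which is where perfect simulation enters.

```latex
% Extended target on (\theta, x):
p(\theta, x \mid y) \;\propto\; f(x \mid \theta, y)\, p(\theta)\, \frac{h_\theta(y)}{c_\theta}

% Proposal: \theta' \sim q(\theta' \mid \theta), then x' \sim h_{\theta'}(\cdot)/c_{\theta'}.
% Hastings ratio (target ratio times reverse/forward proposal ratio):
H = \frac{f(x' \mid \theta', y)\, p(\theta')\, h_{\theta'}(y)/c_{\theta'}}
         {f(x \mid \theta, y)\, p(\theta)\, h_{\theta}(y)/c_{\theta}}
    \cdot
    \frac{q(\theta \mid \theta')\, h_{\theta}(x)/c_{\theta}}
         {q(\theta' \mid \theta)\, h_{\theta'}(x')/c_{\theta'}}
  = \frac{f(x' \mid \theta', y)\, p(\theta')\, h_{\theta'}(y)\, q(\theta \mid \theta')\, h_{\theta}(x)}
         {f(x \mid \theta, y)\, p(\theta)\, h_{\theta}(y)\, q(\theta' \mid \theta)\, h_{\theta'}(x')}
```

Each of cθ and cθ' appears once in the numerator and once in the denominator, so the ratio involves only the computable functions h, f, p and q.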
Bayesian Inference for High Dimensional Functional and
Image Data using Functional Mixed Models Jeffrey S. Morris
High dimensional, irregular functional data are increasingly encountered in scientific research. For example, MALDI-MS yields proteomics data consisting of one-
dimensional spectra with many peaks, 2D gel electrophoresis and LC-MS yield two-
dimensional images with spots that correspond to peptides present in the sample,
and array CGH or SNP chip arrays yield one-dimensional functions of copy number
information along the genome. In this talk, I will discuss how to identify candidate biomarkers for various types of proteomic and genomic data using Bayesian
wavelet-based functional mixed models. This approach models the functions in their
entirety, so it avoids reliance on peak or spot detection methods. The flexibility of this
framework in modeling nonparametric fixed and random effect functions enables it
to model the effects of multiple factors simultaneously, allowing one to perform inference on multiple factors of interest using the same model fit, while adjusting for
clinical or experimental covariates that may affect both the intensities and locations
of the peaks and spots in the data. I will demonstrate how to identify regions of
the functions that are differentially expressed across experimental conditions, in a
way that takes both statistical and clinical significance into account and controls
the Bayesian false discovery rate to a pre-specified level. Time allowing, I will also
demonstrate how to use this framework as the basis for classifying future samples
based on their proteomic and genomic profiles in a way that can also combine information across multiple sources of data, including proteomic, genomic, and clinical.
These methods will be applied to a series of proteomic and genomic data sets from
cancer-related studies.
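One common way to control a Bayesian false discovery rate at a pre-specified level can be sketched as follows. It assumes that a posterior probability of a true effect is already available at each location of the function; the function name, the example probabilities and the level 0.10 are hypothetical, and this is not claimed to be the exact rule used in the talk.

```python
import numpy as np

def bayesian_fdr_flags(post_prob, alpha):
    """Flag the largest set whose expected false discovery proportion is <= alpha.

    post_prob[i] is the posterior probability that location i is a true discovery.
    """
    post_prob = np.asarray(post_prob, dtype=float)
    order = np.argsort(-post_prob)               # most probable discoveries first
    local_fdr = 1.0 - post_prob[order]           # posterior probability of a false call
    # Average false-call probability among the top k locations, for k = 1, 2, ...
    running_fdr = np.cumsum(local_fdr) / np.arange(1, post_prob.size + 1)
    n_flagged = int(np.sum(running_fdr <= alpha))
    flags = np.zeros(post_prob.size, dtype=bool)
    flags[order[:n_flagged]] = True
    return flags

probs = [0.99, 0.20, 0.95, 0.80, 0.60, 0.05]     # assumed posterior probabilities
flags = bayesian_fdr_flags(probs, alpha=0.10)
```

Because the locations are sorted by decreasing posterior probability, the running average is non-decreasing, so counting the entries at or below `alpha` gives the largest admissible set.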
Sensitivity of inference in Bayesian networks to
assumptions about founders in forensic genetics Peter Green and Julia Mortera
Bayesian networks, with inferences computed by probability propagation methods,
offer an appealing practical modelling framework for structured systems involving
discrete variables in numerous domains, including forensic genetics. However, when
standard assumptions are violated (for example, when allele frequencies are unknown, there is identity by descent, or the population is heterogeneous), dependence
is generated among founding genes, which makes exact calculation of conditional
probabilities by propagation methods less straightforward. Here we illustrate different methodologies for dealing with these problems by assessing sensitivity to
assumptions about founders in forensic genetics problems. These methods comprise
constrained steepest descent, linear fractional programming and representing dependence by structure. We illustrate these methodologies on several real case-work
forensic genetics examples comprising criminal identification, simple and complex
disputed paternity and DNA mixtures.
A Bayesian approach to model a multi-state Markov
model from interval-censored data Paul J Mostert and Chris J.B. Muller
In studies of disease stages and their relation to survival, data are usually obtained
at infrequent time points during follow-up. At these points, the clinical status of a
patient can be assessed and as a consequence be distinctly categorised using other
covariates and in many cases a subjective clinical classification by a medical researcher. In its simplest form these categories can be dead or alive or even extended
to, for example, stage I, II, III or IV of HIV/AIDS by using clinical markers. Actual
changes of the clinical stages occur normally between two successive follow-up times.
Not taking this censoring into consideration in model building
may lead to severe over- or underestimation of the actual time spent in the different
stages. The time patients stay in the different stages of a disease can be an indication of the effectiveness of a drug in stemming the spread of the disease. A Markov
model is assumed to assess the rate at which patients move from one stage to another, given a set of covariates during follow-ups. A Bayesian approach is followed
to model the transition states between actual time points, using a Dirichlet process. In this approach, the probability element corresponding
to each patient’s contribution to the likelihood is altered according to a Dirichlet
process prior. Different approaches to altering the probability contributions are
proposed and compared by means of posterior analyses of the transition rates in a
non-parametric setting. A paediatric HIV dataset obtained from a large academic
hospital in South Africa is used to illustrate the results.
Bayesian Synthesis Q. Yu, S.N. MacEachern and M. Peruggia
Bayesian model averaging enables one to combine the disparate predictions of a
number of models in a coherent fashion, leading to superior predictive performance.
The improvement in performance arises from averaging models that make different predictions. In this work, we tap into perhaps the biggest driver of different
predictions—different analysts—in order to gain the full benefits of model averaging. In a standard implementation of our method, several data analysts work
independently on portions of a data set, eliciting separate models which are eventually updated and combined through Bayesian synthesis. The methodology helps to
alleviate concerns about the sizeable gap between the foundational underpinnings
of the Bayesian paradigm and the practice of Bayesian statistics.
We provide theoretical results that characterize general conditions under which data-
splitting results in improved estimation which, in turn, carries over to improved
prediction. These results suggest general principles of good modeling practice. In
experimental work we show that the method has predictive performance superior
to that of many automatic modeling techniques, including AIC, BIC, Smoothing
Splines, CART, Bagged CART, Bayes CART, BMA, BART and LARS. Compared
to competing modeling methods, the data-splitting approach 1) exhibits superior
predictive performance for real data sets and simulations; 2) makes more efficient
use of human knowledge; 3) selects sparser models with better explanatory ability
and 4) avoids multiple uses of the data in the Bayesian framework.
Bayesian Inference of the surviving number of motor
neurons for Motor Neuron Disease patients A.N. Pettitt, P.G. Ridall and Clare McGrory
This talk describes the challenges of inference for the remaining number of motor
neurons for sufferers of neurological diseases such as Motor Neuron Disease. In Ridall
et al (2006, 2007) we describe a stochastic model for the firing of motor neurons in a
muscle of the leg or arm when the muscle nerve is subject to an electrical stimulus.
For a series of increasing electrical stimuli to the nerve the neuromuscular response
is measured as a series of electrical currents, giving the so-called response curve,
where the amplitude of the response current is the summation of the output currents
of units which are firing as a result of the stimulus. Units can fire probabilistically, or
always or never for a given input stimulus. The consequent response can be modelled
(in a simplified form) by a so-called mixture of mixtures given by the distribution
of Z1X1 + Z2X2 + ... + ZNXN, where the Z are independent Bernoulli random
variables with means depending on the applied stimulus and the X are independent
normal random variables with differing means not dependent on the stimulus. The
main focus is on inference for N, the unknown number of motor units or number of
components in the mixture of mixtures. In Ridall et al (2007) we used RJMCMC to
make inferences for N. This talk will consider approaches to improve the RJMCMC
algorithm, how the RJMCMC output from a sequence of studies on a patient can
be used to make inferences about the nature of the underlying mechanism of neuron
death and score between mechanisms, and how sequential Monte Carlo for static
problems can be utilised to estimate the value of the unknown N.
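The simplified response model Z1X1 + ... + ZNXN can be simulated with a few lines of code. The sketch below is illustrative only: the logistic dependence of the firing probability on the stimulus, the thresholds, the unit means and the common standard deviation are all assumed values, not those of Ridall et al.

```python
import numpy as np

def simulate_response(stimulus, thresholds, unit_means, unit_sd, slope, rng):
    """Draw one response sum_i Z_i * X_i for a given stimulus level."""
    # Assumed link: unit i fires with probability logistic(slope * (stimulus - threshold_i)).
    fire_prob = 1.0 / (1.0 + np.exp(-slope * (stimulus - thresholds)))
    z = rng.random(thresholds.size) < fire_prob      # Bernoulli firing indicators
    x = rng.normal(unit_means, unit_sd)              # unit outputs, independent of the stimulus
    return float(np.sum(z * x))

rng = np.random.default_rng(2)
n_units = 5                                          # the unknown N, fixed here for simulation
thresholds = np.linspace(1.0, 5.0, n_units)
unit_means = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
stimuli = np.linspace(0.0, 6.0, 25)
# Average of 200 simulated responses at each stimulus level: a smoothed response curve.
curve = np.array([
    np.mean([simulate_response(s, thresholds, unit_means, 0.1, 8.0, rng) for _ in range(200)])
    for s in stimuli
])
```

At low stimulus almost no unit fires and the curve is near zero; at high stimulus all units fire and the curve approaches the sum of the unit means, with steps in between that carry the information about N.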
Particle Filtering and Learning Nick Polson
Particle filtering and learning algorithms for general state space models are developed.
Our approach exactly samples from a particle approximation to the joint posterior distribution of both parameters and latent states.
We illustrate the efficiency of our approach in a number of models: robust filtering including quantile models, t-errors,
Cauchy and Meridian errors; stochastic volatility jump diffusions. Robustness in both the observation and state equation is easily accommodated together with parameter learning. An application to stochastic volatility jump models for stock index return data is described.
Structured AR multi-processes for detecting cognitive
fatigue from multiple brain signals Raquel Prado
Mental fatigue is one of the main causes of human performance failures, leading
to accidents in vehicle operation, air traffic control and space missions. Therefore,
automatic detection of early signs of mental fatigue is key for increasing safety and
human performance in many scenarios.
Electroencephalograms (EEGs) are considered the most informative signals for monitoring mental fatigue among several other physiological and behavioral measures
available. We analyze multiple EEG signals recorded in subjects who performed
continuous mental arithmetic for a long period of time, which led to severe cognitive fatigue. In particular, we analyze the signals using a multi-process approach in
which each of the processes is an autoregression. We impose structured prior distributions that take into account the latent components underlying each autoregressive
process. These priors allow us to incorporate relevant information about the components that may characterize various mental states of alertness. We discuss issues
related to on-line filtering and automatic detection of fatigue from multi-channel
data.
A Bayesian nonparametric approach for analysing and
testing clustering structures Antonio Lijoi, Ramsés H. Mena, Igor Prünster and Stephen G. Walker
Many applications require a deep understanding of the clustering mechanism that
generates the observed data. The two parameter Poisson–Dirichlet process and
more general Gibbs–type priors are natural candidates for modelling data arising
from discrete distributions. Here we make use of such priors and analyze their posterior
behaviour in some detail. In particular, we propose methods for prediction
and testing in order to assess the clustering structure featured by the data. The
methodology is then applied to Expressed Sequence Tags (ESTs) data in genomics.
Indeed, when studying EST data one is typically interested in evaluating the redundancy
of the corresponding cDNA library and in comparing different libraries on
the basis of their ability to generate new distinct genes. Our proposal has appealing
properties over frequentist nonparametric methods, which become unstable when
prediction is required for large future samples.
Bayesian analysis of high dimensional data Sylvia Richardson, Leonardo Bottolo, Peter Green
In parallel to fast-evolving technologies that give rise to high dimensional data in
many set-ups, there is a lot of interest in searching for sparse structure in such
high dimensional data sets. In the first part of the talk, I shall discuss models and
algorithms based on parallel tempering and Evolutionary Monte Carlo for Bayesian
variable selection in the large p, small n paradigm. In the second part, I shall focus
attention on dimension reduction through sparse latent factor models. Models and
methods will be illustrated by examples from the field of genomics.
Non-parametric dynamic modelling of biological time
series Fabio Rigat
This talk illustrates the theory and application of a novel sequential non-parametric
method for estimating the dynamics of time series models. This method provides
a robust alternative to Bayesian state-space models which does not involve any
parametric assumption on the form of the evolution of a model’s parameters. Their
dynamics are assessed within a hypothesis testing framework as a change-point
problem. The Kullback-Leibler divergence between the posterior distributions of
different sets of data under the same model is proposed as a test statistic. Posterior
simulation is used to approximate the value of the KL divergence and its critical
region under the null hypothesis of no change.
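The Monte Carlo approximation of a Kullback-Leibler divergence between two posteriors can be sketched on a toy conjugate model. The Bernoulli model with a uniform prior, the data summaries and the function names below are assumptions chosen so that both posteriors are Beta distributions with known densities; the talk's models are more general, and the calibration of the critical region is not shown.

```python
import math
import numpy as np

def beta_logpdf(theta, a, b):
    """Log density of Beta(a, b) at theta (array-valued theta allowed)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * np.log(theta) + (b - 1.0) * np.log1p(-theta)

def kl_between_posteriors(successes_1, n_1, successes_2, n_2, n_draws, rng):
    """Monte Carlo estimate of KL(p1 || p2) for two Beta posteriors under a uniform prior."""
    a1, b1 = 1 + successes_1, 1 + n_1 - successes_1
    a2, b2 = 1 + successes_2, 1 + n_2 - successes_2
    theta = rng.beta(a1, b1, size=n_draws)           # draws from the first posterior
    # Average log density ratio over draws from the first posterior.
    return float(np.mean(beta_logpdf(theta, a1, b1) - beta_logpdf(theta, a2, b2)))

rng = np.random.default_rng(3)
kl_same = kl_between_posteriors(30, 100, 30, 100, 20000, rng)   # identical data summaries
kl_shift = kl_between_posteriors(30, 100, 60, 100, 20000, rng)  # success rate changes
```

A divergence near zero is consistent with no change between the two data sets, while a large value points to a change-point.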
The main motivation of this work is to estimate the molecular and functional dynamics of biological systems using high-throughput technologies such as microarrays
and multi-electrode arrays. In this context, robust dynamic stochastic models are
fundamental tools because little is known about the mechanisms regulating the evolution of many biological processes. This talk focuses mainly on the estimation of
neuronal functional dynamics using multiple spike trains recorded in vitro and in vivo. The neuronal dynamics are shown to explain some aspects of a simple decision
process and the onset of movements.
Multivariate emulation of high-dimensional model
output Jonathan Rougier
“Emulation” is the statistical modelling of a complex deterministic function, usually
a computer code. When the code simulates a physical process, the outputs are often
high-dimensional, taking the form of collections of values of the same type (e.g.,
sea-surface temperatures indexed by space and time). Typically, a given collection
is smooth, and its components ought to be modelled jointly. However, assimilating
the large amount of data (the product of the number of model evaluations and the
number of outputs) is computationally challenging. A new approach, the “outer-
product emulator”, solves this problem. I describe the outer-product emulator, and
illustrate its use with a climate model.
Combining expert opinions in prior elicitation Judith Rousseau
We consider the problem of combining expert opinions in the process of eliciting a
prior distribution. The idea is to use a hierarchical
parametric model. Each expert has their own hyperparameter, and the hyperparameters
are linked using a parametric model constructed using extra knowledge on
the experts and their opinions. Two examples are considered.
Discrete multivariate mixture distributions for spatial
models Alexandra M. Schmidt and Jennifer A. Hoeting
In this talk we propose models for multivariate count data which are spatially correlated and exhibit over-dispersion. In other words, the model must capture the
covariance structure within and among locations. There are different ways, in the
literature, of defining a multivariate Poisson distribution. We discuss these different
approaches and consider two situations: a multivariate Poisson hierarchical model
with spatial random effects, and a multivariate negative binomial distribution, which
can also be defined in different ways, based on the multivariate Poisson. Inference
is performed under the Bayesian paradigm. As the posterior distribution does not
have a closed form, MCMC techniques are used to obtain a sample from the posterior. We discuss the properties of our proposed models on artificial data sets and
also on a real application.
Think locally, act globally:
Combining the best features of particle methods and MCMC Michael Johannes, Nicholas Polson, and Steven L. Scott
This talk describes an algorithm for simulating from an arbitrary probability distribution π by combining features from Markov chain Monte Carlo (MCMC) and particle based methods. MCMC explores the sample space of π by a series of local moves.
Particle methods simulate from π by first sampling many particles from a proposal
distribution, then resampling the particles with appropriately chosen weights. Particle methods are limited by the need to choose an appropriate proposal distribution,
but they have attractive global search properties that MCMC lacks (they have no
difficulty locating multiple modes, for example).
Our algorithm, which we call Particle Posterior Sampling, runs n Markov chains in
parallel. The state of each Markov chain at time t can be thought of as a particle, and
the collection of particles defines an empirical distribution which can be resampled.
The result is an algorithm where particles move quickly to high probability regions,
then use MCMC’s local search capabilities to quickly explore those regions. The
resulting trade-off is that if n is large then t can be kept small (e.g. single digits).
The worst-case behavior of the algorithm is that of n parallel Markov chains with a
common starting point near the mode.
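The general idea of pairing a global resampling step with short local MCMC runs can be sketched as below. This is a generic resample-then-move illustration, not the Particle Posterior Sampling algorithm itself: the bimodal target, the wide normal proposal, the single resampling step and the random-walk Metropolis moves are all assumptions made for the example.

```python
import numpy as np

def log_target(x):
    """Hypothetical bimodal target: equal mixture of N(-4, 1) and N(4, 1), unnormalised."""
    return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

def resample_then_move(n_particles, n_steps, proposal_sd, step_sd, rng):
    """Importance-resample particles from a wide proposal, then run short Metropolis chains."""
    x = rng.normal(0.0, proposal_sd, size=n_particles)
    # Importance weights target / proposal (unnormalised, on the log scale).
    log_w = log_target(x) + 0.5 * (x / proposal_sd) ** 2
    w = np.exp(log_w - log_w.max())
    x = rng.choice(x, size=n_particles, replace=True, p=w / w.sum())
    # Each particle is now the state of its own random-walk Metropolis chain.
    for _ in range(n_steps):
        proposal = x + rng.normal(0.0, step_sd, size=n_particles)
        accept = np.log(rng.random(n_particles)) < log_target(proposal) - log_target(x)
        x = np.where(accept, proposal, x)
    return x

rng = np.random.default_rng(4)
particles = resample_then_move(n_particles=5000, n_steps=5, proposal_sd=6.0, step_sd=1.0, rng=rng)
share_right_mode = float(np.mean(particles > 0.0))
```

With many particles, only a handful of Metropolis steps per chain are run, yet both modes are populated in roughly equal shares, which a single random-walk chain with this step size would struggle to achieve.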
We illustrate the algorithm on several canonical problems where MCMC is known
to struggle. These include carefully chosen examples from probit regression and
Gaussian linear models with “spike and slab” prior distributions, as well as finite
mixture models with an unknown number of components.
Improving the efficiency of likelihood-free computation Scott A. Sisson
In recent years there has been considerable interest in Bayesian applications
where the likelihood function is computationally intractable. Most of these applications, and therefore the methods development, have occurred outside of mainstream
Statistics publications, primarily in population genetics and epidemiology.
In this presentation we outline the basic idea behind “likelihood-free” Bayesian
computation. Following this setup we demonstrate that while certain method specifications are arbitrary in theory (in that the correct model is realised asymptotically
regardless of their specification), in practice they can have an overwhelming influence on the efficiency of the computation. We propose simple methods to automate
model setup and improve the efficiency of its implementation, and illustrate them
through real analyses.
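The basic likelihood-free idea, and the influence of one tuning choice on efficiency, can be sketched with a rejection sampler. The normal model, the uniform prior, the sample mean as summary statistic and the two tolerance values are assumptions for the example; the talk's proposals for automating such choices are not reproduced here.

```python
import numpy as np

def abc_rejection(observed_mean, n_obs, n_sims, tolerance, rng):
    """Keep prior draws whose simulated summary lands within `tolerance` of the observed one."""
    theta = rng.uniform(-10.0, 10.0, size=n_sims)    # assumed prior: Uniform(-10, 10)
    # Simulate a data set of size n_obs per draw from N(theta, 1) and summarise by its mean;
    # the likelihood itself is never evaluated.
    simulated_mean = rng.normal(theta[:, None], 1.0, size=(n_sims, n_obs)).mean(axis=1)
    keep = np.abs(simulated_mean - observed_mean) <= tolerance
    return theta[keep]

rng = np.random.default_rng(5)
observed_mean = 2.0
loose = abc_rejection(observed_mean, n_obs=20, n_sims=50000, tolerance=2.0, rng=rng)
tight = abc_rejection(observed_mean, n_obs=20, n_sims=50000, tolerance=0.1, rng=rng)
```

The tight tolerance gives a sample concentrated near the observed mean but accepts only a small fraction of the simulations, while the loose tolerance accepts many more draws at the cost of a much more diffuse approximation.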
Multivariate GARCH Models with Correlation Clustering Mike K.P. So and Iris W.H. Yip
This paper proposes a clustered correlation multivariate GARCH model (CC-MGARCH)
which allows the conditional correlations to form clusters where each cluster follows
the same dynamic structure. One main feature of our model is to form a natural
grouping of the correlations among the series while generalizing the time-varying
correlation structure proposed by Tse and Tsui (2002). To estimate our proposed
model, we adopt Markov Chain Monte Carlo methods. Forecasts of volatility and
value at risk can be generated from the predictive distributions. The proposed
methodology is illustrated using simulated and financial market data.
Bayesian Computation, Non-Linear Dynamic Models &
Cellular Networks Mike West, Jarad Niemi, Lingchong You and Chee-Meng Tan
The development of effective methods of Bayesian computation for inference in non-linear dynamic models remains a challenge and one of growing importance in areas
such as finance and systems biology. This talk will discuss advances in developing
and applying Metropolis methods in such problems, where inference involves high-dimensional latent states as well as hyper-parameters. Research in single cell systems
biology, where non-linear dynamic models of cellular networks arise and are copied
over thousands of cells, provides motivating examples and context.