Applications of Geometry in Optimization and Statistical Estimation
Date
2016-01-25
Authors
Maroufy, Vahed
Advisor
Marriott, Paul
Publisher
University of Waterloo
Abstract
Geometric properties of statistical models and their influence on statistical inference and
asymptotic theory reveal the profound relationship between geometry and statistics.
This thesis studies applications of convex and differential geometry to statistical inference, optimization
and modelling. In particular, we investigate how geometric understanding assists statisticians in dealing with non-standard inferential
problems by developing novel theory and designing efficient computational algorithms. The thesis is
organized into six chapters as follows.
Chapter 1 provides an abstract overview of a wide range of
geometric tools, including affine, convex and differential geometry. It also provides the reader with a short
literature review on the applications of geometry in statistical inference and exposes the geometric structure
of commonly used statistical models. The contributions of this thesis are organized in the following four chapters,
each of which is the focus of a submitted paper that is either accepted or under revision.
Chapter 2 introduces a new parametrization of the general family of
mixture models of the exponential family. Despite the flexibility and popularity of mixture models, their associated
parameter spaces are often difficult to represent due to fundamental identification problems. Other related
problems include the difficulty of estimating the number of components, possible unboundedness and non-concavity
of the log-likelihood function, non-finite Fisher information, and boundary problems giving rise to non-standard
analysis. For instance, the order of a finite mixture is not well defined and often cannot be estimated from a finite
sample when components are not well separated, or some are not observed in the sample.
We introduce a novel family of models, called the discrete mixture of local mixture models, which reparametrizes the space
of general mixtures of the exponential family in such a way that the parameters are identifiable and interpretable and,
owing to a tractable geometric structure, the space admits fast computational algorithms. This family
also gives a well-defined characterization of the number-of-components problem. The component densities are
flexible enough for fitting mixture models with unidentifiable components, and our proposed algorithm only includes
the components for which there is enough information in the sample.
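The possible unboundedness of the log-likelihood mentioned above can be seen in a small numerical sketch. It is not taken from the thesis: it assumes a two-component normal mixture, a made-up five-point sample, and hypothetical helper names (`normal_pdf`, `mixture_loglik`). One component is centred on the first observation and its standard deviation is shrunk toward zero.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def mixture_loglik(x, weight, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of a two-component normal mixture (illustrative helper)."""
    dens = weight * normal_pdf(x, mu1, sigma1) + (1.0 - weight) * normal_pdf(x, mu2, sigma2)
    return float(np.sum(np.log(dens)))

# Hypothetical sample, chosen only for illustration.
x = np.array([-1.3, -0.4, 0.1, 0.7, 1.9])

# Centre component 1 on the first observation and let its scale shrink.
sigmas = [1.0, 1e-2, 1e-4, 1e-6]
logliks = [mixture_loglik(x, 0.5, x[0], s, 0.0, 1.0) for s in sigmas]

for s, ll in zip(sigmas, logliks):
    print(f"sigma1 = {s:g}: log-likelihood = {ll:.3f}")
```

In this sketch the log-likelihood keeps increasing as `sigma1` decreases, so no finite maximizer exists along this path. This is one of the non-standard features that motivates a reparametrized model space.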
Chapter 3 uses geometric concepts to characterize the parameter
space of local mixture models (LMM),
introduced in \cite{Marriott2002} as a local approximation to continuous mixture models. Although LMMs are shown
to satisfy nice inferential properties, their parameter space is restricted by two types of boundaries, called
the hard boundary and the soft boundary. The hard boundary guarantees that an LMM is a density function, while the
soft boundary ensures that it behaves locally in a similar way to a mixture model. The boundaries are shown to have particular
geometric structures that can be characterized by the geometry of polytopes, ruled surfaces and developable surfaces. As
working examples, the LMM of a normal model and the LMM of a Poisson distribution are considered. The boundaries described
in this chapter have both discrete aspects (i.e. the ability to be approximated by polytopes) and smooth aspects (i.e.
regions where the boundaries are exactly or approximately smooth).
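As a rough illustration of the hard boundary, the sketch below assumes the usual form of an order-4 LMM of a unit-variance normal: the base density plus a linear combination of its second to fourth derivatives with respect to the mean. This form is not stated in the abstract. Under that assumption the LMM equals the normal density times a polynomial in probabilists' Hermite polynomials, so the hard boundary reduces to non-negativity of that polynomial. The function names are hypothetical, and the soft boundary is not treated.

```python
import numpy as np
from numpy.polynomial import hermite_e as H
from numpy.polynomial import polynomial as P

def lmm_normal_density(x, mu, lam):
    """Assumed order-4 LMM of N(mu, 1): phi(z) * (1 + sum_{j=2..4} lam_j He_j(z))."""
    z = np.asarray(x, dtype=float) - mu
    coef = np.array([1.0, 0.0, *lam])          # coefficients of He_0 .. He_4
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return phi * H.hermeval(z, coef)

def inside_hard_boundary(lam, tol=1e-8):
    """Check numerically that the polynomial factor is non-negative on the real line."""
    coef = np.array([1.0, 0.0, *lam])
    power = P.polytrim(H.herme2poly(coef))     # power-basis coefficients, low to high
    roots = P.polyroots(power)
    real = np.sort(np.real(roots)[np.abs(np.imag(roots)) < tol])
    if real.size == 0:
        pts = np.array([0.0])                  # no sign change anywhere
    else:
        mids = 0.5 * (real[:-1] + real[1:])    # one test point between real roots
        pts = np.concatenate(([real[0] - 1.0], mids, [real[-1] + 1.0]))
    return bool(np.all(P.polyval(pts, power) >= -tol))

print(inside_hard_boundary((0.1, 0.0, 0.05)))   # candidate (lambda_2, lambda_3, lambda_4)
print(inside_hard_boundary((0.0, 0.0, -0.1)))   # negative lambda_4 fails in the tails
```

The check relies on the sign of a polynomial being constant between consecutive real roots, so one test point per interval suffices. A point with an odd-degree polynomial factor, such as non-zero `lambda_3` with `lambda_4 = 0`, always falls outside.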
Chapter 4 uses the model space introduced in Chapter 2
for extending a prior model and defining a perturbation space in the Bayesian sensitivity analysis.
This perturbation space is well-defined, tractable, and consistent with the elicited prior knowledge, three
properties that improve on the methodology in \cite{Gustafson1996}.
We study both local and global sensitivity in conjugate Bayesian models. In the local analysis the worst direction
of sensitivity is obtained by maximizing the directional derivative of a functional between the perturbation
space and the space of posterior expectations. For finding the maximum global sensitivity, however, two criteria
are used: the divergence between posterior predictive distributions and the difference between posterior expectations.
Both local and global analyses lead to optimization problems with a smooth boundary restriction.
Chapter 5 studies Cox's proportional hazards model with an unobserved frailty for
which no specific distribution is assumed. The likelihood function, which has a mixture structure with an
unknown mixing distribution, is approximated by the model introduced in Chapter 2, which is always identifiable and estimable. The nuisance parameters in the approximating
model, which represent the frailty distribution through its moments, lie in a convex space with a
smooth boundary, characterized as a smooth manifold. Using differential geometric tools, a new algorithm
is proposed for maximizing the likelihood function restricted by the smooth yet non-trivial boundary. The
regression coefficients, the parameters of interest, are estimated in a two-step
optimization process, unlike the existing methodology in \cite{Klein1992}, which assumes a gamma frailty distribution
and uses an Expectation-Maximization (EM) approach.
Simulation studies and data examples are also included, illustrating that the new methodology is promising
as it returns small estimation bias; however, it produces a larger standard deviation than the EM method.
The larger standard deviation may result from using no information about the shape of the frailty model,
whereas the EM method assumes the gamma model in advance; there are, however, still ways to improve this methodology.
In addition, the simulation section and data analysis in this chapter are rather incomplete, and more work needs to be done.
Chapter 6 outlines a few topics as future directions and possible extensions
to the methodologies developed in this thesis.
Keywords
Convex and Differential Geometry, Mixture models, Local Mixture Models, Frailty survival models, Bayesian robustness, Computing Boundaries