Explanations

Angular data arises in many scientific fields, such as in experimental biology for the study of animal orientation, and in bioinformatics in relation to the protein structure prediction problem.

The statistical analysis of such data requires adapted tools such as $2\pi$-periodic density models. Fernandez-Duran (Biometrics, 60(2), 2004) proposed non-negative trigonometric sums (i.e. non-negative trigonometric polynomials) as a flexible family of circular distributions. However, the coefficients of trigonometric polynomials expressed in the standard basis $1, \cos(x), \sin(x), \dots$ are difficult to interpret, and we do not see how an informative prior could be specified through this parametrization. Moreover, the use of this basis was criticized by Ferreira et al. (Bayesian Analysis, 3(2), 2008) as resulting in a “wiggly approximation, unlikely to be useful in most real applications”.

Trigonometric density basis

Here, we suggest the use of a density basis of the trigonometric polynomials and argue it is well suited to statistical applications. In particular, coefficients of trigonometric densities expressed in this basis possess an intuitive geometric interpretation. Furthermore, we show how “wiggliness” can be precisely controlled using this basis and how another geometric constraint, periodic unimodality, can be enforced [first proposition on the poster]. To ensure that nothing is lost by using this basis, we also show that the whole model consists of precisely all positive trigonometric densities, together with the basis functions [first theorem on the poster].

Prior specification

Priors can be specified on the coefficients of mixtures in our basis and on the degree of the trigonometric polynomials to be used. Through the interpretability of the coefficients and the shape-preserving properties of the basis, different types of prior knowledge may be incorporated. Together with an approximate understanding of mass allocation, these include:

• periodic unimodality;
• bounds on total variation; and
• knowledge of the marginal distributions (in the multivariate case).

The priors obtained this way belong to a well-studied family called sieve priors, which includes the well-known Bernstein-Dirichlet prior; they are finite mixtures with an unknown number of components. Most results and interpretations about the Bernstein-Dirichlet prior (see Petrone & Wasserman (J. R. Stat. Soc. B., 64(1), 2002), Kruijer & Van der Vaart (J. Stat. Plan. Inference, 138(7), 2008), McVinish et al. (Scand. J. Statist., 36(2), 2009)) carry over to the priors we consider, but we do not discuss them further.

Approximation-theoretic framework

Our density models arise as the image of “shape-preserving” linear approximation operators. This approximation-theoretic relationship is used to obtain a notably large prior Kullback-Leibler support and ensures strong posterior consistency at every bounded (not necessarily continuous) density. The result partly relies on known properties of sieve priors, as well as general consistency results (Walker (Ann. Statist., 32(5), 2004)), but extends known results by removing a usual continuity hypothesis on the densities at which consistency is achieved (see Wu & Ghosal (Electron. J. Stat., 2, 2008), Petrone & Veronese (Statistica Sinica, 20, 2010)). For contraction rates, higher order smoothness conditions are usually required (see Shen & Ghosal (Scand. J. Statist., 42(4), 2015)).

For example, consider the prior induced by the random density

$T_n \mathcal{D} := \sum_i \mathcal{D}(R_{i,n}) C_{i,n},\qquad (1)$

where $\mathcal{D}$ is a Dirichlet process, $n$ is distributed on $\mathbb{N}$ and $\{R_{i,n}\}_i$ is a partition of the circle. The posterior is strongly consistent at every bounded density provided that the associated operator

$T_n : f \mapsto \sum_i C_{i,n} \int_{R_{i,n}} f$

is such that $\|T_n f - f\|_\infty \rightarrow 0$ for all continuous $f$.
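A draw from the prior induced by (1) can be sketched as follows. This is only an illustration under stated assumptions: the basis $C_{i,n}(x) \propto (1 + \cos(x - 2\pi i/(2n+1)))^n$ and the partition into $2n+1$ equal arcs are choices made here and are not specified in the text, and the function names are hypothetical. It uses the standard fact that a Dirichlet process with precision $\alpha$ and uniform base measure assigns to $m$ equal arcs a $\text{Dirichlet}(\alpha/m, \dots, \alpha/m)$ vector of masses.

```python
import math
import numpy as np

def basis(n, x):
    """Evaluate 2n+1 assumed basis densities C_{i,n} at the points x.

    Assumption (not from the text): C_{i,n}(x) is proportional to
    (1 + cos(x - 2*pi*i/(2n+1)))**n. Returns shape (2n+1, len(x)).
    """
    m = 2 * n + 1
    centers = 2 * np.pi * np.arange(m) / m
    # The integral of (1 + cos u)^n over one period is 2*pi*binom(2n, n)/2^n.
    norm = 2 * np.pi * math.comb(2 * n, n) / 2**n
    return (1 + np.cos(x[None, :] - centers[:, None])) ** n / norm

def sample_prior_density(x, rng, precision=2.0, mean_degree=15.0):
    """Draw one random density T_n D evaluated on the grid x."""
    n = int(rng.poisson(mean_degree))  # random degree n
    m = 2 * n + 1
    # Masses D(R_{i,n}) of a Dirichlet process (uniform base measure)
    # on m equal arcs are jointly Dirichlet(precision/m, ..., precision/m).
    weights = rng.dirichlet(np.full(m, precision / m))
    return n, weights @ basis(n, x)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2 * np.pi, 2048, endpoint=False)
n, density = sample_prior_density(x, rng)

# A uniform-grid Riemann sum integrates trigonometric polynomials exactly.
mass = density.mean() * 2 * np.pi
```

With a small precision parameter most of the Dirichlet mass falls on a few arcs, so individual prior draws tend to be concentrated around a few basis modes.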

More generally, let $\mathbb{F}$ be a set of bounded densities on some compact metric space $\mathbb{M}$, let $T_n : L^1(\mathbb{M}) \rightarrow L^1(\mathbb{M})$, $n \in \mathbb{N}$, be a sequence of operators that are:

• shape preserving: $T_n$ maps densities to densities and $T_n(\mathbb{F}) \subset \mathbb{F}$; and
• approximating: $\|T_n f - f\|_\infty \rightarrow 0$ for all continuous $f$;

and finally let $\Pi_n$ be priors on $T_n(\mathbb{F})$ with full support. A sieve prior on $\mathbb{F}$ is defined by

$\Pi : A \mapsto \sum_n \rho(n) \Pi_n(A \cap T_n(\mathbb{F}))$.

Theorem.
If $0 < \rho(n) < Ce^{-c d_n}$ for some increasing sequence $d_n$ bounding the dimensions of $T_n (\mathbb{F})$, then the posterior distribution of $\Pi$ is strongly consistent at each density of $\mathbb{F}$.
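
As a worked check of the tail condition, suppose $n \sim \text{Poisson}(\lambda)$ and take $d_n = 2n + 1$, the dimension of the space of trigonometric polynomials of degree at most $n$ (this choice of $d_n$ is an assumption made for illustration). Then $\rho(n) = e^{-\lambda} \lambda^n / n! > 0$ and, for any $c > 0$,

$\rho(n)\, e^{c d_n} = e^{c - \lambda}\, \frac{(\lambda e^{2c})^n}{n!} < e^{c - \lambda + \lambda e^{2c}},$

since $x^n / n! < e^x$ for every $x > 0$. The hypothesis of the theorem therefore holds with $C = e^{c - \lambda + \lambda e^{2c}}$.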

The approximation theory literature is rich in such operators. The theorem shows that they provide strongly consistent priors on arbitrary density spaces simply given priors $\Pi_n$ on $T_n(\mathbb{F})$.
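To make the two operator properties concrete, here is a minimal numerical sketch of one such $T_n$. It assumes the basis $C_{i,n}(x) \propto (1 + \cos(x - 2\pi i/(2n+1)))^n$ and arcs $R_{i,n}$ of equal length centred on the basis modes; these choices and all function names are illustrative assumptions and are not taken from the text. It computes the sup-norm error for a continuous, non-smooth density at a few degrees.

```python
import math
import numpy as np

def basis(n, x):
    """Assumed basis densities C_{i,n}, shape (2n+1, len(x))."""
    m = 2 * n + 1
    centers = 2 * np.pi * np.arange(m) / m
    # The integral of (1 + cos u)^n over one period is 2*pi*binom(2n, n)/2^n.
    norm = 2 * np.pi * math.comb(2 * n, n) / 2**n
    return (1 + np.cos(x[None, :] - centers[:, None])) ** n / norm

def apply_operator(n, f, x, cells=64):
    """Evaluate T_n f(x) = sum_i C_{i,n}(x) * (integral of f over R_{i,n}).

    f must be a 2*pi-periodic callable. Each arc integral is approximated
    with a midpoint rule on `cells` sub-intervals.
    """
    m = 2 * n + 1
    width = 2 * np.pi / m
    centers = width * np.arange(m)
    offsets = ((np.arange(cells) + 0.5) / cells - 0.5) * width
    arc_mass = f(centers[:, None] + offsets[None, :]).mean(axis=1) * width
    return arc_mass @ basis(n, x)

def f(t):
    """A continuous, non-smooth periodic density: |sin(t)| / 4."""
    return np.abs(np.sin(t)) / 4.0

x = np.linspace(0.0, 2 * np.pi, 1024, endpoint=False)
degrees = [5, 20, 80]
errors = [float(np.max(np.abs(apply_operator(n, f, x) - f(x)))) for n in degrees]

# T_n f is again a density: non-negative with total mass close to 1.
t80 = apply_operator(80, f, x)
mass80 = t80.mean() * 2 * np.pi

for n, err in zip(degrees, errors):
    print(f"n = {n:3d}   sup-norm error = {err:.4f}")
```

The errors shrink as the degree grows, but slowly near the kinks of $f$, which is in line with the remark that contraction rates usually require additional smoothness.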

Basic density estimation

A thousand samples (grey histogram) were drawn from the density in orange. The prior is defined by (1) with the Dirichlet process centered on the uniform density and with a precision parameter of 2. The degree $n$ follows a $\text{Poisson}(15)$ distribution. The blue line is the posterior mean, the dark blue shaded region is a 50% pointwise credible region around the median, and the light blue shaded region is a 90% credible region.
