Rank Estimation in Linear Mixture Models
or
Detecting the Number of Signals
or
Estimating the Number of Components
Written by Shira Kritchman & Boaz Nadler, 2008
We are given n independent identically distributed high dimensional observations, i=1, …, n, from the following linear mixture model
|
(1) |
where ui are random variables, vi are fixed unknown vectors in Rp, is the (unknown) noise level and is a multivariate N(0, I) Gaussian random variable with identity covariance matrix. The basic assumption is that the covariance matrix of the random variables ui is full rank, and that the vectors are linearly independent, so that indeed there are K identifiable components.
In the signal array processing literature, the model (1) is typically written in matrix form as
where the matrix contains the p vectors and is typically called the steering matrix, whereas the vector contains the K random signals.
The problem is to estimate the number of factors/components/sources/signals K from the given the n observations , i=1…,n. This problem occurs in a variety of scientific fields, including signal processing, statistics, and chemometrics (see [3,4,5,6,7,8] and many references therein). Determining the number of sources is typically the first step prior to other computationally intensive parametric procedures.
Here we consider estimation of the number of components in the non-parametric setting where no prior information is assumed on the vectors, beyond the fact that they are linearly independent. Under this setting, to estimate K we use only the eigenvalues of the sample covariance matrix
.
Non-Parametric Detection Limit:
As , and assuming the random variables ui have finite fourth moments, the sample covariance matrix converges with probability one to the population covariance matrix. Hence,
where are the signal eigenvalues, and .Thus, as , any signal, however small, can eventually be detected. However, for finite n, the noise eigenvalues are not all equal to but may rather have a significant spread. As shown by various authors, the largest eigenvalue of a pure noise matrix is of the order of . This puts a threshold on the signal strengths identifiable (with high probability), using the sample eigenvalues .
Specifically, we have that [9,10,11]
Theorem: In the joint limit , with , a signal is identifiable by the largest sample
eigenvalue if and only if
.
Hence, any algorithm to detect the number of signals should hopefully have a detection performance close to this threshold.
We propose an estimator for the number of sources based on a sequence of hypothesis tests. For k=1,…,min(p,n)-1 we test
To reject H0, hence accepting the alternative hypothesis H1, the k-th sample eigenvalue must be significantly larger than the quantity expected under H0, where the k-th population eigenvalue is equal to. According to [2], the largest eigenvalue of the covariance matrix of n pure noise observations of length p is asymptotically distributed according to a Tracy-Widom distribution
where for real-valued / complex valued observations, respectively, and explicit expressions for can be found in [1,12].
Hence, our estimator is of the form
where is an estimator of the unknown noise level, is a significance level chosen by the user, and is the resulting threshold obtained by inversion of the Tracy-Widom distribution.
This test is closely related to
One key point in using this approach is that for good detection performance at finite sample sizes, and specifically when n<<p, there is a need for an accurate estimate of the unknown noise level. This is discussed in detail in [1].
The main function that estimates the rank and the noise level is
function [k_hat sigma2_hat ] = KN_rankEst(ell,
n, beta, alpha, max_k)
where
ell = sample eigenvalues, sorted in decreasing order !
n = number of samples,
beta = 1/2 for real/complex valued noise,
alpha = confidence level in percentage points, 0.1 is a good value.
max_k = maximum number of signals
Additional required routines are:
function [mu_np sigma_np] = KN_mu_sigma(n,p,beta); %
computes
function s_Wishart
= KN_s_Wishart(alpha,beta) % computes approximate threshold
function sig_hat_kk = KN_noiseEst(ell,n,kk) %
estimate of noise level, assuming kk signals
All these functions can
be downloaded here: KN_rank_estimation.zip
Demo: function demo_n
produces a performance curve as a function of number of samples n,
similar to the one in the paper.
[1] S. Kritchman and B. Nadler, Determining the number of components in a factor model from limited noisy data, Chem. Int. Lab. Sys. 2008.
[2] I.M. Johnstone, On the distribution of the largest eigenvalues in principal components analysis, Annals of Statistics, vol. 29, pp. 295-327, 2001.
[3] M. Wax and T. Kailath, Detection of signals by information theoretic criteria, IEEE Trans. Sig. Proc., vol.33, pp. 387-392, 1985.
[4] L.A. Goodman and S.J. Haberman, The analysis of nonadditivity in two-way analysis of variance, J. Am. Stat. Assoc., vol. 85, pp. 139-145, 1990.
[5] J.R. Schott, A note on the critical values used in stepwise tests for multiplicative components of interaction, Comm. Stat. Th. Methods, vol. 15, pp. 1561-1570, 1986.
[6] E.R. Malinowski, Determination of the number of factors and the experimental error in a data matrix, Analytical Chemistry, vol. 49, pp. 612-617, 1977.
[7] K. Faber and B.R. Kowalski, Modification of Malinowski's F-test for abstract factor analysis, Journal of Chemometrics, vol. 11, pp. 53-72, 1997.
[8] M. Meloun, J. Capek, P. Miksik, and R.G. Brereton, Critical comparison of methods predicting the number of components in spectroscopic data. Analytica Chimica Acta, vol. 423, pp. 51-68, 2000.
[9] J. Baik and J.W. Silverstein, Eigenvalues of large sample covariance matrices of spiked population models. J. of Mult. Anal., 97(6):1382-1408, 2006.
[10] D. Paul, Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, 17:1617{1642, 2007.
[11] B. Nadler, Finite sample approximation results for principal component analysis: A matrix perturbation approach. to appear, Annals of Statistics, 2008.
[12] N. El-Karoui, A rate of convergence result for the largest eigenvalue of complex white Wishart matrices, Annals of Probability, 34 (6):2077–2117, 2006.