We describe a procedure for reducing the noise and enhancing the signal contained in an empirical covariance matrix.
Marcenko–Pastur
Signal/Noise Ratio
MV Portfolio
Max SR Portfolio
Targeted Shrinkage
S.Alireza Mousavizade
Sat Mar 05 2022
Detoning & Denoising
Inspiration
Covariance matrices are ubiquitous in finance. We use them to run regressions, evaluate risks, optimize portfolios, run Monte Carlo simulations, discover clusters, reduce the dimensionality of a vector space, and so on. Empirical covariance matrices are computed from a series of observations of a random vector, in order to estimate the linear comovement between the random variables that make up the random vector. Because these observations are finite and nondeterministic, the estimated covariance matrix contains some noise. Empirical covariance matrices built from estimated factors are likewise numerically ill-conditioned, because those factors are themselves estimated from noisy data. Unless we treat this noise, it will propagate into whatever computations we perform on the covariance matrix, potentially rendering the analysis useless.
Here, we describe a method for reducing the noise and boosting the signal contained in an empirical covariance matrix. Throughout this Element, we assume that empirical covariance and correlation matrices have been treated with this procedure.
The Marcenko–Pastur Theorem
Consider a matrix $X$ of independent and identically distributed random observations, of size $T \times N$, drawn from an underlying process with zero mean and variance $\sigma^{2}$. The matrix $C = T^{-1}X'X$ has eigenvalues $\lambda$ that asymptotically converge (as $N \to +\infty$ and $T \to +\infty$ with $1 < T/N < +\infty$) to the Marcenko–Pastur probability density function (PDF),

$$f[\lambda] = \begin{cases} \dfrac{T/N}{2\pi\lambda\sigma^{2}}\sqrt{(\lambda_{+}-\lambda)(\lambda-\lambda_{-})} & \text{if } \lambda \in [\lambda_{-},\lambda_{+}] \\ 0 & \text{if } \lambda \notin [\lambda_{-},\lambda_{+}] \end{cases}$$

where $\lambda_{+} = \sigma^{2}\left(1+\sqrt{N/T}\right)^{2}$ is the largest expected eigenvalue and $\lambda_{-} = \sigma^{2}\left(1-\sqrt{N/T}\right)^{2}$ is the smallest expected eigenvalue. When $\sigma^{2} = 1$, $C$ is the correlation matrix associated with $X$. Code Snippet 1 implements the Marcenko–Pastur PDF.
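As a minimal Julia sketch of that density (the function name mpPDF and the convention q = T/N are assumptions made here for illustration, not the original snippet):

    # Marcenko–Pastur density: var is the variance of the underlying
    # i.i.d. process, q = T/N is the ratio of observations to variables.
    function mpPDF(λ, var, q)
        eMin = var * (1 - sqrt(1 / q))^2 # λ₋, smallest expected eigenvalue
        eMax = var * (1 + sqrt(1 / q))^2 # λ₊, largest expected eigenvalue
        (eMin <= λ <= eMax) || return 0.0 # density vanishes outside [λ₋, λ₊]
        return q / (2π * λ * var) * sqrt((eMax - λ) * (λ - eMin))
    end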
Eigenvalues $\lambda \in [\lambda_{-},\lambda_{+}]$ are consistent with random behavior, and eigenvalues $\lambda \notin [\lambda_{-},\lambda_{+}]$ are consistent with nonrandom behavior. Specifically, we associate eigenvalues $\lambda \in [0,\lambda_{+}]$ with noise. Figure 1 demonstrates how closely the Marcenko–Pastur distribution explains the eigenvalues of a random matrix $X$; a sketch of that comparison follows.
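A sketch of the comparison, assuming the mpPDF helper above; the dimensions T and N are chosen arbitrarily for illustration:

    using Distributions, LinearAlgebra

    T, N = 10_000, 1_000
    X = rand(Normal(), T, N) # i.i.d. observations, zero mean, unit variance
    C = X' * X / T # empirical correlation matrix (σ² = 1)
    λ = eigvals(Symmetric(C)) # empirical eigenvalues
    # evaluate the analytical density over the empirical support;
    # plotting it against a histogram of λ reproduces Figure 1
    grid = range(minimum(λ), maximum(λ); length = 500)
    density = [mpPDF(x, 1.0, T / N) for x in grid]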
Not all eigenvectors of an empirical correlation matrix are necessarily random. Because Code Snippet 2 generates a covariance matrix that is not entirely random, its eigenvalues will only approximate the Marcenko–Pastur PDF. Of the numberColumns random variables that make up the covariance matrix produced by randomCov, only numberFactors carry some signal. To dilute the signal further, we blend that covariance matrix with a purely random one, using a weight alpha (a usage sketch follows Code Snippet 2).
Marcenko–Pastur Distribution Fitting
Here we follow the approach proposed by Laloux et al. (2000). Because random eigenvectors account for only a portion of the variance, we can adjust $\sigma^{2}$ in the preceding equations accordingly. For example, if we assume that the eigenvector associated with the largest eigenvalue is not random, we should replace $\sigma^{2}$ in the preceding equations with $\sigma^{2}(1-\lambda_{+}/N)$. In practice, we can obtain the implied $\sigma^{2}$ by fitting the function $f[\lambda]$ to the empirical distribution of eigenvalues. This yields the variance explained by the random eigenvectors present in the correlation matrix, as well as the cutoff level $\lambda_{+}$, adjusted for the presence of nonrandom eigenvectors.
Code Snippet 3 fits the Marcenko–Pastur PDF to a random covariance matrix that contains signal. The fit seeks the value of $\sigma^{2}$ that minimizes the sum of squared differences between the analytical PDF and the kernel density estimate (KDE) of the observed eigenvalues (for references on KDE, see Rosenblatt 1956; Parzen 1962). The value $\lambda_{+}$ is reported as eMax0, $\sigma^{2}$ is stored as var0, and the number of factors is recovered as numberFactors0.
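A sketch of such a fit, assuming the mpPDF helper above and KernelDensity.jl for the KDE; the grid search over σ² is a simplification of the optimization, and the names fitMarcenkoPastur, eMax0, and var0 mirror those in the text:

    using KernelDensity

    # Fit the Marcenko–Pastur PDF to observed eigenvalues by scanning σ²
    # for the smallest sum of squared errors against a KDE of the spectrum.
    function fitMarcenkoPastur(eigenvalues, q; nPoints = 1_000)
        kdeEstimate = kde(eigenvalues)
        bestSSE, var0 = Inf, 1.0
        for var in 0.05:0.005:1.0
            eMin = var * (1 - sqrt(1 / q))^2
            eMax = var * (1 + sqrt(1 / q))^2
            grid = range(eMin + 1e-8, eMax - 1e-8; length = nPoints)
            theoretical = [mpPDF(x, var, q) for x in grid]
            empirical = [pdf(kdeEstimate, x) for x in grid] # KDE on the grid
            sse = sum((theoretical .- empirical) .^ 2)
            if sse < bestSSE
                bestSSE, var0 = sse, var
            end
        end
        eMax0 = var0 * (1 + sqrt(1 / q))^2 # adjusted cutoff λ₊
        return eMax0, var0
    end

    # number of factors: eigenvalues above the adjusted cutoff, e.g.
    # numberFactors0 = count(λ .> eMax0)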
SNIPPET 2: ADD SIGNAL TO A RANDOM COVARIANCE MATRIX

    using Distributions, LinearAlgebra

    function randomCov(
        numberColumns, # number of columns
        numberFactors  # number of factors
    )
        data = rand(Normal(), numberColumns, numberFactors) # random data
        covData = data * data' # covariance of data
        covData += Diagonal(rand(Uniform(), numberColumns)) # add noise to the matrix
        return covData
    end
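A hypothetical usage, following the blending described above; the weight alpha, the dimensions, and the use of Statistics.cov to form the pure-noise component are illustrative assumptions:

    using Distributions, Statistics

    # Blend a signal-bearing covariance with a pure-noise covariance,
    # diluting the signal with the weight alpha.
    alpha, numberColumns, numberFactors, q = 0.995, 1_000, 100, 10
    noiseCov = cov(rand(Normal(), numberColumns * q, numberColumns)) # pure noise
    signalCov = randomCov(numberColumns, numberFactors) # from Snippet 2
    covMatrix = alpha * noiseCov + (1 - alpha) * signalCov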
Figure 2: Fitting the Marcenko–Pastur PDF to a noisy covariance matrix.
Figure 2 shows the histogram of eigenvalues and the PDF of the fitted Marcenko–Pastur distribution. Eigenvalues to the right of the fitted Marcenko–Pastur distribution cannot be associated with noise, hence they must be associated with signal. The code recovers numberFactors0 = 100, the same number of factors we injected into the covariance matrix. Despite the weakness of the signal in the covariance matrix, the procedure was able to separate the eigenvalues associated with noise from the eigenvalues associated with signal. The fitted distribution implies $\sigma^{2} \approx 0.6768$, meaning that the signal accounts for only about 32.32% of the variance. This is one way of measuring the signal-to-noise ratio in financial data sets, which is notoriously low as a result of arbitrage forces.
Denoising
Shrinking a numerically ill-conditioned covariance matrix is popular in financial applications (Ledoit and Wolf 2004). Shrinkage reduces the condition number of the covariance matrix by pulling it closer to a diagonal matrix. However, shrinkage accomplishes this without discriminating between noise and signal. As a result, shrinkage can further dilute an already weak signal. In the previous section we learned how to separate the eigenvalues associated with noise components from the eigenvalues associated with signal components. In this section we examine how to use that information to denoise the correlation matrix.
Method of Constant Residual Eigenvalues
This approach consists of setting a constant eigenvalue for all random eigenvectors. Let $\{\lambda_{n}\}_{n=1,\dots,N}$ be the set of all eigenvalues, sorted in descending order, and let $i$ be the position of the eigenvalue such that $\lambda_{i} > \lambda_{+}$ and $\lambda_{i+1} \leq \lambda_{+}$. We then set $\lambda_{j} = \frac{1}{N-i}\sum_{k=i+1}^{N}\lambda_{k}$ for $j = i+1,\dots,N$, thereby preserving the trace of the correlation matrix. Given the eigenvector decomposition of the correlation matrix, $VW = W\Lambda$, we form the denoised correlation matrix $C_{1}$ as

$$\widetilde{C}_{1} = W\widetilde{\Lambda}W'$$

$$C_{1} = \widetilde{C}_{1}\left[\left(\operatorname{diag}\left[\widetilde{C}_{1}\right]\right)^{1/2}\left(\operatorname{diag}\left[\widetilde{C}_{1}\right]\right)^{1/2\,\prime}\right]^{-1}$$

where $\widetilde{\Lambda}$ is the diagonal matrix holding the adjusted eigenvalues.
Here, the apostrophe (') transposes a matrix, and $\operatorname{diag}[\cdot]$ zeroes all non-diagonal elements of a square matrix. The second transformation rescales the matrix so that the main diagonal of $C_{1}$ is an array of 1s. Code Snippet 4 implements this method. Figure 3 compares the logarithms of the eigenvalues before and after this denoising procedure.
SNIPPET 4: DENOISING BY THE CONSTANT RESIDUAL EIGENVALUE METHOD
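A minimal Julia sketch of this method, assuming the eigenvalues are sorted in descending order and that numberFactors0 of them carry signal (the function name denoisedCorr is an assumption):

    using LinearAlgebra

    function denoisedCorr(
        eigenvalues, # vector of eigenvalues, sorted descending
        eigenvectors, # matrix whose columns match the eigenvalues
        numberFactors0 # number of eigenvalues associated with signal
    )
        λ = copy(eigenvalues)
        # set the noise eigenvalues to their average, preserving the trace
        λ[numberFactors0+1:end] .= sum(λ[numberFactors0+1:end]) / (length(λ) - numberFactors0)
        corr = eigenvectors * Diagonal(λ) * eigenvectors' # C̃₁ = W Λ̃ W′
        d = sqrt.(diag(corr)) # rescale so the main diagonal is an array of 1s
        return corr ./ (d * d')
    end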
Figure 3: Eigenvalues before and after applying the constant residual eigenvalue method.
Targeted Shrinkage
The numerical method described above is preferable to shrinkage because it removes the noise while preserving the signal. Alternatively, we could target the application of shrinkage strictly at the random eigenvectors. Consider the correlation matrix $C_{1}$:
$$C_{1} = W_{L}\Lambda_{L}W_{L}' + \alpha\, W_{R}\Lambda_{R}W_{R}' + (1-\alpha)\operatorname{diag}\left[W_{R}\Lambda_{R}W_{R}'\right]$$
where $W_{R}$ and $\Lambda_{R}$ are the eigenvectors and eigenvalues associated with $\{n \mid \lambda_{n} \leq \lambda_{+}\}$, $W_{L}$ and $\Lambda_{L}$ are the eigenvectors and eigenvalues associated with $\{n \mid \lambda_{n} > \lambda_{+}\}$, and $\alpha$ regulates the amount of shrinkage among the eigenvectors and eigenvalues associated with noise ($\alpha \to 0$ for total shrinkage). Code Snippet 5 implements this method. Figure 4 compares the logarithms of the eigenvalues before and after this denoising procedure.
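A sketch of this targeted shrinkage under the same assumptions as before (descending eigenvalues, numberFactors0 signal eigenvalues; the function name denoisedCorrShrinkage is an assumption):

    using LinearAlgebra

    function denoisedCorrShrinkage(
        eigenvalues, # vector of eigenvalues, sorted descending
        eigenvectors, # matrix whose columns match the eigenvalues
        numberFactors0; # number of eigenvalues associated with signal
        alpha = 0.0 # shrinkage weight; alpha → 0 means total shrinkage
    )
        # signal component: W_L Λ_L W_L′
        corrL = eigenvectors[:, 1:numberFactors0] *
                Diagonal(eigenvalues[1:numberFactors0]) *
                eigenvectors[:, 1:numberFactors0]'
        # noise component: W_R Λ_R W_R′
        corrR = eigenvectors[:, numberFactors0+1:end] *
                Diagonal(eigenvalues[numberFactors0+1:end]) *
                eigenvectors[:, numberFactors0+1:end]'
        # shrink only the noise component toward its diagonal
        return corrL + alpha * corrR + (1 - alpha) * Diagonal(diag(corrR))
    end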
Figure 4: Eigenvalues before and after applying the targeted shrinkage method.