Saturday, 4 May 2013

A MULTIPLE CORRELATION INDEX USED AS AN EMPIRICAL COPULA: PEARSON PRODUCT REVISITED (PART 1)

The idea for the following essay was born out of the same question: "heh, heh, heh, what is the joint correlation index when one has 3 or more time series to compare?"
 
The answer given by mathematicians and other "autistic quants" leads to a "correlation matrix" and thus to a formula used by econometricians:
 
R^2 = c' * Rxx^(-1) * c

where c is the vector of correlations between the dependent variable and the independent variables, c' is the transpose of that vector, and Rxx^(-1) is the inverse of the matrix of correlations among the independent variables only.
So far, so good. The main pitfall of this formula lies in the fact that at least one of the "series" or "variables" to be analyzed has to be dependent. Furthermore, the formula is not commutative: swapping the dependent variable with one of the independent ones generally changes the result.
But what if we want a commutative formula that treats all n series under study on an equal footing, with none of them designated as dependent? The solution is not far away: the Pearson product has the answer.
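As a rough illustration of the formula above and of its lack of commutativity, here is a minimal Python sketch. It assumes the special case of one dependent and two independent series, so that the inverse of the 2x2 correlation matrix can be written out by hand. The data and the helper names (pearson, r_squared) are made up for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def r_squared(dep, ind1, ind2):
    """R^2 = c' * Rxx^(-1) * c for one dependent and two independent series."""
    c1, c2 = pearson(dep, ind1), pearson(dep, ind2)
    r12 = pearson(ind1, ind2)
    # Rxx = [[1, r12], [r12, 1]], so Rxx^(-1) = [[1, -r12], [-r12, 1]] / det
    det = 1.0 - r12 ** 2
    return (c1 * c1 - 2.0 * r12 * c1 * c2 + c2 * c2) / det


# Hypothetical series, chosen only to have something to compute with.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0]
c = [5.0, 3.0, 4.0, 1.0, 2.0]

r2_a = r_squared(a, b, c)  # a treated as the dependent series
r2_b = r_squared(b, a, c)  # b treated as the dependent series
print(r2_a, r2_b)          # the two values differ
```

Swapping the two independent series leaves the result unchanged, but changing which series plays the dependent role does not, which is the asymmetry the essay objects to.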
The original formula was devised for 2 series. It reads:

Covariance (ab)/(σa * σb) = ρ

Even if we want a Pearson product for a lonesome series, the formula holds on tight by respecting its attribute:

Covariance (aa)/(σa * σa) = ρ

(here ρ = 1, since Covariance (aa) is simply the variance σa^2). Both the upper factor and the lower factor are always SECOND DEGREE.
 
If we have 3 series:

Covariance (abc) = Σ [ {(ai – A)*(bi – B)*(ci – C)}^(2/3) ] / (n – 1)

If we have k series:

Covariance (abc...k) = Σ [ {(ai – A)*(bi – B)*(ci – C)*...*(ki – K)}^(2/k) ] / (n – 1)
 
where ai, bi, ci and ki are the observations of each series at the same time i; A, B, C and K are the averages of each series; k is the number of series; and n is the number of summands.
 
The trick behind this algorithm is to keep the correct “+” or “-” sign after the conversion of every subproduct. Notably, the conversion (raising to the power 2/k) has to be applied to the absolute value of each subproduct, with the sign of the subproduct restored afterwards. In addition, the pairwise correlations have to fall on the same grounds (i.e. strong direct relation, strong inverse relation or strong independence).
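The sign-keeping conversion described above can be sketched in a few lines of Python. This is only one possible reading of the rule: the function name signed_power and the sample deviations are assumptions introduced for illustration.

```python
import math


def signed_power(product, k):
    """Raise |product| to the power 2/k, then restore the sign of product."""
    return math.copysign(abs(product) ** (2.0 / k), product)


# One summand for k = 3 series: hypothetical deviations
# (ai - A), (bi - B), (ci - C) observed at the same time i.
deviations = [-2.0, 1.0, 4.0]
product = math.prod(deviations)   # -8.0, a negative subproduct
term = signed_power(product, 3)   # |-8|^(2/3) is about 4, sign kept: about -4

# Without the absolute value, Python returns a complex number for a
# negative base raised to a fractional power, hence the need for the trick.
naive = (-8.0) ** (2.0 / 3.0)
print(term, naive)
```

For k = 2 the exponent is 1, so the conversion leaves every subproduct untouched and the ordinary covariance summand is recovered.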
 
Therefore, the correlation for both examples would be:

ρ(a,b,c) = Covariance (abc) / (σa*σb*σc)^(2/3)

ρ(a,b,c...k) = Covariance (abc...k) / (σa*σb*σc*...*σk)^(2/k)
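Putting the pieces together, the following Python sketch is one way to compute the proposed index for k series. It assumes that σ is the sample standard deviation (divisor n – 1, matching the covariance) and that the power 2/k is applied to the absolute value of each subproduct with the sign restored afterwards. The function names and the data are hypothetical.

```python
import math


def sample_std(xs):
    """Sample standard deviation (divisor n - 1)."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def multi_covariance(series):
    """Sum of sign-preserving |subproduct|^(2/k) terms, divided by n - 1."""
    k, n = len(series), len(series[0])
    means = [sum(s) / n for s in series]
    total = 0.0
    for i in range(n):
        product = math.prod(s[i] - m for s, m in zip(series, means))
        total += math.copysign(abs(product) ** (2.0 / k), product)
    return total / (n - 1)


def multi_correlation(series):
    """Covariance (abc...k) / (sigma_a * ... * sigma_k)^(2/k)."""
    k = len(series)
    sigma_product = math.prod(sample_std(s) for s in series)
    return multi_covariance(series) / sigma_product ** (2.0 / k)


# Hypothetical series, for illustration only.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0]
c = [5.0, 3.0, 4.0, 1.0, 2.0]

rho_ab = multi_correlation([a, b])      # k = 2: ordinary Pearson correlation
rho_abc = multi_correlation([a, b, c])  # k = 3: the proposed joint index
rho_cab = multi_correlation([c, a, b])  # same series, different order
print(rho_ab, rho_abc, rho_cab)
```

With k = 2 the function reduces to the usual Pearson product, and reordering the series does not change the result beyond floating-point rounding, which is the commutativity the essay is after.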
 
Part 2 of this essay will present numerical examples based upon all of the above. An alternative approach would be a fictional sample or time series built by 1 unit out of the independent variables.
 
Sources: RiskCenter, Investopedia, Wikipedia, EdX.org, Coursera.org
