Wednesday, 31 July 2019

TERM STRUCTURE OF INTEREST RATES: THE 2 REGRESSION MODELS

Eviction Notice: This is going to be the last article I post for a long time. I am devoting myself to earning a DATA SCIENTIST badge and an unrelated diplomatic distinction....


There are several algorithms to compute the term structure of interest rates.  Nowadays, computational models are closer to reality than "mathemagic" models. Usual "mathemagic" devices include: bootstrapping, splines, the Nelson-Siegel function and polynomial interpolation.

For this article, polynomial regression and cubic regression will be used as an alternative to the minimalist bootstrapping method, as suggested by the International Monetary Fund.

Any given market will exhibit more than one treasury or sovereign bond per maturity term, e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20 and 30 year bonds.

Thus, the bootstrapping method will leave out vital information, even if you consolidate similar maturities into a single hypothetical bond per maturity.

Polynomial interpolation works as a holistic model, integrating the market as an interdependent unit rather than treating a "given term" as dependent on the shorter maturities... For instance:

https://docs.google.com/spreadsheets/d/11RtCOu7TrJIgXCO7AtEEwFui8x6tNiOd/edit#gid=438735182

As you can read there, the upper part shows the market price of the bonds for 13 years of maturities.  Beside it, you get the face value and outstanding coupons.  You must build a term structure of interest rates for 13 years, bearing in mind that there are several bonds for a particular year.

Yup!  First you must run a polynomial regression where y = the current market price and the values for x = the coupons and face value paid in each year, adding zeros for the years in which a bond pays nothing.

If you are too lazy to write code in a programming language, or your spreadsheet does not support polynomial regressions, you can resort to this:

https://www.wessa.net/rwasp_multipleregression.wasp

The first column of data is going to be your Y (the spot market price) and the following columns will be reserved for the values of X.

If everything runs fine, you will get the raw gross rates per term.  You will need to rewrite them in effective annual and percentage notation.
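For those not so lazy, here is a minimal Python sketch of this first regression. It is not the spreadsheet's data: the five bonds, the three terms and the `true_rates` used to make up consistent prices are all hypothetical. It also assumes that the regression is run without an intercept and that its coefficients are read as one discount factor per term, which is then turned into an effective annual rate. The function names `discount_factors` and `effective_annual_rates` are mine.

```python
import numpy as np

# Hypothetical toy market (NOT the spreadsheet data): 5 annual-coupon bonds,
# with more than one bond for some maturities.
# Each row is one bond; each column is the cash flow paid in year 1, 2, 3.
# Zeros mark the years in which a bond pays nothing.
cash_flows = np.array([
    [104.0,   0.0,   0.0],   # 1-year bond, 4% coupon
    [106.0,   0.0,   0.0],   # 1-year bond, 6% coupon
    [  5.0, 105.0,   0.0],   # 2-year bond, 5% coupon
    [  3.0,   3.0, 103.0],   # 3-year bond, 3% coupon
    [  6.0,   6.0, 106.0],   # 3-year bond, 6% coupon
])
terms = np.arange(1, 4)

# Assumed rates, used only to fabricate toy prices that are mutually consistent.
true_rates = np.array([0.03, 0.035, 0.04])
prices = cash_flows @ ((1 + true_rates) ** -terms)


def discount_factors(cash_flows, prices):
    """Least-squares fit with no intercept: price = sum over t of d_t * cash_flow_t."""
    coeffs, *_ = np.linalg.lstsq(cash_flows, prices, rcond=None)
    return coeffs


def effective_annual_rates(discounts, terms):
    """Convert a discount factor d_t for term t into an effective annual rate."""
    return discounts ** (-1.0 / terms) - 1.0


d = discount_factors(cash_flows, prices)
rates = effective_annual_rates(d, terms)

for t, r in zip(terms, rates):
    print(f"{t}-year rate: {100 * r:.2f}%")
```

With real market prices the fit will not be exact, which is precisely why all the bonds of a given maturity can stay in the data set instead of being consolidated.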

This first proxy is not the end of it all.  A second regression is still needed to smooth the curve:

https://docs.google.com/spreadsheets/d/11RtCOu7TrJIgXCO7AtEEwFui8x6tNiOd/edit#gid=1014033820

But this second regression is a cubic regression.  If your spreadsheet does not run a cubic regression, or you are still too lazy to write code in a programming language, you can still resort to this:

https://www.wessa.net/rwasp_multipleregression.wasp

Here, your Y = the rates you got from the previous regression and your X = time to maturity to the power of 1, 2 and 3.

What is the idea?

To get the values of a cubic, four-parameter equation: the constant alpha (the intercept) and 3 slope coefficients....

Once you have the 4 constant values of this cubic equation, you can proceed to smooth the curve by plugging in the variable, time to maturity, which is known beforehand.

The new values for Y are going to be the final values for your term structure of interest rates....
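The smoothing step can be sketched in Python as well. The `raw_rates` below are hypothetical numbers standing in for the output of the first regression, and the 13 yearly `maturities` are an assumption too. The regression has a constant plus time to maturity to the power of 1, 2 and 3, and the fitted values are the smoothed curve.

```python
import numpy as np

# Hypothetical inputs (NOT the spreadsheet data): raw rates in % per yearly term.
maturities = np.arange(1, 14, dtype=float)
raw_rates = np.array([3.10, 3.45, 3.60, 4.05, 4.10, 4.45, 4.50,
                      4.85, 4.80, 5.05, 5.00, 5.20, 5.15])

# Design matrix: constant, t, t^2, t^3.
X = np.column_stack([
    np.ones_like(maturities),
    maturities,
    maturities ** 2,
    maturities ** 3,
])

# Ordinary least squares: the constant alpha and the 3 slope coefficients.
coeffs, *_ = np.linalg.lstsq(X, raw_rates, rcond=None)
alpha, b1, b2, b3 = coeffs

# Plug the known maturities back into the cubic to get the smoothed curve.
smoothed = alpha + b1 * maturities + b2 * maturities ** 2 + b3 * maturities ** 3

for t, raw, fit in zip(maturities, raw_rates, smoothed):
    print(f"{int(t):>2} years: raw {raw:.2f}%  smoothed {fit:.2f}%")
```

`np.polyfit(maturities, raw_rates, 3)` should give the same four values in reverse order; the explicit design matrix is used here only because it mirrors the columns you would feed to a multiple regression tool.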


WARNING:  in the same vein as the bootstrapping method, polynomial interpolation fails to account for the traded volumes of each "treasury" .... In the case studied, the market transactions of each "sovereign bond" or "treasury" are assumed to be equal.


See You Soon!

Sources & Acknowledgements:  
International Monetary Fund & Alex Ho, PhD      https://www.edx.org/bio/alex-ho
Patrick Wessa, PhD     




Friday, 11 January 2019

PCA CORRELATION: A BRIEF INTRODUCTION TO DIMENSION REDUCTION (A Brute Force Mean Variance Standardization)

DISCLAIMER:  This blog's author is not a stakeholder of Chevron Texaco, nor is this article formal and staid financial advice.

Have you ever wondered why covariance & correlation are so numerically different?   Can they converge?  In the following lines I will explain a crucial part of PRINCIPAL COMPONENT ANALYSIS: an early tool for dimension reduction and data noise removal.  The tool is more than 100 years old, so by today's standards it is obsolete and full of caveats.

The URL below displays prices for the CHEVRON-TEXACO share. High and low prices were taken into account against the TEXAS WTI reference price, all of them in a spreadsheet:

https://1drv.ms/x/s!ApxRazJ7xJyUf9ZPugnE4Jenbuc


Click on the CVX tab; there you will get the raw data... What is the trick?

1. I got continuously compounded returns, their average and their standard deviation

2. MEAN VARIANCE STANDARDIZATION of those values, by resorting to this formula: Z = (x - x̄) / s

3. Once you get your Z statistics out of the returns, bear in mind that the mean for a standard normal distribution = 0 and its standard deviation = 1.  Thus, COVARIANCE & CORRELATION values are the same for 2 standardized data sets.
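The three steps above can be sketched in Python. The two price series are hypothetical and are not the CVX or WTI figures from the spreadsheet; the helper names `log_returns` and `standardize` are mine, and the sample standard deviation (n - 1 in the denominator) is an assumption.

```python
import numpy as np

# Hypothetical price series (NOT the spreadsheet data).
high = np.array([120.0, 121.5, 119.8, 122.3, 123.1, 121.9, 124.0])
wti = np.array([52.0, 53.1, 51.7, 54.0, 53.6, 52.9, 55.2])


def log_returns(prices):
    """Step 1: continuously compounded returns, ln(P_t / P_{t-1})."""
    return np.diff(np.log(prices))


def standardize(x):
    """Step 2: Z = (x - mean) / s, using the sample standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)


r_high, r_wti = log_returns(high), log_returns(wti)
z_high, z_wti = standardize(r_high), standardize(r_wti)

# Before standardization the two measures live on different scales.
cov_raw = np.cov(r_high, r_wti)[0, 1]
corr_raw = np.corrcoef(r_high, r_wti)[0, 1]

# Step 3: after standardization the covariance is the sum of the products
# of the Z values divided by n - 1, and it equals the correlation.
cov_z = (z_high * z_wti).sum() / (len(z_high) - 1)

print(f"covariance of raw returns: {cov_raw:.6f}")
print(f"correlation of raw returns: {corr_raw:.6f}")
print(f"covariance of Z values:     {cov_z:.6f}")
```

The last two printed numbers should match, while the first one is much smaller because raw returns are tiny numbers.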

Now, click on the COVARIANCE-CORRELATION tab; you will get the products of 2 data sets: high & low returns, high returns & Texas WTI price change, low returns & Texas WTI price change.


2 dimensions are now 1 for numerical purposes.  For the particular case of CHEVRON TEXACO, there is a strong positive correlation between the highest and lowest intraday price returns, while the relationship of such returns with the changes in the Texas WTI reference price is moderately positive...

The company is, perhaps, not hugely affected by changes in the price of CRUDE OIL due to investment diversification...

CAVEAT:   By turning all of the values from your data sets into a standard normal distribution, we are assuming PERFECT NORMALITY.   Values falling in the tails, or black swans, are either left out of the ensemble or assumed to be PERFECTLY NORMAL.

To solve the above problem, Variational Autoencoders from the field of Machine/Deep Learning are a more accurate way out.  Mathematical Models are quite inferior to Data Models by the standards of today's technology.

Sources & Acknowledgements:  
Shingai Manjengwa from Fireside Analytics Inc.
Mikhail Lakirovich, Greg Filla, Armand Ruiz &  Saeed Aghabozorgi from IBM


https://www.tastytrade.com/tt/learn/correlation