This tool was developed to support a research project investigating the application of machine learning algorithms to predicting credit spreads in the corporate bond market. It implements neural network models via TensorFlow and a gradient boosted regression trees model via XGBoost. It also supports model exploration and prediction interpretation through the ELI5 and LIME libraries. The tool parses the raw research data from unstructured Excel files into a SQLite database so that it can be transformed into a functional machine learning data set. It also provides some data analysis features for checking that data quality is at a sufficient level.
For more information on the research, refer to the research document file available here: Estimating Credit Risk Premiums via Gradient Boosted Regression Trees and Neural Networks
Run python.exe interface.py at the command line to launch the GUI, or run interface.py in your Python IDE.
IMPORTANT: When running the tool for the first time, you must build the valuation curve database and then calculate the z-spreads. Perform the following steps from the GUI menu, in the order listed:
- Build the valuation curve database: Menu -> Database -> Build Valuation Curve DB
- Calculate the z-spreads: Menu -> Database -> Calculate Z-Spread Analytics
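
For readers unfamiliar with z-spreads, the following is a minimal sketch of the general idea, not the code this tool runs. The z-spread is the constant spread that, when added to every zero rate on the valuation curve, makes the discounted cash flows of a bond equal its market price. The function names (present_value, z_spread), the continuous compounding convention, the bisection bounds, and the sample bond are all assumptions made for illustration.

```python
import math


def present_value(cash_flows, times, zero_rates, spread):
    """Discount each cash flow at its zero rate plus a constant spread.

    Assumes continuous compounding; the tool itself may use another convention.
    """
    return sum(
        cf * math.exp(-(rate + spread) * t)
        for cf, t, rate in zip(cash_flows, times, zero_rates)
    )


def z_spread(price, cash_flows, times, zero_rates,
             lo=-0.05, hi=1.0, tol=1e-10, max_iter=200):
    """Solve for the spread by bisection (hypothetical helper).

    With positive cash flows, present value falls as the spread rises,
    so a sign change between lo and hi brackets a single root.
    """
    if present_value(cash_flows, times, zero_rates, lo) < price:
        raise ValueError("lower bound is too high to bracket the price")
    if present_value(cash_flows, times, zero_rates, hi) > price:
        raise ValueError("upper bound is too low to bracket the price")

    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        diff = present_value(cash_flows, times, zero_rates, mid) - price
        if abs(diff) < tol or (hi - lo) / 2.0 < tol:
            return mid
        if diff > 0:
            lo = mid  # model price too high, so the spread must be larger
        else:
            hi = mid  # model price too low, so the spread must be smaller
    return (lo + hi) / 2.0


# Illustrative 3-year bond with a 5% annual coupon (made-up numbers).
cash_flows = [5.0, 5.0, 105.0]
times = [1.0, 2.0, 3.0]
zero_rates = [0.02, 0.025, 0.03]

# Price the bond at a known spread, then recover that spread from the price.
market_price = present_value(cash_flows, times, zero_rates, 0.015)
print(z_spread(market_price, cash_flows, times, zero_rates))
```

Bisection is used here only because it is simple and robust when present value is monotonic in the spread; a Newton or Brent solver would also work.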
Author: Rene Alby