Updated README
thomaskeck committed May 17, 2017
1 parent 23762ed commit 90c95b2
Showing 1 changed file with 7 additions and 12 deletions.
19 changes: 7 additions & 12 deletions README.md
# FastBDT

Stochastic Gradient Boosted Decision Trees, usable standalone and via a Python interface.

# Paper on ArXiv: http://arxiv.org/abs/1609.06119

FastBDT: A speed-optimized and cache-friendly implementation of stochastic gradient-boosted decision trees for multivariate classification

Stochastic gradient-boosted decision trees are widely employed for multivariate classification and regression tasks. This paper presents a speed-optimized and cache-friendly implementation for multivariate classification called FastBDT. FastBDT is one order of magnitude faster during the fitting and application phases than popular implementations in software frameworks like TMVA, scikit-learn and XGBoost. The concepts used to optimize the execution time, and the accompanying performance studies, are discussed in detail in this paper. The key ideas include: an equal-frequency binning on the input data, which allows replacing expensive floating-point operations with integer operations while at the same time increasing the quality of the classification; and a cache-friendly linear access pattern to the input data, in contrast to usual implementations, which exhibit a random access pattern. FastBDT provides interfaces to C/C++ and Python. It is extensively used in the field of high energy physics by the Belle II experiment.
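The equal-frequency binning idea from the abstract can be sketched in a few lines: bin edges are placed at quantiles of the training data, and each floating-point feature value is replaced by a small integer bin index. This is an illustrative stand-in, not FastBDT's actual C++ implementation; the helper name and the exact edge-placement rule are assumptions.

```python
import bisect

def equal_frequency_bins(values, n_bins):
    """Map each float in `values` to an integer bin index in [0, n_bins).

    Illustrative sketch only: edges are taken at the 1/n_bins, 2/n_bins, ...
    quantiles of the data, so each bin receives roughly the same number
    of training values.
    """
    ordered = sorted(values)
    n = len(ordered)
    edges = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    # After this mapping, tree fitting only compares small integers.
    return [bisect.bisect_right(edges, v) for v in values]

values = [0.1, 2.5, 0.7, 3.9, 1.2, 0.3, 2.2, 1.8]
binned = equal_frequency_bins(values, 4)  # e.g. [0, 3, 1, 3, 1, 0, 2, 2]
```

With 8 values and 4 bins, each bin ends up with exactly two entries, which is the property the cache-friendly integer histograms rely on.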


# Installation
* make
* make install
* make package (optional: builds rpm and deb packages)
* python3 setup.py install (optional: installs the Python package)
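The steps above can be run as a typical CMake out-of-source build; the following recipe is a sketch (the configure step and directory layout are assumed from the repository's CMake setup, not spelled out above):

```shell
git clone https://github.com/thomaskeck/FastBDT
cd FastBDT
mkdir build && cd build
cmake ..            # configure the build
make                # build the library and executables
sudo make install   # optional system-wide install
make package        # optional: produce rpm/deb packages
```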


# Usage

Before anything else, run the unit tests:
* ./unittest


There are multiple ways to use FastBDT. Two standalone executables are provided:
* ./FastBDTMain
* python3 tools/FastBDTMain.py

For the Python 3 version you probably have to set LD_LIBRARY_PATH so that the FastBDT shared library can be found.


But usually it is more convenient to use FastBDT as a library and integrate it directly into your application using
* the C++ shared/static library (see example/CPPExample.cxx),
* the C shared library,
* the TMVA plugin if you're using ROOT (see examples/TMVAExample.C),
* or the Python 3 library python/FastBDT.py (see example/PythonExample.py).
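For the Python route, a minimal sketch of the call pattern is shown below. The `Classifier` name and its scikit-learn-style `fit`/`predict` methods are assumed from python/FastBDT.py and example/PythonExample.py (exact constructor parameters and argument types are not verified here); the snippet degrades gracefully when FastBDT is not installed.

```python
import random

try:
    from FastBDT import Classifier  # assumed to come from python/FastBDT.py
except ImportError:
    Classifier = None  # library not installed; see the Installation section

random.seed(0)
# Toy data: one feature whose mean differs between the two classes.
X = [[random.gauss(0.0, 1.0)] for _ in range(50)] + \
    [[random.gauss(2.0, 1.0)] for _ in range(50)]
y = [0] * 50 + [1] * 50

if Classifier is not None:
    clf = Classifier()
    clf.fit(X, y)           # train on features X and labels y
    probs = clf.predict(X)  # per-event signal probabilities
    status = "trained"
else:
    status = "fastbdt-missing"
print(status)
```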


This work is mostly based on the papers by Jerome H. Friedman:
* https://statweb.stanford.edu/~jhf/ftp/trebst.pdf
* https://statweb.stanford.edu/~jhf/ftp/stobst.pdf

FastBDT also implements the uGB techniques for boosting to flatness:
* https://arxiv.org/abs/1410.4140
