This tutorial is a deep look at the simplest of neural networks: the multilayer perceptron. The manuscript notebook uses no dependencies other than NumPy to build up a neural network, train it to approximate angle-dependent seismic reflectivity, and then predict values inside (and outside) the domain of interest.
There's a code block that uses tqdm to render a progress bar to the screen during training. To install tqdm:
pip install tqdm
And to plot the results you'll need matplotlib:
pip install matplotlib
To run the load_and_process_data.ipynb notebook, you will need welly and bruges...
pip install welly bruges
...and scikit-learn:
pip install scikit-learn
It's a good idea to make a virtual environment for your projects. You can easily do this with conda:
conda env create -f environment.yml
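For reference, an environment.yml along these lines would cover the dependencies listed above. This is a hypothetical sketch, not the file shipped with the repo, which may pin different versions:

```yaml
# Hypothetical environment spec covering the packages mentioned above;
# the repo's actual environment.yml may differ. The name is made up.
name: mlp-tutorial
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy
  - tqdm
  - matplotlib
  - scikit-learn
  - pip
  - pip:
      - welly
      - bruges
```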
In the published version of the article, the derivative of the activation function (i.e., the logistic function sigma(z) = 1 / (1 + exp(-z))) was expressed as sigma'(z) = z(1 - z). Evaluating at a particular value makes it clear that this expression of the derivative is wrong: at z = 0 it gives z(1 - z) = 0, but the tangent to the sigmoid at z = 0 is not horizontal (its slope there is 1/4).
One can show that sigma'(z) = exp(-z) / (1 + exp(-z))^2 = sigma(z)(1 - sigma(z)). A more detailed derivation can be found here: https://en.wikipedia.org/wiki/Logistic_function#Derivative. The expression of the derivative should therefore be corrected from z(1 - z) to sigma(z)(1 - sigma(z)) in the second equation of the paper.
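As a quick sanity check (a minimal sketch using a hypothetical sigmoid helper, not the paper's code), a finite-difference estimate of the slope at z = 0 matches the corrected expression and not the published one:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z, h = 0.0, 1e-6
# Finite-difference estimate of the true slope at z = 0.
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
print(numeric)                          # ~0.25
print(sigmoid(z) * (1 - sigmoid(z)))    # 0.25 -- corrected expression
print(z * (1 - z))                      # 0.0  -- published expression, wrong
```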
However, the Python expression of the backward sigmoid function (x * (1 - x)) makes it possible to compute both the forward and the backward values while evaluating the exponential only once. This relies on the following composition:
- a1 = sigma(z) (the exponential is evaluated here, once)
- derivative = sigma(a1, False) = a1 * (1 - a1) = sigma(z)(1 - sigma(z)) = sigma'(z)
The code provided in the paper is thus correct, as long as the backward call is always fed an activation rather than a pre-activation. It is probably faster than an implementation that computes the sigmoid function and its derivative independently.
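A minimal sketch of this pattern, assuming a signature like sigma(z, forward=True) (the paper's actual function may be named or structured differently):

```python
import numpy as np

def sigma(z, forward=True):
    # Forward pass: the exponential is evaluated here, once.
    if forward:
        return 1.0 / (1.0 + np.exp(-z))
    # Backward pass: `z` is assumed to already be an activation
    # a = sigma(pre), so a * (1 - a) equals sigma'(pre) without
    # recomputing the exponential.
    return z * (1.0 - z)

pre = np.array([-2.0, 0.0, 2.0])
a1 = sigma(pre)            # forward activations, exponential computed once
grad = sigma(a1, False)    # a1 * (1 - a1) == sigma'(pre)
```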