This is a pedagogical library illustrating a minimal implementation of dynamic computation graphs with reverse-mode differentiation (backpropagation) for computing gradients. Three guidelines motivate the design choices made in the implementation:
- Mimicking PyTorch's API as closely as possible (see the short PyTorch reference sketch below).
- Simple `forward`/`backward` implementations for operations (operating directly on numpy arrays).
- Dynamic computation graphs, built as operations are run.
The library has been inspired by several other similar projects. Specific acknowledgments are in the source where appropriate.
- `micrograd` by Karpathy
- `autodidact`: a pedagogical implementation of `autograd`
- `joelnet`
In `examples/toy_half_sum`, you will find a basic use case: `main.py` defines a feed-forward neural network (a multi-layer perceptron) to learn a simple function (in this case, `y = sum(x) / 2`, where `x` is a binary vector). You can run it with `python main.py` from an environment with the packages from `requirements.txt` installed.
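For context, the learning target can be sketched directly in numpy. This is only an illustration of the task; the actual dataset construction in `examples/toy_half_sum/main.py` may differ, and the sizes below (64 samples of 8-dimensional binary vectors) are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(64, 8)).astype(float)   # binary input vectors
y = X.sum(axis=1, keepdims=True) / 2.0                # target: y = sum(x) / 2
print(X[0], y[0])
```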
There are a few important data structures (a small illustrative sketch follows this list):
- `Tensor`: a wrapper around a numpy array (stored in `.value`), which corresponds to a node in a computation graph, storing information like its parents (if any) and a backward method.
- `Operator`: an operator implements the `forward`/`backward` API and operates directly on numpy arrays. A decorator, `@tensor_op`, converts an `Operator` into a method that can be called directly on `Tensor` arguments, which builds the graph dynamically.
- `nn.Module`: as in PyTorch, these are wrappers for graphs that keep track of parameters, sub-modules, etc.
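To make the relationships concrete, below is a minimal, self-contained sketch of how a `Tensor`, an `Operator`, and a `tensor_op`-style decorator could fit together. This is not this library's actual code: the names, signatures, and details are assumptions based on the descriptions above, and `nn.Module` is omitted.

```python
import numpy as np

class Tensor:
    """Wraps a numpy array (.value) and records how it was produced."""
    def __init__(self, value, parents=(), backward_fn=None):
        self.value = np.asarray(value, dtype=float)
        self.grad = np.zeros_like(self.value)
        self.parents = parents            # upstream nodes in the graph
        self.backward_fn = backward_fn    # maps this node's grad -> parent grads

    def backward(self, grad=None):
        # Seed the output gradient, then walk the graph in reverse topological order.
        self.grad = np.ones_like(self.value) if grad is None else grad
        order, seen = [], set()
        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for p in node.parents:
                    visit(p)
                order.append(node)
        visit(self)
        for node in reversed(order):
            if node.backward_fn is not None:
                for parent, g in zip(node.parents, node.backward_fn(node.grad)):
                    parent.grad = parent.grad + g  # accumulate into parents

class Operator:
    """forward/backward operate directly on numpy arrays."""
    def forward(self, *inputs): raise NotImplementedError
    def backward(self, grad, *inputs): raise NotImplementedError  # one grad per input

class Mul(Operator):
    def forward(self, a, b): return a * b
    def backward(self, grad, a, b): return grad * b, grad * a

class Sum(Operator):
    def forward(self, a): return a.sum()
    def backward(self, grad, a): return (grad * np.ones_like(a),)

def tensor_op(op_cls):
    """Turn an Operator into a function on Tensors that builds the graph as it runs."""
    op = op_cls()
    def apply(*tensors):
        values = [t.value for t in tensors]
        out = op.forward(*values)
        return Tensor(out, parents=tensors,
                      backward_fn=lambda grad: op.backward(grad, *values))
    return apply

mul, total = tensor_op(Mul), tensor_op(Sum)

x = Tensor([1.0, 2.0, 3.0])
y = total(mul(x, x))   # graph is built dynamically: y = sum(x * x)
y.backward()
print(x.grad)          # [2. 4. 6.]
```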