v0.10.0
Release v0.10.0
Please pull the official images from Docker Hub.
We are glad to release version 0.10.0, which introduces the new
Python API.
- Our old Python API is out of date: it is hard to learn and hard to
  use. To write a PaddlePaddle program using the old API, we had to write
  at least two Python files: one data provider and another one that defines
  the network topology. Users start a PaddlePaddle job by running the
  `paddle_trainer` C++ program, which calls the Python interpreter to run the
  network topology configuration script and then starts the training loop,
  which iteratively calls the data provider function to load minibatches.
  This prevents us from writing a Python program in a modern way, e.g., in the
  Jupyter Notebook.
- The new API, which we often refer to as the v2 API, allows us to write
  much shorter Python programs that define the network and the data in a single
  .py file. This program can also run in Jupyter Notebook, since the entry
  point is a Python program and PaddlePaddle runs as a shared library loaded
  and invoked by this Python program.
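To make the single-file idea more concrete, here is a minimal plain-Python sketch of how data loading and the loop that consumes it can live in one script. It loosely mirrors the reader creator and reader decorator concepts listed under New Features, but every name in it (`toy_reader_creator`, `shuffle`, `batch`) is hypothetical and written for illustration; it is not PaddlePaddle's implementation and does not call the v2 API.

```python
import random


def toy_reader_creator(n):
    """Reader creator (hypothetical): returns a reader, i.e. a no-argument
    callable that yields one (feature, label) sample at a time."""
    def reader():
        for i in range(n):
            yield (float(i), i % 2)
    return reader


def shuffle(reader, buf_size, seed=None):
    """Reader decorator (hypothetical): shuffles samples within a buffer
    of at most buf_size items before yielding them."""
    def shuffled_reader():
        rng = random.Random(seed)
        buf = []
        for sample in reader():
            buf.append(sample)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                for item in buf:
                    yield item
                buf = []
        # Flush whatever is left in the buffer.
        rng.shuffle(buf)
        for item in buf:
            yield item
    return shuffled_reader


def batch(reader, batch_size):
    """Reader decorator (hypothetical): groups samples into minibatches."""
    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        if current:
            yield current
    return batch_reader


# The Python program owns the loop: it composes the readers and then
# iterates over minibatches itself.
train_reader = batch(shuffle(toy_reader_creator(10), buf_size=4, seed=0),
                     batch_size=3)
batches = list(train_reader())
```

Because each decorator takes a reader and returns a new reader, the pieces compose freely, and calling `train_reader()` again starts a fresh pass over the data.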
Based on the new API, we delivered an online interactive book, Deep Learning 101,
and its Chinese version.
We also worked on updating our online documentation to describe the new API.
This work is ongoing, and we will release more documentation improvements
in the next version.
We also worked on bringing the new API to distributed model training (via MPI and
Kubernetes). This work is ongoing, and we will release more about it in the next
version.
New Features
- Release the new Python API.
- Deep Learning 101 book in English and Chinese.
- Support rectangle input for CNN.
- Support stride pooling for seqlastin and seqfirstin.
- Expose `seq_concat_layer`/`seq_reshape_layer` in `trainer_config_helpers`.
- Add dataset package: CIFAR, MNIST, IMDB, WMT14, CONLL05, movielens, imikolov.
- Add Priorbox layer for Single Shot Multibox Detection.
- Add smooth L1 cost.
- Add data reader creator and data reader decorator for v2 API.
- Add the CPU implementation of cmrnorm projection.
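For readers unfamiliar with the smooth L1 cost mentioned above, the following is a minimal sketch of the commonly used definition (quadratic for small residuals, linear for large ones, with the switch at 1). It assumes the new cost follows this standard form; the function names are hypothetical and this is not PaddlePaddle's code.

```python
def smooth_l1(x):
    """Smooth L1 of a single residual x, under the standard definition:
    0.5 * x^2 if |x| < 1, otherwise |x| - 0.5."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def smooth_l1_cost(predictions, targets):
    """Sum of smooth L1 over element-wise residuals (hypothetical helper)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum(smooth_l1(p - t) for p, t in zip(predictions, targets))


# One small residual (0.5) and one large residual (2.0).
example_cost = smooth_l1_cost([0.5, 3.0], [0.0, 1.0])
```

The linear tail makes the cost less sensitive to outliers than a squared error, which is why it is popular for bounding-box regression in detection models.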
Improvements
- Support Python virtualenv for `paddle_trainer`.
- Add pre-commit hooks, used to automatically format our code.
- Upgrade protobuf to version 3.x.
- Add an option to check data type in Python data provider.
- Speed up the backward pass of the average layer on GPU.
- Documentation refinement.
- Check dead links in documents using Travis-CI.
- Add an example explaining `sparse_vector`.
- Add ReLU in `layer_math.py`.
- Simplify data processing flow for Quick Start.
- Support CUDNN Deconv.
- Add data feeder in v2 API.
- Support predicting the samples from sys.stdin for sentiment demo.
- Provide multi-process interface for image preprocessing.
- Add benchmark document for v1 API.
- Add packages for automatically downloading public datasets.
- Rename `Argument::sumCost` to `Argument::sum`, since class `Argument` has nothing to do with cost.
- Expose `Argument::sum` to Python.
- Add a new `TensorExpression` implementation for matrix-related expression evaluations.
- Add lazy assignment for optimizing the calculation of a batch of multiple expressions.
- Add abstract class `Function` and its implementations:
  - `PadFunc` and `PadGradFunc`.
  - `ContextProjectionForwardFunc` and `ContextProjectionBackwardFunc`.
  - `CosSimBackward` and `CosSimBackwardFunc`.
  - `CrossMapNormalFunc` and `CrossMapNormalGradFunc`.
  - `MulFunc`.
- Add classes `AutoCompare` and `FunctionCompare`, which make it easier to write unit tests comparing the GPU and CPU versions of a function.
- Generate `libpaddle_test_main.a` and remove the main function inside the test file.
- Support dense numpy vector in PyDataProvider2.
- Clean code base, remove some copy-n-pasted code snippets:
  - Extract `RowBuffer` class for `SparseRowMatrix`.
  - Clean the interface of `GradientMachine`.
  - Use `override` keyword in layer.
  - Simplify `Evaluator::create`, use `ClassRegister` to create `Evaluator`s.
- Check MD5 checksum when downloading demo's dataset.
- Add `paddle::Error`, which is intended to replace `LOG(FATAL)` in Paddle.
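As an illustration of the MD5 check on downloaded datasets mentioned in the list above, here is a minimal sketch using only the Python standard library. The helper names (`md5_of_file`, `verify_download`) are hypothetical and are not taken from PaddlePaddle's download code; the sketch only shows the general technique of hashing a file in chunks and comparing against a known digest.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks so that
    large dataset archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file at path matches the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()
```

A caller would typically re-download the file, or raise an error, when `verify_download` returns `False`.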
Bug Fixes
- Check layer input types for `recurrent_group`.
- Don't run `clang-format` on .cu source files.
- Fix bugs with `LogActivation`.
- Fix the bug that runs `test_layerHelpers` multiple times.
- Fix the bug that the seq2seq demo exceeds protobuf message size limit.
- Fix the bug in dataprovider converter in GPU mode.
- Fix a bug in `GatedRecurrentLayer`.
- Fix bug for `BatchNorm` when testing more than one model.
- Fix broken unit test of paramRelu.
- Fix some compile-time warnings about `CpuSparseMatrix`.
- Fix `MultiGradientMachine` error when `trainer_count > batch_size`.
- Fix bugs that prevent asynchronous data loading in `PyDataProvider2`.