BrainScript Network Builder
Custom networks are described in BrainScript, CNTK's network description language. To use it, specify the BrainScript network builder in your configuration. A detailed description of the language can be found on the Basic Concepts page and its subpages. There are two forms of using the BrainScript network builder. To describe your network in an external file, specify a block similar to this:
BrainScriptNetworkBuilder = (new ComputationNetwork [
include "yourNetwork.bs"
])
where yourNetwork.bs contains the network described using BrainScript. In the above form, yourNetwork.bs is searched for first in the same directory as the config file, and if not found, in the directory of the CNTK executable. Both absolute and relative pathnames are accepted here. E.g., bs/yourNetwork.bs means a file located in a directory bs next to your config file (or alternatively in the CNTK executable directory).
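As a small illustration of the path rules just described, the block below includes one file by relative path and one by absolute path. The directory and file names are hypothetical and only serve to show the two forms:

```
BrainScriptNetworkBuilder = (new ComputationNetwork [
    # relative path: searched for next to the config file first,
    # then in the directory of the CNTK executable
    include "bs/yourNetwork.bs"
    # absolute path (hypothetical location)
    include "/home/user/models/shared.bs"
])
```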
Alternatively, you can define your network inline, inside the config file. This can simplify configuration if you don't plan to share the same BrainScript across multiple configurations. Use this form:
BrainScriptNetworkBuilder = [
# insert network description here
]
Note that this is merely a short-hand for:
BrainScriptNetworkBuilder = (new ComputationNetwork [
# insert network description here
])
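To make the inline form concrete, here is a minimal sketch of a one-hidden-layer network written directly inside the config file. It only uses functions that appear elsewhere on this page (Input, Parameter, Sigmoid, CrossEntropyWithSoftmax). The dimensions and variable names are assumptions chosen for illustration, and a real configuration would typically declare further special nodes as well:

```
BrainScriptNetworkBuilder = [
    # dimensions (hypothetical values)
    featDim = 13
    hiddenDim = 50
    labelDim = 3

    # inputs
    feat = Input (featDim)
    labels = Input (labelDim)

    # one sigmoid hidden layer followed by a linear output layer
    W1 = Parameter (hiddenDim, featDim)
    b1 = Parameter (hiddenDim, 1)
    h = Sigmoid (W1 * feat + b1)
    W2 = Parameter (labelDim, hiddenDim)
    b2 = Parameter (labelDim, 1)
    z = W2 * h + b2

    # training criterion, tagged as in the other examples on this page
    ce = CrossEntropyWithSoftmax (labels, z, tag="criterion")
]
```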
In older versions of CNTK, the network builder was called NDLNetworkBuilder. NDLNetworkBuilder is now deprecated.
Converting an existing network definition from NDLNetworkBuilder to BrainScriptNetworkBuilder is simple in most cases. The main changes are to the surrounding syntax. The core network description itself is largely upwards compatible, and is likely identical or near-identical if you don't take advantage of the new language features.
To convert your descriptions, you must switch the network builder, adapt the outer syntax, and possibly make minor adaptations to your network code itself.
Step 1. Switching the network builder. Replace the NDLNetworkBuilder with the corresponding BrainScriptNetworkBuilder block in the CNTK config file. If your network description is in a separate file:
# change from:
NDLNetworkBuilder = [
ndlMacros = "shared.ndl" # (if any)
networkDescription = "yourNetwork.ndl"
]
# ...to:
BrainScriptNetworkBuilder = (new ComputationNetwork [
include "shared.bs" # (if any)
include "yourNetwork.bs"
])
(The change of filename extension is not strictly necessary but recommended.)
If your network description is in the .cntk config file itself:
# change from:
NDLNetworkBuilder = [
    # macros
    load = [
        SigmoidNetwork (x, W, b) = Sigmoid (Plus (Times (W, x), b))
    ]
    # network description
    run = [
        feat = Input (13)
        ...
        ce = CrossEntropyWithSoftmax (labels, z, tag="criterion")
    ]
]
# ...to:
BrainScriptNetworkBuilder = [
    # macros are just defined inline
    SigmoidNetwork (x, W, b) = Sigmoid (Plus (Times (W, x), b)) # or: Sigmoid (W * x + b)
    # network description
    feat = Input (13)
    ...
    ce = CrossEntropyWithSoftmax (labels, z, tag="criterion")
]
Step 2. Remove load and run blocks. With BrainScriptNetworkBuilder, macro/function definitions and main code are combined. The load and run blocks must simply be removed. For example, this:
load = ndlMnistMacros
run = DNN
ndlMnistMacros = [
featDim = 784
...
labels = InputValue(labelDim)
]
DNN = [
hiddenDim = 200
...
outputNodes = (ol)
]
simply becomes:
featDim = 784
...
labels = InputValue(labelDim)
hiddenDim = 200
...
outputNodes = (ol)
You may have used the run variable to select one of multiple configurations with an external variable, e.g.:
NDLNetworkBuilder = [
run = $whichModel$ # outside parameter selects model, must be either "model1" or "model2"
model1 = [ ... (MODEL 1 DEFINITION) ]
model2 = [ ... (MODEL 2 DEFINITION) ]
]
This pattern was mostly necessary because NDL did not have conditional expressions. In BrainScript, this would now be written with an if expression:
BrainScriptNetworkBuilder = (new ComputationNetwork
if $whichModel$ == "model1" then [ ... (MODEL 1 DEFINITION) ]
else if $whichModel$ == "model2" then [ ... (MODEL 2 DEFINITION) ]
else Fail("Invalid model selector value '$whichModel$'")
)
However, the selected models are often very similar, so a better approach is to merge their descriptions and use conditionals only where they differ. Here is an example where a parameter is used to choose between a unidirectional and a bidirectional LSTM:
encoderFunction =
if useBidirectionalEncoder
then BS.RNNs.RecurrentBirectionalLSTMPStack
else BS.RNNs.RecurrentLSTMPStack
encoder = encoderFunction (encoderDims, inputEmbedded, inputDim=inputEmbeddingDim)
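The same pattern works for plain values, since if is an expression. The sketch below selects a layer size this way. The names useLargeModel and hiddenDim and both values are hypothetical:

```
useLargeModel = true    # hypothetical switch; could also be supplied by an outside parameter
hiddenDim = if useLargeModel then 512 else 200
```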
Step 3. Adjust your network description. Regarding the network description (formulas) itself, BrainScript is largely upwards compatible with NDL. These are the main differences:
- The return value of macros (functions) is no longer the last variable defined in them, but the entire set of variables. You must explicitly select the output value at the end. For example:
# NDL:
f(x) = [
x2 = x*x
y = x2 + 1
] # <-- return value defaults to last entry, i.e. y
# BrainScript:
f(x) = [
x2 = x*x
y = x2 + 1
].y # <-- return value y must be explicitly dereferenced
Without this change, the function return value would be the entire record, and the typical error you will get is that a ComputationNode was expected where a ComputationNetwork was found.
- BrainScript does not allow functions with variable numbers of parameters. This matters primarily for the Parameter() function: A vector parameter can no longer be written as Parameter(N), it now has to be explicitly written as a 1-column matrix Parameter(N, 1). Without this change, you will get an error about mismatching number of positional parameters. Alternatively, one can use the tensor form, ParameterTensor(N).
It also matters for the RowStack() function, which in BrainScript takes a single parameter that is an array of inputs. The inputs must be separated by a colon (:) instead of a comma, e.g. RowStack (a:b:c) instead of RowStack (a, b, c).
- Some defaults have been updated, primarily the optional imageLayout parameter of Convolution(), the pooling operations, and ImageInput(). For NDL, these defaulted to legacy, whereas now the default is cudnn which is required to be compatible with the cuDNN convolution primitives. (All code samples explicitly specify this parameter as cudnn already.)
- BrainScript's parser is more restrictive:
  - Identifiers are now case-sensitive. Built-in functions use PascalCase (e.g. RectifiedLinear), and built-in variables and parameter names use camelCase (e.g. modelPath, criterionNodes), as do option strings (init="fixedValue", tag="criterion").
  - Abbreviated alternative names are no longer allowed, such as Const() should be Constant(), tag="eval" should be tag="evaluation".
  - Some mis-spelled names were corrected: criteria is now criterion (likewise criterionNodes), defaultHiddenActivity is now defaultHiddenActivation.
  - The = sign is no longer optional for function definitions.
  - It is no longer allowed to use curly braces for blocks ({ ... }), only brackets ([ ... ]) are allowed.
  - Option labels must be quoted as strings, e.g. init="uniform" rather than init=uniform (without the quotes, BrainScript would fail with an error message saying that the symbol uniform is unknown).
  This more restricted syntax is still accepted by NDLNetworkBuilder, so we recommend to first make these syntactical changes and test them with NDL, before actually changing to BrainScript.
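The differences listed above can be seen side by side in the following sketch. The NDL fragment in the comments is a hypothetical example written for this illustration, not taken from a stock configuration. The inputs u, v and w of RowStack are assumed to be defined elsewhere:

```
# NDL original (hypothetical), shown as comments:
#   Affine(x, W, b) { t = Times(W, x)  z = Plus(t, b) }
#   W = Parameter(200, 784, init=uniform)
#   b = Parameter(200)
#   scale = Const(0.5)
#   stacked = RowStack(u, v, w)

# BrainScript version:
Affine (x, W, b) = [                        # "=" is required; brackets replace curly braces
    t = Times (W, x)
    z = Plus (t, b)
].z                                         # return value selected explicitly
W = Parameter (200, 784, init="uniform")    # option label quoted as a string
b = Parameter (200, 1)                      # vector written as a 1-column matrix
scale = Constant (0.5)                      # full name instead of the abbreviation Const()
stacked = RowStack (u:v:w)                  # inputs passed as a colon-separated array
```

Apart from the explicit .z selection and the Parameter and RowStack argument changes, these edits fall under the more restricted syntax that NDLNetworkBuilder still accepts, so they can be tested there first.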
Step 4. Remove NDLNetworkBuilder from "write" and "test" sections. Please review your "write" and "test" sections for NDLNetworkBuilder sections, and remove them. Some of our stock examples have extraneous NDLNetworkBuilder sections that should not be there. If your configuration is based on one of these examples, you may have such sections as well. They used to be ignored. But with the BrainScript update, defining a new network in these sections now has a meaning (model editing), so they are no longer ignored and therefore should be removed.
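As a sketch of this cleanup, assuming a section named test with an elided reader block (both the section layout and the file name are illustrative assumptions, not copied from a specific stock example):

```
# before: "test" section with a leftover network builder
test = [
    action = "test"
    NDLNetworkBuilder = [                   # extraneous; remove this whole block
        networkDescription = "yourNetwork.ndl"
    ]
    reader = [ ... ]                        # reader settings stay unchanged
]

# after:
test = [
    action = "test"
    reader = [ ... ]
]
```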
The syntax of the deprecated NDLNetworkBuilder is:
NDLNetworkBuilder = [
networkDescription = "yourNetwork.ndl"
]
The NDLNetworkBuilder block has the following parameters:
- networkDescription: the file path of the network description file. With the deprecated NDLNetworkBuilder, it was customary to use the file extension .ndl. If there is no networkDescription parameter specified then the network description is assumed to be inlined in the same NDLNetworkBuilder subblock, specified with the run parameter below. Note that only one file path may be specified via the networkDescription parameter. To load multiple files of macros, use the ndlMacros parameter.
- run: the block of the NDL that will be executed. If an external NDL file is specified via the networkDescription parameter, the run parameter identifies a block in that file. This parameter overrides any run parameters that may already exist in the file. If no networkDescription file is specified, the run parameter identifies a block in the current configuration file.
- load: the blocks of NDL scripts to load. Multiple blocks can be specified via a ":" separated list. The blocks specified by the load parameter typically contain macros for use by the run block. Similar to the run parameter, the load parameter identifies blocks in an external NDL file and overrides any load parameters that may already exist in the file, if a file is specified by the networkDescription parameter. If no networkDescription file is specified, load identifies a block in the current configuration file.
- ndlMacros: the file path where NDL macros may be loaded. This parameter is usually used to load a default set of NDL macros that can be used by all NDL scripts. Multiple NDL files, each specifying different sets of macros, can be loaded by specifying a "+" separated list of file paths for this ndlMacros parameter. In order to share macros with other command blocks such as NDL's model-editing language (MEL) blocks, you should define it at the root level of the configuration file.
- randomSeedOffset: a non-negative random seed offset value in initializing the learnable parameters. The default value is 0. This allows users to run experiments with different random initialization.
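For reference, a block that sets all of these parameters together might look like the sketch below. The file names extra.ndl and the block name moreMacros are hypothetical. The other names are reused from the examples earlier on this page:

```
NDLNetworkBuilder = [
    ndlMacros = "shared.ndl+extra.ndl"      # "+" separated list of macro files
    networkDescription = "yourNetwork.ndl"  # only one file path may be given here
    load = ndlMnistMacros:moreMacros        # ":" separated list of blocks in that file
    run = DNN                               # block in yourNetwork.ndl to execute
    randomSeedOffset = 1                    # non-negative; the default is 0
]
```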