Times and TransposeTimes

A * B
Times (A, B, outputRank=1)
TransposeTimes (A, B, outputRank=1)

The A * B operation has rich semantics for matrices and tensors.

If A and B are rank-2 or rank-1 tensors, A * B will compute the common matrix product.

To compute the matrix product A^T * B (with ^T denoting transposition), you could use Transpose (A) * B, but the special function TransposeTimes (A, B) is more efficient (but there is no corresponding efficient version of A * Transpose (B)).

If A and/or B are tensors of higher rank, the * operation denotes a generalized matrix product where all but the first dimension of A must match with the leading dimensions of B, and are interpreted by flattening. For example a product of a [I x J x K] and a [J x K x L] tensor (which we will abbreviate henceforth as [I x J x K] * [J x K x L]) gets reinterpreted by reshaping the two tensors as matrices as [I x (J * K)] * [(J * K) x L], for which the matrix product is defined and yields a result of dimension [I x L]. This makes sense if one considers the rows of a weight matrix to be patterns that activation vectors are matched against. The above generalization allows these patterns themselves to be multi-dimensional, such as images or running windows of speech features.

It is also possible to have more than one non-matched dimension in B. For example [I x J] * [J x K x L] is interpreted as this matrix product: [I x J] * [J x (K * L)] which thereby yields a result of dimensions [I x K x L]. For example, this allows to apply a matrix to all vectors inside a rolling window of L speech features of dimension J.

If the result of the product should have multiple dimensions (such as arranging a layer's activations as a 2D field), then instead of using the * operator, one must say Times (A, B, outputRank=m) where m is the number of dimensions in which the 'patterns' are arranged, and which are kept in the output. For example, Times (tensor of dim [I x J x K], tensor of dim [K x L], outputRank=2) will be interpreted as the matrix product [(I * J) x K] * [K x L] and yield a result of dimensions [I x J x L].

New Documentation Site

Iteration Plans

Times and TransposeTimes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!