Times and TransposeTimes
A * B
Times (A, B, outputRank=1)
TransposeTimes (A, B, outputRank=1)
The A * B operation has rich semantics for matrices and tensors.
If A and B are rank-2 or rank-1 tensors, A * B computes the ordinary matrix product.
To compute the matrix product A^T * B (with ^T denoting transposition), you could use Transpose (A) * B, but the special function TransposeTimes (A, B) is more efficient. There is no corresponding efficient version of A * Transpose (B).
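The equivalence between Transpose (A) * B and TransposeTimes (A, B) can be sketched in plain Python. This is only an illustration of the arithmetic, not of how CNTK implements either operation; the helper names matmul, transpose, and transpose_times are hypothetical and exist only for this example.

```python
def matmul(a, b):
    """Ordinary matrix product of two matrices stored as nested lists."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [[sum(row[k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))] for row in a]

def transpose(a):
    """Build an explicit transposed copy of a."""
    return [list(col) for col in zip(*a)]

def transpose_times(a, b):
    """Compute A^T * B by indexing a column-wise, without a transposed copy."""
    shared = len(a)  # the dimension that is summed over
    assert len(b) == shared, "A and B must have the same number of rows"
    return [[sum(a[k][i] * b[k][j] for k in range(shared))
             for j in range(len(b[0]))] for i in range(len(a[0]))]

A = [[1, 2, 3], [4, 5, 6]]   # [2 x 3]
B = [[7, 8], [9, 10]]        # [2 x 2]

via_transpose = matmul(transpose(A), B)   # materializes A^T first
direct = transpose_times(A, B)            # skips the intermediate copy
print(direct)
```

Both paths give the same [3 x 2] result; the direct form simply avoids creating the intermediate transposed matrix.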
If A and/or B are tensors of higher rank, the * operation denotes a generalized matrix product in which all but the first dimension of A must match the leading dimensions of B; the matched dimensions are interpreted by flattening. For example, a product of a [I x J x K] and a [J x K x L] tensor (abbreviated henceforth as [I x J x K] * [J x K x L]) is reinterpreted by reshaping the two tensors into matrices, [I x (J * K)] * [(J * K) x L], for which the matrix product is defined and yields a result of dimension [I x L]. This makes sense if one considers the rows of a weight matrix to be patterns that activation vectors are matched against. The above generalization allows these patterns themselves to be multi-dimensional, such as images or running windows of speech features.
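The flattening interpretation can be checked numerically with a small sketch in plain Python. The sizes and the fill values below are arbitrary assumptions chosen for illustration, and nested lists stand in for tensors; the point is only that contracting over both matched dimensions gives the same numbers as multiplying the flattened matrices.

```python
from itertools import product

I, J, K, L = 2, 2, 3, 2  # arbitrary example sizes

# A[i][j][k] has shape [I x J x K]; B[j][k][l] has shape [J x K x L].
A = [[[i + 2 * j + k for k in range(K)] for j in range(J)] for i in range(I)]
B = [[[j - k + l for l in range(L)] for k in range(K)] for j in range(J)]

# Direct contraction over both matched dimensions (j, k).
direct = [[sum(A[i][j][k] * B[j][k][l]
               for j, k in product(range(J), range(K)))
           for l in range(L)] for i in range(I)]

# Flattened view: every (j, k) pair becomes one index of size J * K.
# Any ordering of the pairs works, as long as A and B use the same one.
pairs = list(product(range(J), range(K)))
A_flat = [[A[i][j][k] for j, k in pairs] for i in range(I)]   # [I x (J*K)]
B_flat = [[B[j][k][l] for l in range(L)] for j, k in pairs]   # [(J*K) x L]

flattened = [[sum(A_flat[i][m] * B_flat[m][l] for m in range(J * K))
              for l in range(L)] for i in range(I)]           # [I x L]

print(flattened == direct)
```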
It is also possible to have more than one non-matched dimension in B. For example, [I x J] * [J x K x L] is interpreted as the matrix product [I x J] * [J x (K * L)], which yields a result of dimensions [I x K x L]. This makes it possible, for example, to apply a matrix to all vectors inside a rolling window of L speech features of dimension J.
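A small plain-Python sketch of the [I x J] * [J x K x L] case follows. The matrix W, the tensor X, and the helper apply_matrix are hypothetical names with made-up values; the sketch only shows that the generalized product amounts to applying the same matrix to every J-dimensional vector in B.

```python
I, J, K, L = 2, 3, 2, 2  # arbitrary example sizes

W = [[1, 0, 2],
     [0, 1, -1]]                                   # [I x J]
X = [[[j + 10 * k + 100 * l for l in range(L)]
      for k in range(K)] for j in range(J)]        # X[j][k][l], [J x K x L]

def apply_matrix(w, v):
    """Multiply matrix w by the vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in w]

# result[i][k][l] = sum over j of W[i][j] * X[j][k][l], shape [I x K x L]
result = [[[sum(W[i][j] * X[j][k][l] for j in range(J))
            for l in range(L)] for k in range(K)] for i in range(I)]

# Equivalent view: W is applied to each J-dimensional vector X[:, k, l].
for k in range(K):
    for l in range(L):
        vector = [X[j][k][l] for j in range(J)]
        assert apply_matrix(W, vector) == [result[i][k][l] for i in range(I)]
```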
If the result of the product should have multiple dimensions (such as arranging a layer's activations as a 2D field), then instead of the * operator one must write Times (A, B, outputRank=m), where m is the number of dimensions in which the 'patterns' are arranged and which are kept in the output. For example, Times (tensor of dim [I x J x K], tensor of dim [K x L], outputRank=2) is interpreted as the matrix product [(I * J) x K] * [K x L] and yields a result of dimensions [I x J x L].
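The shape rule described in this section can be summarized as a short plain-Python sketch. The function times_result_shape is a hypothetical helper written for this illustration, not part of CNTK; it assumes that the first outputRank dimensions of A are kept and that all remaining dimensions of A must equal the leading dimensions of B.

```python
from math import prod

def times_result_shape(a_shape, b_shape, output_rank=1):
    """Return (result shape, flattened matrix shapes) for the generalized product."""
    a_shape, b_shape = tuple(a_shape), tuple(b_shape)
    kept, matched = a_shape[:output_rank], a_shape[output_rank:]
    if b_shape[:len(matched)] != matched:
        raise ValueError(
            f"dimensions {matched} of A do not match the leading dimensions of B {b_shape}")
    free = b_shape[len(matched):]  # non-matched dimensions of B
    # Matrix view after flattening:
    # [prod(kept) x prod(matched)] * [prod(matched) x prod(free)]
    flat = ((prod(kept), prod(matched)), (prod(matched), prod(free)))
    return kept + free, flat

# Times ([I x J x K], [K x L], outputRank=2) with I, J, K, L = 2, 3, 4, 5
shape, flat = times_result_shape((2, 3, 4), (4, 5), output_rank=2)
print(shape, flat)
```

With the default output_rank=1, the same helper reproduces the earlier examples: [I x J x K] * [J x K x L] gives [I x L], and [I x J] * [J x K x L] gives [I x K x L].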