Question about the input to the 5-layer stacked LSTM in the DRCN model #8

Open

chenmozxh opened this issue Nov 22, 2019 · 0 comments

chenmozxh commented Nov 22, 2019

My understanding of Equation 6 in the paper is that the input to layer l at time step t is the concatenation of (1) the hidden vector of layer l-1 at time t, (2) the attention vector from layer l-1, and (3) the input to layer l-1 at time t; these three concatenated together form the input to layer l at time t.
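As a sanity check, here is a minimal LaTeX rendering of that reading of Equation 6 (the symbols h, a, x for hidden state, attention vector, and layer input are my own notation, borrowed from the paper as I remember it):

```latex
% My reading of Eq. 6 (notation assumed): the dense recurrent connection
% h_t^l : hidden state, a_t^l : co-attention vector, x_t^l : layer input
h_t^{l} = \mathrm{BiLSTM}\left(x_t^{l},\, h_{t-1}^{l}\right), \qquad
x_t^{l} = \left[\, h_t^{l-1};\; a_t^{l-1};\; x_t^{l-1} \,\right]
```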
But the code does the following:

```python
for j in range(5):
    # run the j-th BiLSTM over the previous layer's hidden states only
    with tf.variable_scope(f'p_lstm_{i}{j}', reuse=None):
        p_state, _ = self.BiLSTM(tf.concat(p_state, axis=-1))
    with tf.variable_scope(f'p_lstm_{i}_{j}' + str(i), reuse=None):
        h_state, _ = self.BiLSTM(tf.concat(h_state, axis=-1))

    p_state = tf.concat(p_state, axis=-1)
    h_state = tf.concat(h_state, axis=-1)

    # attention: softmax over the cosine similarity between p_state and h_state
    cosine = tf.divide(tf.matmul(p_state, tf.matrix_transpose(h_state)),
                       (tf.norm(p_state, axis=-1, keep_dims=True) * tf.norm(h_state, axis=-1, keep_dims=True)))
    att_matrix = tf.nn.softmax(cosine)
    p_attention = tf.matmul(att_matrix, h_state)
    h_attention = tf.matmul(att_matrix, p_state)

    # DenseNet-style concatenation: p / h accumulate everything,
    # but the next iteration feeds p_state / h_state, not p / h
    p = tf.concat((p, p_state, p_attention), axis=-1)
    h = tf.concat((h, h_state, h_attention), axis=-1)
```

So it seems the input to layer j should be p, not p_state.
I'm not sure whether my understanding is correct.
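Purely for illustration, here is a minimal self-contained sketch of the reading of Equation 6 I describe above, i.e. feeding the densely concatenated vector p / h (previous input + hidden states + attention) into the next BiLSTM layer instead of only p_state / h_state. This is not the repo's code: the tf.keras layers, the function name dense_coattentive_block, and the layer sizes are all my own placeholders.

```python
import tensorflow as tf

def dense_coattentive_block(p, h, num_layers=5, units=100):
    """One densely connected co-attentive BiLSTM stack (sketch, not the repo's code)."""
    for j in range(num_layers):
        bilstm_p = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(units, return_sequences=True))
        bilstm_h = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(units, return_sequences=True))

        # feed the dense vectors p / h themselves, not just the previous hidden states
        p_state = bilstm_p(p)
        h_state = bilstm_h(h)

        # co-attention: softmax over the cosine similarity between the two sequences
        p_norm = tf.math.l2_normalize(p_state, axis=-1)
        h_norm = tf.math.l2_normalize(h_state, axis=-1)
        att_matrix = tf.nn.softmax(tf.matmul(p_norm, h_norm, transpose_b=True), axis=-1)
        p_attention = tf.matmul(att_matrix, h_state)
        h_attention = tf.matmul(att_matrix, p_state, transpose_a=True)

        # dense connection: carry the previous layer's input forward as well
        p = tf.concat([p, p_state, p_attention], axis=-1)
        h = tf.concat([h, h_state, h_attention], axis=-1)
    return p, h

# usage: a batch of 2 sentence pairs, 30 tokens each, 100-dim embeddings
p_emb = tf.random.normal([2, 30, 100])
h_emb = tf.random.normal([2, 30, 100])
p_out, h_out = dense_coattentive_block(p_emb, h_emb)
```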

One more detail: is the output of each 5-layer stacked BiLSTM supposed to be concatenated with the original word/character embeddings before being passed to the next 5-layer stacked BiLSTM? Figure 1 of the paper draws it that way, but the text doesn't seem to mention this point.
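If that reading of Figure 1 is right, then (continuing the hypothetical sketch above, with p_emb / h_emb the original embeddings) the outer loop would look roughly like this:

```python
# Hypothetical outer loop: re-attach the original embeddings between stacked blocks,
# which is how I read Figure 1 of the paper.
p, h = p_emb, h_emb
for i in range(4):                        # four blocks of 5 BiLSTM layers each
    p, h = dense_coattentive_block(p, h)  # the 5-layer stack sketched above
    p = tf.concat([p_emb, p], axis=-1)    # concat the raw embeddings back in
    h = tf.concat([h_emb, h], axis=-1)
```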
There is also a pooling structure in the paper, after the four 5-layer BiLSTMs: if the output is (30, 100) (30 tokens, each a 100-dimensional vector), column-wise max-pooling is applied to get 100-dimensional p and q vectors, which are then concatenated as in Equation 7 and fed through 3 dense layers.
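For concreteness, a minimal sketch of how I understand that step (the exact interaction features in Equation 7 are from my memory of the paper, and the dense-layer sizes and two-class output are placeholders):

```python
# p_out, h_out: [batch, 30, d] outputs of the last 5-layer block (see sketch above)
p_vec = tf.reduce_max(p_out, axis=1)   # column-wise max-pooling over the 30 tokens -> [batch, d]
q_vec = tf.reduce_max(h_out, axis=1)   # [batch, d]

# Eq. 7 as I recall it: v = [p; q; p + q; p - q; |p - q|]
v = tf.concat([p_vec, q_vec, p_vec + q_vec, p_vec - q_vec, tf.abs(p_vec - q_vec)],
              axis=-1)

# 3 dense layers (hidden sizes and class count are assumptions)
x = tf.keras.layers.Dense(256, activation='relu')(v)
x = tf.keras.layers.Dense(256, activation='relu')(x)
logits = tf.keras.layers.Dense(2)(x)
```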
