My understanding of Equation 6 in the paper is that the input to layer l at time t is the concatenation of (1) the hidden vector of layer l-1 at time t, (2) the attention vector of layer l-1, and (3) the input to layer l-1 at time t.
But the code is as follows:
```python
for j in range(5):
    with tf.variable_scope(f'p_lstm_{i}{j}', reuse=None):
        p_state, _ = self.BiLSTM(tf.concat(p_state, axis=-1))
    with tf.variable_scope(f'p_lstm{i}_{j}' + str(i), reuse=None):
        h_state, _ = self.BiLSTM(tf.concat(h_state, axis=-1))
```
So the input to layer j should be p, not just p_state.
Am I understanding this correctly?
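If that reading of Equation 6 is right, the next BiLSTM layer would receive the DenseNet-style concatenation rather than the hidden states alone. A minimal NumPy sketch of the intended shapes (the dimensions are made up for illustration; `p`, `p_state`, `p_attention` mirror the variable names in the quoted code):

```python
import numpy as np

# Hypothetical shapes: 30 tokens; 100-dim inputs; BiLSTM states are
# 2 * 100 = 200-dim (forward + backward halves concatenated).
seq_len, dim = 30, 100
p = np.random.rand(seq_len, dim)            # layer l-1 input
p_state = np.random.rand(seq_len, 2 * dim)  # layer l-1 BiLSTM hidden states
p_attention = np.random.rand(seq_len, 2 * dim)  # layer l-1 attention vectors

# Per the reading of Eq. 6 above, the input to layer l concatenates all
# three along the feature axis -- i.e. it carries `p`, not only `p_state`.
next_input = np.concatenate((p, p_state, p_attention), axis=-1)
print(next_input.shape)  # (30, 500)
```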
One more detail: is the output of each 5-layer stacked BiLSTM supposed to be concatenated with the original word embeddings before being fed into the next 5-layer stacked BiLSTM? Figure 1 of the paper draws it that way, but the text doesn't seem to mention it.
The paper also has a pooling step after the four 5-layer BiLSTMs: if the output is (30, 100) (30 words, each a 100-dim vector), it is max-pooled column-wise into 100-dim p and q vectors, which are then concatenated per Equation 7 and passed through 3 dense layers.
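The pooling and matching step described above can be sketched in NumPy. The column-wise max-pooling follows directly from the description; the exact combination of features in Equation 7 should be checked against the paper, so the interaction terms here (sum and absolute difference) are an assumption for illustration:

```python
import numpy as np

# Hypothetical (30, 100) BiLSTM outputs: 30 tokens, 100-dim each.
P = np.random.rand(30, 100)
Q = np.random.rand(30, 100)

# Column-wise max-pooling: max over the time axis -> 100-dim sentence vectors.
p = P.max(axis=0)
q = Q.max(axis=0)

# Eq. 7-style concatenation before the 3 dense layers; the specific
# interaction features included here are assumed, not taken verbatim
# from the paper.
v = np.concatenate((p, q, p + q, np.abs(p - q)), axis=-1)
print(v.shape)  # (400,)
```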