Multi-task settings #48
Comments
Hi. You can of course use it for any sequence tagging task. The POS and chunking code is just an example, which you can modify for your own use case.
I am really confused about the architecture of the multi-task framework, as there is no diagram in the original paper. Could you please explain which layers are being shared? In the example you have given, is POS at the lower level because it appears first in the constructor? Each task has its own task-specific Bi-LSTM-CRF network, so which of the layers are shared? Can you show this graphically?
In the Train_MultiTask.py example, the POS and chunking networks both share the embedding layer and one LSTM layer. If you change params like this:
Both networks would share two stacked LSTM layers, the first with 100 recurrent units, the second with 50. In that file, only the CRF is task-specific. If you run the code, the model architecture is also printed. Shared layers have the name shared_..., while task-specific layers have the names POS_... and chunking_.... The order in the datasets dict doesn't matter.
In Train_MultiTask_Different_Levels.py, the POS layer uses the output from the first LSTM layer, while chunking has a task-specific LSTM layer:
Both networks have one shared LSTM (LSTM-Size). POS then uses 'Softmax' directly on top of that shared LSTM layer. Chunking, in contrast, uses a task-specific LSTM with 50 units followed by a CRF.
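As a rough illustration of the two layouts described above, here is a minimal sketch in plain Python. It does not use the repository's code: describe_layers, the exact layer names, and the shared size of 100 in the second call are assumptions made for illustration only. It simply lists, per task, which layers would be shared and which would be task-specific.

```python
# Hypothetical helper, not part of the repository. It builds no model;
# it only lists layer names so the sharing pattern is easy to see.
def describe_layers(shared_lstm_sizes, task_classifiers):
    """Return {task: [layer names from bottom to top]}."""
    # Every task starts with the same embedding layer and shared LSTM stack.
    shared = ["shared_embeddings"] + [
        f"shared_LSTM_{i + 1}_{size}"
        for i, size in enumerate(shared_lstm_sizes)
    ]
    layout = {}
    for task, spec in task_classifiers.items():
        specific = []
        for item in spec:
            if isinstance(item, tuple):
                # A task-specific recurrent layer, e.g. ("LSTM", 50).
                kind, size = item
                specific.append(f"{task}_{kind}_{size}")
            else:
                # An output layer, e.g. "CRF" or "Softmax".
                specific.append(f"{task}_{item}")
        layout[task] = shared + specific
    return layout


# Same level: two shared stacked LSTMs (100 and 50 units), only the CRF differs.
same_level = describe_layers(
    [100, 50],
    {"POS": ["CRF"], "chunking": ["CRF"]},
)

# Different levels: one shared LSTM (size assumed to be 100 here), POS uses
# Softmax on top of it, chunking adds its own 50-unit LSTM and then a CRF.
different_levels = describe_layers(
    [100],
    {"POS": ["Softmax"], "chunking": [("LSTM", 50), "CRF"]},
)

for name, layout in (("same level", same_level),
                     ("different levels", different_levels)):
    print(name)
    for task, layers in layout.items():
        print(f"  {task}: " + " -> ".join(layers))
```

Running it prints one chain of layer names per task, so you can see at a glance where the POS and chunking paths diverge in each setup.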
OK, thanks for the detailed answer. Could you also explain the difference between "Train_Multitask" and "Train_Multitask_different_levels"?
Have a look at this paper: Train_Multitask_different_levels implements the ideas from that paper.
Thanks for the link.
Hi @nreimers
For the multi-task framework, does it always have to be POS and chunking, or can it be any sequence labelling task?