Chinese NER Using Lattice LSTM

Lattice LSTM for Chinese NER: a character-based LSTM with lattice embeddings as input.

Models and results can be found in our ACL 2018 paper, Chinese NER Using Lattice LSTM. It achieves a 93.18% F1 score on the MSRA dataset, which is the state-of-the-art result on the Chinese NER task.

Details will be updated soon.

Requirement:

Python: 3.6.5 
PyTorch: 0.4.1

(For PyTorch 0.3.1 (and 0.4.1), please refer to issue #8 for a slight modification.)

Input format:

CoNLL format (BIOES tag scheme preferred), with one character and its label per line. Sentences are separated by a blank line.

美	B-LOC
国	E-LOC
的	O
华	B-PER
莱	I-PER
士	E-PER

我	O
跟	O
他	O
谈	O
笑	O
风	O
生	O
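
The input format above can be read with a few lines of Python. The following is a minimal sketch, not the loader used in this repository: the function names `read_conll` and `bioes_spans` are hypothetical, and it assumes each non-blank line holds a character and its label separated by whitespace.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # [(character, label), ...]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group (character, label) pairs into sentences split on blank lines."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        parts = raw.split()
        if not parts:  # a blank line closes the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if len(parts) < 2:
            raise ValueError(f"expected 'char label', got: {raw!r}")
        current.append((parts[0], parts[-1]))
    if current:  # the file may not end with a blank line
        sentences.append(current)
    return sentences


def bioes_spans(sentence: Sentence) -> List[Tuple[str, str]]:
    """Collect (entity text, entity type) pairs from BIOES labels."""
    spans: List[Tuple[str, str]] = []
    buffer: List[str] = []
    current_type = None
    for char, label in sentence:
        prefix, _, ent_type = label.partition("-")
        if prefix == "S":  # single-character entity
            spans.append((char, ent_type))
            buffer, current_type = [], None
        elif prefix == "B":  # start a new multi-character entity
            buffer, current_type = [char], ent_type
        elif prefix in ("I", "E") and buffer and ent_type == current_type:
            buffer.append(char)
            if prefix == "E":  # entity is complete
                spans.append(("".join(buffer), ent_type))
                buffer, current_type = [], None
        else:  # "O" or an inconsistent tag: drop any open span
            buffer, current_type = [], None
    return spans


SAMPLE = "美\tB-LOC\n国\tE-LOC\n的\tO\n华\tB-PER\n莱\tI-PER\n士\tE-PER\n\n我\tO\n跟\tO\n"

if __name__ == "__main__":
    for sent in read_conll(SAMPLE.splitlines()):
        print(bioes_spans(sent))
```

Applied to the first example sentence, `bioes_spans` yields the two entities 美国 (LOC) and 华莱士 (PER); the second sentence contains only `O` labels and yields none.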

Pretrained Embeddings:

The pretrained character and word embeddings are the same as the embeddings used in the baseline of RichWordSegmentor.

Character embeddings: gigaword_chn.all.a2b.uni.ite50.vec

Word (lattice) embeddings: ctb.50d.vec
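
To inspect an embedding file before training, a small reader such as the one below may help. It is a sketch under an assumption this README does not state: that each line of a `.vec` file is a token followed by whitespace-separated floats (the common word2vec text layout). The function names `parse_embeddings` and `load_embeddings` are hypothetical and are not part of this repository.

```python
from typing import Dict, Iterable, List, Tuple


def parse_embeddings(lines: Iterable[str]) -> Tuple[Dict[str, List[float]], int]:
    """Parse 'token v1 v2 ... vD' lines into a dict; return (vectors, dim).

    Assumed layout only. Lines with fewer than three fields (blank lines or
    a 'count dim' header) and lines whose width differs from the first
    vector line are skipped.
    """
    vectors: Dict[str, List[float]] = {}
    dim = 0
    for raw in lines:
        parts = raw.split()
        if len(parts) <= 2:
            continue
        token, values = parts[0], parts[1:]
        if dim == 0:
            dim = len(values)  # the first vector line fixes the dimension
        if len(values) != dim:
            continue
        vectors[token] = [float(v) for v in values]
    return vectors, dim


def load_embeddings(path: str) -> Tuple[Dict[str, List[float]], int]:
    """Read an embedding file from disk (UTF-8 assumed)."""
    with open(path, encoding="utf-8") as handle:
        return parse_embeddings(handle)
```

For example, `load_embeddings("data/ctb.50d.vec")` would return the token-to-vector mapping together with the vector dimension, provided the file follows the assumed layout.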

How to run the code?

1. Download the character embeddings and word embeddings and put them in the data folder.
2. Modify run_main.sh or run_demo.sh by adding the paths to your train/dev/test files.
3. Run sh run_main.sh or sh run_demo.sh.
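
Before launching a run, it can be worth confirming that both embedding files are in place. The check below is a minimal sketch: the file names come from the Pretrained Embeddings section, while the helper name `missing_embeddings` and the default `data` path relative to the working directory are assumptions.

```python
from pathlib import Path
from typing import List

# File names as listed under "Pretrained Embeddings".
EMBEDDING_FILES = ("gigaword_chn.all.a2b.uni.ite50.vec", "ctb.50d.vec")


def missing_embeddings(data_dir: str = "data") -> List[str]:
    """Return the expected embedding files that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EMBEDDING_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_embeddings()
    if missing:
        print("Missing embedding files:", ", ".join(missing))
    else:
        print("All embedding files found.")
```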

Resume NER data

Crawled from Sina Finance, this dataset contains the resumes of senior executives from companies listed on the Chinese stock market. Details can be found in our paper.

Cite:

Please cite our ACL 2018 paper:

@inproceedings{zhang2018chinese,
 title={Chinese NER Using Lattice LSTM},
 author={Yue Zhang and Jie Yang},
 booktitle={Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL)},
 year={2018}
}
