Commit 4695c6f

Adding code files to documentation

Author: Evelin Amorim
1 parent ebf6a44

10 files changed, 86 additions and 68 deletions
@@ -0,0 +1,3 @@
+import text2story as t2s
+
+t2s.add_annotator("custom_annotator", ['fr'], ['participant', 'time'])
@@ -0,0 +1,7 @@
+import text2story as t2s
+
+text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
+my_narrative = t2s.Narrative('pt', text, '2024')
+my_narrative.extract_events('srl')
+
+print(my_narrative.events)
@@ -0,0 +1,7 @@
+import text2story as t2s
+
+text = 'The king died in battle. The queen married his brother.'
+my_narrative = t2s.Narrative('en', text, '2024')
+my_narrative.extract_participants('nltk')
+
+print(my_narrative.participants)
@@ -0,0 +1,7 @@
+import text2story as t2s
+
+text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
+my_narrative = t2s.Narrative('pt', text, '2024')
+my_narrative.extract_participants('srl', 'spacy')
+
+print(my_narrative.participants)
@@ -0,0 +1,10 @@
+import text2story as t2s
+
+text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
+my_narrative = t2s.Narrative('pt', text, '2024')
+
+my_narrative.extract_events('srl')
+my_narrative.extract_participants('spacy', 'srl')
+my_narrative.extract_times('py_heideltime')
+
+my_narrative.extract_semantic_role_links('srl')

docs/source/examples/extract_time.py (+6)

@@ -0,0 +1,6 @@
+import text2story as t2s
+text = 'The traveling salesman went town to town. However, he did not sell one book.'
+my_narrative = t2s.Narrative('en', text, '2024')
+my_narrative.extract_times('py_heideltime')
+
+print(my_narrative.times)

docs/source/examples/load_models.py (+3)

@@ -0,0 +1,3 @@
+import text2story as t2s
+
+t2s.start('en')
@@ -0,0 +1,11 @@
+from text2story.readers.read_brat import ReadBrat
+
+reader = ReadBrat()
+
+# in the BRAT file format, you have a txt file and a corresponding ann file,
+# both have the same name. For instance, in this example, data/doc1.txt is the
+# raw text file, and data/doc1.ann contains the annotations. But we only provide the
+# name without to specify the extension.
+doc = reader.process_file("data/doc1")
+for tok in doc:
+    print(tok.text)
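
The comment in the file above describes the BRAT naming convention: a raw text file and an annotation file share one base name, and only that base name is passed to the reader. As a minimal sketch of that convention, the standard-library helpers below derive both paths from a base name and check that the pair exists before a reader is called. The names `brat_pair` and `has_brat_pair` are hypothetical and are not part of text2story.

```python
from pathlib import Path


def brat_pair(base_name):
    """Return the (.txt, .ann) paths that share an extension-less base name.

    Hypothetical helper for illustration; not part of text2story.
    """
    base = Path(base_name)
    # Append the suffixes rather than calling with_suffix(), so a base name
    # that already contains a dot (e.g. "doc.v1") is not truncated.
    return Path(f"{base}.txt"), Path(f"{base}.ann")


def has_brat_pair(base_name):
    """Return True only when both files of the pair exist on disk."""
    txt_path, ann_path = brat_pair(base_name)
    return txt_path.is_file() and ann_path.is_file()
```

A caller could run `has_brat_pair("data/doc1")` before handing `"data/doc1"` to the reader, so that a missing `.ann` file is reported early.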
@@ -0,0 +1,9 @@
+import text2story as t2s
+
+t2s.start("fr")
+text_ = "Que retenir de la visite d'État d'Emmanuel Macron en Allemagne?"
+my_narrative = t2s.Narrative('fr', text_, "2024")
+
+my_narrative.extract_participants("custom_annotator")
+
+print(my_narrative.participants)

docs/source/usage.rst (+23 -68)

@@ -42,24 +42,14 @@ annotations, if they exists. This is also a class defined in readers module.
 
 Next, a code example to read a directory with annotations in BRAT format.
 
-.. literalinclude:: examples/read_brat_dir
+.. literalinclude:: examples/read_brat_dir.py
     :language: python
 
 
 The next code illustrate how to use `ReadBrat` to read only one file.
 
-.. code-block:: python
-    from text2story.readers.read_brat import ReadBrat
-    reader = ReadBrat()
-
-    # in the BRAT file format, you have a txt file and a corresponding ann file,
-    # both have the same name. For instance, in this example, data/doc1.txt is the
-    # raw text file, and data/doc1.ann contains the annotations. But we only provide the
-    # name without to specify the extension.
-    doc = reader.process_file("data/doc1")
-    for tok in doc:
-        print(tok.text)
-
+.. literalinclude:: examples/read_brat_file.py
+    :language: python
 
 
 The Annotators Module
The Annotators Module
@@ -71,9 +61,9 @@ are all naturally integrated in our pipeline. The second type is composed by ann
 anyone can built and integrate in our pipeline. For both, it is required to load the models for the
 language of the used examples. The code bellow is used to load the models for the English language.
 
-.. code-block:: python
-    import text2story as t2s
-    t2s.start('en')
+.. literalinclude:: examples/load_models.py
+    :language: python
+
 
 .. note::
 
@@ -98,27 +88,18 @@ participants using the NLTK module. Others modules that employs NER to identify
 (en_core_web_lg/'en', pt_core_news_lg/'pt') and BERTNERPT (https://huggingface.co/arubenruben/NER-PT-BERT-CRF-Conll2003).
 Bellow, an example of using only NLTK to extract participants from a narrative.
 
-.. code-block:: python
-    import text2story as t2s
-    text = 'The king died in battle. The queen married his brother.'
-    my_narrative = t2s.Narrative('en', text, '2024')
-    my_narrative.extract_participants('nltk')
-
-    print(my_narrative.participants)
+.. literalinclude:: examples/extract_participants_en.py
+    :language: python
 
 The ALLENNLP ('en') and SRL('pt') modules employ Semantic Role Labeling modules to identify participants and the
 code for them is the same as above, only changing the name of the module.
 
 It is also possible to use pipeline models to obtain better or different results. The code below extracts
 participants from a narrative text in Portuguese using SPACY and SRL modules.
 
-.. code-block:: python
-    import text2story as t2s
-    text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
-    my_narrative = t2s.Narrative('pt', text, '2024')
-    my_narrative.extract_participants('srl','spacy')
+.. literalinclude:: examples/extract_participants_pt.py
+    :language: python
 
-    print(my_narrative.participants)
 
 
 Time
@@ -128,13 +109,8 @@ For time expression, text2story has py_heideltime and tei2go to identify time ex
 English languages. The code is similar to the extraction of participants. See the example bellow:
 
 
-.. code-block:: python
-    import text2story as t2s
-    text = 'The traveling salesman went town to town. However, he did not sell one book.'
-    my_narrative = t2s.Narrative('en', text, '2024')
-    my_narrative.extract_times('py_heideltime')
-
-    print(my_narrative.times)
+.. literalinclude:: examples/extract_time.py
+    :language: python
 
 
 
@@ -144,30 +120,17 @@ Events
 There are only two modules devoted to the extraction of events, ALLENNLP ('en') and SRL ('pt'). The extraction of
 events is done in the same way as the extraction of time and participants. See the code below.
 
-.. code-block:: python
-    import text2story as t2s
-    text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
-    my_narrative = t2s.Narrative('pt', text, '2024')
-    my_narrative.extract_events('srl')
-
-    print(my_narrative.events)
+.. literalinclude:: examples/extract_events.py
+    :language: python
 
 Semantic Links
 ''''''''
 
 Semantic links can only be extracted after the extraction of events, participants, and time. So, the code below
 updates the example code from the extraction of events.
 
-.. code-block:: python
-    import text2story as t2s
-    text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
-    my_narrative = t2s.Narrative('pt', text, '2024')
-
-    my_narrative.extract_events('srl')
-    my_narrative.extract_participants('spacy','srl')
-    my_narrative.extract_times('py_heideltime')
-
-    my_narrative.extract_semantic_role_links('srl')
+.. literalinclude:: examples/extract_semantic_links.py
+    :language: python
 
 
 Custom Annotators
@@ -178,29 +141,21 @@ load function. The main goal of this method is to load the models used in its pi
 following implementation of a custom annotator that uses tei2go French model to extract time expressions, and
 the spacy NER French model to extract participants.
 
-.. literalinclude:: custom_annotator.py
+.. literalinclude:: examples/custom_annotator.py
     :language: python
 
 To use your new annotator, first, you need to add it to the text2story pipeline using the following code.
 
-.. code-block:: python
-    import text2story as t2s
-
-    t2s.add_annotator("custom_annotator", ['fr'], ['participant', 'time'])
+.. literalinclude:: examples/add_custom_annotator.py
+    :language: python
 
 Then, you can use the annotator like the native ones. See the code below.
 
-.. code-block:: python
-    import text2story as t2s
-
-    t2s.start("fr")
-    text_ = "Que retenir de la visite d'État d'Emmanuel Macron en Allemagne?"
-    my_narrative = t2s.Narrative('fr',text_,"2024")
-
-    my_narrative.extract_participants("custom_annotator")
-
-    print(my_narrative.participants)
+.. literalinclude:: examples/test_custom_annotator.py
+    :language: python
 
 .. The Visualization Module
 .. -----
 
+
+
