@@ -42,24 +42,14 @@ annotations, if they exists. This is also a class defined in readers module.

The following code example reads a directory of annotations in BRAT format.

- .. literalinclude:: examples/read_brat_dir
+ .. literalinclude:: examples/read_brat_dir.py
    :language: python


The next example illustrates how to use `ReadBrat` to read a single file.

- .. code-block:: python
-     from text2story.readers.read_brat import ReadBrat
-     reader = ReadBrat()
-
-     # In the BRAT file format, you have a txt file and a corresponding ann file,
-     # both with the same name. For instance, in this example, data/doc1.txt is the
-     # raw text file, and data/doc1.ann contains the annotations. We provide only the
-     # name, without specifying the extension.
-     doc = reader.process_file("data/doc1")
-     for tok in doc:
-         print(tok.text)
-
+ .. literalinclude:: examples/read_brat_file.py
+     :language: python


The Annotators Module
@@ -71,9 +61,9 @@ are all naturally integrated in our pipeline. The second type is composed by ann
anyone can build and integrate into our pipeline. For both, it is required to load the models for the
language of the examples used. The code below loads the models for the English language.

- .. code-block:: python
-     import text2story as t2s
-     t2s.start('en')
+ .. literalinclude:: examples/load_models.py
+     :language: python
+

.. note::

@@ -98,27 +88,18 @@ participants using the NLTK module. Others modules that employs NER to identify
(en_core_web_lg/'en', pt_core_news_lg/'pt') and BERTNERPT (https://huggingface.co/arubenruben/NER-PT-BERT-CRF-Conll2003).
Below is an example that uses only NLTK to extract participants from a narrative.

- .. code-block:: python
-     import text2story as t2s
-     text = 'The king died in battle. The queen married his brother.'
-     my_narrative = t2s.Narrative('en', text, '2024')
-     my_narrative.extract_participants('nltk')
-
-     print(my_narrative.participants)
+ .. literalinclude:: examples/extract_participants_en.py
+     :language: python

The ALLENNLP ('en') and SRL ('pt') modules employ Semantic Role Labeling to identify participants, and the
code for them is the same as above, changing only the name of the module.

It is also possible to use pipeline models to obtain better or different results. The code below extracts
participants from a narrative text in Portuguese using the SPACY and SRL modules.

- .. code-block:: python
-     import text2story as t2s
-     text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
-     my_narrative = t2s.Narrative('pt', text, '2024')
-     my_narrative.extract_participants('srl', 'spacy')
+ .. literalinclude:: examples/extract_participants_pt.py
+     :language: python

-     print(my_narrative.participants)


Time
@@ -128,13 +109,8 @@ For time expression, text2story has py_heideltime and tei2go to identify time ex
English languages. The code is similar to the extraction of participants. See the example below:


- .. code-block:: python
-     import text2story as t2s
-     text = 'The traveling salesman went town to town. However, he did not sell one book.'
-     my_narrative = t2s.Narrative('en', text, '2024')
-     my_narrative.extract_times('py_heideltime')
-
-     print(my_narrative.times)
+ .. literalinclude:: examples/extract_time.py
+     :language: python



@@ -144,30 +120,17 @@ Events
There are only two modules devoted to the extraction of events, ALLENNLP ('en') and SRL ('pt'). The extraction of
events is done in the same way as the extraction of time and participants. See the code below.

- .. code-block:: python
-     import text2story as t2s
-     text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
-     my_narrative = t2s.Narrative('pt', text, '2024')
-     my_narrative.extract_events('srl')
-
-     print(my_narrative.events)
+ .. literalinclude:: examples/extract_events.py
+     :language: python

Semantic Links
''''''''''''''

Semantic links can only be extracted after the extraction of events, participants, and time. So, the code below
updates the example code from the extraction of events.

- .. code-block:: python
-     import text2story as t2s
-     text = 'O rei morreu na batalha. A rainha casou com seu irmão.'
-     my_narrative = t2s.Narrative('pt', text, '2024')
-
-     my_narrative.extract_events('srl')
-     my_narrative.extract_participants('spacy', 'srl')
-     my_narrative.extract_times('py_heideltime')
-
-     my_narrative.extract_semantic_role_links('srl')
+ .. literalinclude:: examples/extract_semantic_links.py
+     :language: python


Custom Annotators
@@ -178,29 +141,21 @@ load function. The main goal of this method is to load the models used in its pi
following implementation of a custom annotator that uses the tei2go French model to extract time expressions, and
the spaCy NER French model to extract participants.

- .. literalinclude:: custom_annotator.py
+ .. literalinclude:: examples/custom_annotator.py
    :language: python

To use your new annotator, you first need to add it to the text2story pipeline using the following code.

- .. code-block:: python
-     import text2story as t2s
-
-     t2s.add_annotator("custom_annotator", ['fr'], ['participant', 'time'])
+ .. literalinclude:: examples/add_custom_annotator.py
+     :language: python

Then, you can use the annotator like the native ones. See the code below.

- .. code-block:: python
-     import text2story as t2s
-
-     t2s.start("fr")
-     text_ = "Que retenir de la visite d'État d'Emmanuel Macron en Allemagne?"
-     my_narrative = t2s.Narrative('fr', text_, "2024")
-
-     my_narrative.extract_participants("custom_annotator")
-
-     print(my_narrative.participants)
+ .. literalinclude:: examples/test_custom_annotator.py
+     :language: python

.. The Visualization Module
.. -----

+
+