<html>
<head><title>ELTeC summary</title>
<link rel="stylesheet" type="text/css" href="../css/summary.css"/></head>
<body>
<a href="https://www.distant-reading.net/"><img src="../media/distantreading.png" alt="logo"/></a>
<h3>ELTeC Summary Page</h3>
<p class="prefix">As well as the following summary statistics, this
page provides links to human-readable versions of each text currently
included in the European Literary Text Collection
(<a href="https://www.distant-reading.net/eltec/">ELTeC</a>). Click on
a language code in the table below to see a list of texts now
available in that language. Then click on the identifier of a text to
see a simple rendering of the text as produced
by <a href="https://github.com/TEIC/CETEIcean">CETEIcean</a>. The
original source files are stored in a GitHub repository at
<a href="https://github.com/COST-ELTeC">COST-ELTeC</a>, and may be downloaded
freely from there.
</p>
<p>The following tables list three different flavours of ELTeC corpus. All ELTeC corpora are encoded in TEI-XML according to one of the
<a href="https://distantreading.github.io/Schema/eltec-1.html">ELTeC schemas</a>. The ELTeC <b>core</b> corpora are, as far as possible, comparable in size and composition. Each contains a balanced selection of 100 texts respecting all the criteria defined for the ELTeC project. The ELTeC <b>plus</b> corpora contain smaller collections of texts which cover the same period of time as the core corpora, but which do not meet the balance criteria defined for the project: in some cases, the criteria simply could not be satisfied because the required mixture of texts did not exist; in other cases, future iterations of the collection may contain additional texts. The ELTeC <b>extended</b> corpora are ELTeC-conformant in their encoding, but selected according to different design criteria, either to provide additional texts for the same time period, or to provide coverage of a different time period.</p>
<p>The E5C column gives the <a href="https://github.com/distantreading/WG1/wiki/E5C-discussion-paper">conformance score</a> calculated for each repository; the score is displayed in green if conformance is high. The other columns give counts for each of the four balance criteria, with numbers in red indicating
that the criterion is unsatisfied. Hovering over the last figure in each column displays the E5C score calculated for that criterion.
</p>
<p>This remains a work in progress! Comments and reports of any
problems are much appreciated: send them to the <a href="https://github.com/distantreading/WG1/issues">WG1 Issue Tracker</a>.
</p>