Finalize data model for encoding related subjects #101

Open
wlpotter opened this issue Jun 18, 2021 · 11 comments

@wlpotter
Collaborator

@davidamichelson I am using 133 as my example as it had examples of all the categories of related subjects: https://caesarea-maritima.org/testimonia/133

I have elected to use a tei:listRef for this draft/demo. I don't think this use fits perfectly with the TEI's intention for this element, but I couldn't find anything better. tei:textClass could be ideal for the themes, but for the others (geography, prosopography, and related-texts) it may be too unwieldy to define and maintain a taxonomy.

Here is the XML snippet I created (/TEI/text/body/listRef, following the tei:bibl elements):

      <listRef type="related-texts">
         <ref target="#">John of Nikiu 66.2</ref>
      </listRef>
      <listRef type="geography">
         <ref target="https://pleiades.stoa.org/places/658381">Antioch on the Orontes</ref>
         <ref target="https://pleiades.stoa.org/places/1001940">Palestine</ref>
         <ref target="https://pleiades.stoa.org/places/629035">Caesarea Mazaca</ref>
         <ref target="https://pleiades.stoa.org/places/628949">Cappadocia</ref>
      </listRef>
      <listRef type="prosopography">
         <ref target="#">Augustus, emperor</ref>
         <ref target="#">Herod I, king</ref>
         <ref target="#">Archelaos</ref>
      </listRef>
      <listRef type="themes">
         <ref target="#">Hellenistic History</ref>
         <ref target="#">Roman History</ref>
         <ref target="#">Geography</ref>
         <ref target="#">Government and Law</ref>
      </listRef>

A few notes

  • I used Pleiades to fill in the targets in the geography section to use as a test, though per Joe's email we will have to figure out what to use for prosopography (is VIAF only for authors?) and for related-texts. For themes we will likely need to define our own taxonomy.
  • The # in the @target attribute is merely a placeholder for validation purposes; we can leave those attributes empty if we want.
  • From a data transformation standpoint it should be pretty easy to convert the notes to listRefs
  • For prosopography we talked about having our own place name versus getting it from Pleiades. I can't remember how we left that, but it might get a bit complicated (though tei:ref elements are relatively permissive with what is allowed in them: https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-ref.html)
  • We will also need to deal with distinguishing between related-texts references to other testimonia and those not in our db.
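
A minimal sketch of the note-to-listRef conversion mentioned above, in Python. It assumes each category is currently a typed tei:note whose tei:p children each hold one term; that input shape, and the function name `notes_to_list_refs`, are assumptions for illustration and not the project's actual data or code.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def notes_to_list_refs(body):
    """Replace each typed <note> child of <body> with a <listRef> of the same type.

    Assumption: every <p> inside the note holds exactly one subject term.
    """
    for index, child in enumerate(list(body)):
        if child.tag != f"{{{TEI_NS}}}note" or "type" not in child.attrib:
            continue
        list_ref = ET.Element(f"{{{TEI_NS}}}listRef", {"type": child.get("type")})
        for p in child.findall(f"{{{TEI_NS}}}p"):
            term = (p.text or "").strip()
            if term:
                # "#" is the same validation placeholder used in the draft above.
                ref = ET.SubElement(list_ref, f"{{{TEI_NS}}}ref", {"target": "#"})
                ref.text = term
        list_ref.tail = child.tail
        body[index] = list_ref  # swap in place, so sibling order is preserved
    return body


# Hypothetical input, shaped like the assumption described above.
SAMPLE = f"""<body xmlns="{TEI_NS}">
  <note type="geography"><p>Palestine</p><p> Cappadocia </p></note>
  <note type="themes"><p>Roman History</p></note>
</body>"""

body = notes_to_list_refs(ET.fromstring(SAMPLE))
print(ET.tostring(body, encoding="unicode"))
```

Notes without a type attribute are left untouched, so any unrelated tei:note elements in the body would survive the conversion.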
@wlpotter
Collaborator Author

wlpotter commented Aug 9, 2021

Here's a note from Joe about the prosopography links:

Quick question to ponder: when/if we install a "related subjects" content area in each record that leads users toward the various people mentioned in the testimonium ("prosopography"), I don't know of an online source for which we can insert URIs that parallels Pleiades for the places ("geography"). There might be something out there I don't know, but I don't think there is a PIR or PLRE on-line; I have seen the PBE out of King's College which is very nice. I hesitate to make links to Wikipedia or Encyclop Brit. Something to ponder?

@wlpotter
Collaborator Author

wlpotter commented Aug 9, 2021

I will make a separate issue (#108) for cleaning up these tei:note elements before we decide how to finalize their encoding.

@wlpotter wlpotter changed the title from "Encoding related subjects" to "Finalize data model for encoding related subjects" Feb 3, 2022
@wlpotter
Collaborator Author

wlpotter commented Feb 7, 2022

After #103 is fixed, I can dump a list of the in-use related-subjects terms for the editors to look at, revise as needed, and turn into a controlled vocabulary. This can be used in other records that have not had their subjects encoded; and it can be used to create taxonomy URIs for browsing, etc. using, perhaps, tei:relation elements.

By the way, we could also cross-walk with our taxonomy keyword URIs if we wanted to.

@davidamichelson
Collaborator

Data has been updated by #108

@wlpotter
Collaborator Author

wlpotter commented Sep 15, 2022

To implement this:

  • Make a spreadsheet of existing related subjects. Have editors/students review to reconcile duplicates and then update data with controlled and normalized terms
  • Then match controlled terms with:
    • related-texts: use our own testimonia
    • geography: use Pleiades
    • prosopography: use Wikidata
    • themes: use LCSH
  • update existing data with the matched URIs
  • create an app issue for the facet functionality
  • update the input form to have a dropdown menu to select related subject terms
    • the post-processing script will add the URI
    • we will have the dropdown allow user suggestions, which we will then need to catch and output a warning that those weren't found in the URI lookup
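
The post-processing step in the last two bullets could be sketched roughly as follows, in Python. The lookup table, the function name `resolve_uri`, and the choice to fall back to the `#` placeholder are all assumptions for illustration; the two Pleiades URIs are the ones from the draft snippet earlier in this issue.

```python
import warnings

# Hypothetical lookup table keyed by (subject type, controlled term).
# Real entries would come from the reviewed controlled vocabulary.
URI_LOOKUP = {
    ("geography", "Palestine"): "https://pleiades.stoa.org/places/1001940",
    ("geography", "Cappadocia"): "https://pleiades.stoa.org/places/628949",
}


def resolve_uri(subject_type, term, lookup=URI_LOOKUP):
    """Return the URI for a controlled term, or "#" with a warning if none is found."""
    uri = lookup.get((subject_type, term.strip()))
    if uri is None:
        # A user-suggested term from the dropdown that is not in the vocabulary yet.
        warnings.warn(f"No URI found for {subject_type} term {term!r}")
        return "#"
    return uri


print(resolve_uri("geography", "Palestine"))
```

Keying on the subject type as well as the term keeps a term such as "Geography" (a theme) from colliding with an entry of the same name in another category.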

@wlpotter
Collaborator Author

FYI spreadsheet is here: https://docs.google.com/spreadsheets/d/1pdVVVt11X-lmiRAs6D6e9Yiis8ZqN11JrHvPFcO_CW8/edit?usp=sharing

The following will be needed:

  • review the 'distinct' tabs for each related subject type to catch variants of the same item (place names, especially). This will let me run a batch update that normalizes the controlled vocabulary term (as long as we don't delete the 'CurrentName' columns)
  • add the URIs to those distinct tabs
    • One trick here will be the cases in the 'related texts' that are either not in the Testimonia database or are for the work-level rather than testimonia. I think we may want to rethink our sources? Could just use the Perseus Catalog but not sure of coverage
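
A rough Python sketch of the batch normalization described above, assuming each 'distinct' tab is exported as CSV. Only the 'CurrentName' column name comes from the spreadsheet as described here; the 'NormalizedName' column, the function names, and the sample rows are made up for illustration.

```python
import csv
import io


def load_normalization_map(csv_file, current_col="CurrentName",
                           normalized_col="NormalizedName"):
    """Build a {current term: normalized term} mapping from one exported tab."""
    mapping = {}
    for row in csv.DictReader(csv_file):
        current = (row.get(current_col) or "").strip()
        normalized = (row.get(normalized_col) or "").strip()
        if current and normalized:
            mapping[current] = normalized
    return mapping


def normalize_term(term, mapping):
    """Return the controlled form of a term, or the trimmed term if it is unmapped."""
    cleaned = term.strip()
    return mapping.get(cleaned, cleaned)


# Made-up sample rows standing in for a CSV export of a 'distinct' tab.
SAMPLE_TAB = io.StringIO(
    "CurrentName,NormalizedName\n"
    "Antioch,Antioch on the Orontes\n"
    "Antioch on the Orontes,Antioch on the Orontes\n"
)
mapping = load_normalization_map(SAMPLE_TAB)
print(normalize_term("Antioch", mapping))
```

Because the mapping is keyed on 'CurrentName', deleting or editing that column would break the match against the existing data, which is why it needs to stay intact during review.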

@wlpotter
Collaborator Author

@davidamichelson do we want to bring this up again in January? The data is currently still just in //body/note/p elements and could use some normalization

@davidamichelson
Collaborator

@wlpotter Yes, let's implement this change to list/item, great suggestion.

@wlpotter
Collaborator Author

wlpotter commented Mar 8, 2023

Note to self: update this to include a desc element in each listRef that provides a human-readable name of the category for displaying it on the frontend. E.g. //listRef[@type="related-texts"] would have <desc>Related Texts</desc>
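
One way the desc elements might be added in bulk, sketched in Python. Only the "Related Texts" label comes from the note above; the other labels, the `LABELS` table, and the function name `add_descs` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

# Display labels per listRef type; all but "Related Texts" are hypothetical.
LABELS = {
    "related-texts": "Related Texts",
    "geography": "Geography",
    "prosopography": "Prosopography",
    "themes": "Themes",
}


def add_descs(root, labels=LABELS):
    """Insert a <desc> as the first child of each <listRef> that lacks one."""
    # Materialize the iterator first so the tree is not modified mid-iteration.
    for list_ref in list(root.iter(f"{{{TEI_NS}}}listRef")):
        label = labels.get(list_ref.get("type"))
        if label is None or list_ref.find(f"{{{TEI_NS}}}desc") is not None:
            continue
        desc = ET.Element(f"{{{TEI_NS}}}desc")
        desc.text = label
        list_ref.insert(0, desc)
    return root


SAMPLE = (
    f'<body xmlns="{TEI_NS}"><listRef type="related-texts">'
    '<ref target="#">John of Nikiu 66.2</ref></listRef></body>'
)
root = add_descs(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

The check for an existing desc makes the function safe to run more than once over the same record.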

wlpotter added a commit that referenced this issue Mar 8, 2023
@wlpotter
Collaborator Author

wlpotter commented Mar 8, 2023

Note: record 381 has been updated by hand as an example for display. I've retained the old encoding in a comment, so revert to it before batch updating so that the update catches any normalization of the subject terms.

@wlpotter wlpotter moved this from Todo to In Progress in Caesarea-Maritima Mar 15, 2023
@wlpotter
Collaborator Author

wlpotter commented Jun 2, 2023

@wlpotter wlpotter moved this from In Progress to Todo in Caesarea-Maritima Jun 30, 2023
@wlpotter wlpotter moved this from Todo to Backlog in Caesarea-Maritima Jul 7, 2023
@wlpotter wlpotter removed this from the 1.0 Release milestone Jul 7, 2023