MatrixTDBProcedures
MatrixTDB is the regression test facility for the Grammar Matrix and the Matrix customization system. It allows us to create gold standard tsdb++ profiles on demand for language types defined in choices files.
There are three main things you might want to do with MatrixTDB, all of which amount to putting data in or getting data out: add new strings, add a language type, or extract a profile for a language type. Each of these high-level tasks breaks down into sub-tasks, listed here; the Detailed Processes section of this page breaks each sub-task down into individual steps.
- Add new strings
  - Create a source profile
  - Import the source profile
  - Add permutes
  - Run specific filters
- Add a language type (this breaks down to just one sub-task)
  - Import the language type
- Extract a profile for a language type
  - Generate a profile
This section gives step-by-step instructions for performing various tasks and sub-tasks with MatrixTDB.
Matrix developers should update MatrixTDB as follows:
- Determine the harvester strings you'll need to illustrate your library.
- Determine the semantically neutral variants your library allows for each harvester string, at the level of bags of words. For example, the basic lexical library allows for case-marking adpositions, so p-nom can be added to any string with an overt subject to get a semantically equivalent string, provided p-nom is in the right place.
- Update customize/sql_profiles/stringmods.py to reflect the modifications.
- Create a harvester grammar to process your harvester strings with. Save the choices file from that grammar.
- Create a file listing the harvester strings and their mrs_tags (see customize/sql_profiles/harv_str/harv_mrs_1 for an example).
- Create a file with just the harvester strings.
- Start the LKB and tsdb++.
- Load the harvester grammar into the LKB.
- In tsdb++, go to File > Import > Test items to import the harvester strings.
- Make sure tsdb++ is set to write the mrs field.
- Process the items you imported with your grammar.
- The resulting profile will be your source_profile.
- Next, run customize/sql_profiles/import_from_itsdb.py with your source_profile, choices file, and harv_mrs file as arguments.
- Get the resulting osp_id.
- Then run customize/sql_profiles/add_permutes.py and give it the osp_id.
- Update the universal and specific filters in u_filters.py and s_filters.py.
- Run run_u_filters.py.
- Run the SQL query that separates the universally ungrammatical from universally grammatical results.
- Run run_specific_filters.py.
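To make the p-nom example from the second step concrete, here is a minimal Python sketch of a semantically neutral variant at the bag-of-words level. It is an illustration only and does not reflect the actual contents of stringmods.py. The function names, the sample harvester string "n1 iv", and the rule that the marker is optional whenever there is an overt subject are all assumptions made for the example. Because a bag of words ignores order, the condition that p-nom be "in the right place" is not modeled here.

```python
from collections import Counter


def to_bag(string):
    """Turn a whitespace-separated string into a bag (multiset) of words."""
    return Counter(string.split())


def add_marker(bag, marker):
    """Return a copy of the bag with one extra occurrence of `marker`."""
    variant = Counter(bag)
    variant[marker] += 1
    return variant


def neutral_variants(string, has_overt_subject):
    """List the bags treated as semantically equivalent to `string`.

    Hypothetical rule mirroring the p-nom example: the case-marking
    adposition may be added whenever the string has an overt subject.
    """
    base = to_bag(string)
    variants = [base]
    if has_overt_subject:
        variants.append(add_marker(base, "p-nom"))
    return variants


# "n1 iv" is a made-up harvester string: a subject noun and an intransitive verb.
variants = neutral_variants("n1 iv", has_overt_subject=True)
for bag in variants:
    print(sorted(bag.elements()))
```

Using a multiset rather than a set keeps repeated words distinct, which matters once a string contains the same word more than once.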
At this point, MatrixTDB is up to date. We can also use import_from_tsdb.py to update the MRSs that should correspond to particular mrs_tags.
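The command-line part of the update procedure can be summarized in a short Python sketch that only builds and prints the commands, without running anything. Several details are assumptions rather than documented behavior: that the interpreter is invoked as `python`, that the arguments are positional and in the order the steps mention them, that the filter scripts live in customize/sql_profiles alongside the other scripts, and that they take no arguments. The profile paths and the osp_id value are placeholders. Check each script before relying on this.

```python
import shlex

# Directory named in the steps for the import and permute scripts.
SCRIPT_DIR = "customize/sql_profiles"


def import_command(source_profile, choices_file, harv_mrs_file):
    """Command for importing a source profile.

    Assumption: the three arguments are positional and in this order.
    """
    script = f"{SCRIPT_DIR}/import_from_itsdb.py"
    return ["python", script, source_profile, choices_file, harv_mrs_file]


def followup_commands(osp_id):
    """Commands to run once the import has reported an osp_id.

    Assumption: the filter scripts sit in SCRIPT_DIR and need no arguments.
    The SQL query separating universally ungrammatical from universally
    grammatical results is run by hand between the last two commands.
    """
    return [
        ["python", f"{SCRIPT_DIR}/add_permutes.py", str(osp_id)],
        ["python", f"{SCRIPT_DIR}/run_u_filters.py"],
        ["python", f"{SCRIPT_DIR}/run_specific_filters.py"],
    ]


# Placeholder paths and a placeholder osp_id, for illustration only.
commands = [import_command("my_source_profile", "my_choices", "my_harv_mrs")]
commands += followup_commands(osp_id=1)
for command in commands:
    print(shlex.join(command))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; the remaining steps (editing the filters, running the SQL query) stay manual.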
To export a profile corresponding to a given choices file:
- _FIX_ME_ instructions here.
TODO:
- Work out how to run filters recursively for coordination et al
- Update filters for coordination
- Update filters for inflection version of question particles