A template to create and work with Masterdata definitions in the BAM Data Store.
- Click the "Use this template" button at the top right and create a new Masterdata repository.
- The next screen prompts you to choose a GitHub organization or profile for the new repository, as well as a name. We recommend starting the repository name with masterdata- so that it is easy to recognize.
- After creating your new repository, you can either:
  - Open it in a GitHub Codespace
  - Clone it locally
- Install cruft.
- Run the following command to create the structure of the new Masterdata project:
  cruft create https://github.com/BAMresearch/masterdata-generator
This will prompt you with some questions to be filled in:
[1/8] Name of the main author (Erika Mustermann):
[2/8] Email of the main author ([email protected]):
[3/8] Github organization or profile name (BAMResearch):
[4/8] The name of the folder containing the generated files <project-name> (masterdata-example):
[5/8] The name of the folder src/<module_name>, default is the <project_name> separated with underscores (masterdata_example):
[6/8] Description of the project (A short description for the masterdata project.):
[7/8] Do you want to add Python schema files? [y/n] (y):
[8/8] Do you want to add Excel schema files? [y/n] (y):
Note
Leaving a question blank selects its default value, which is shown in parentheses next to the question.
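The relation between questions 4 and 5 can be illustrated with a small sketch. The helper below is hypothetical (it is not part of cruft or of the generator) and assumes only what the prompt states: the default module name is the project name with hyphens replaced by underscores.

```python
def default_module_name(project_name: str) -> str:
    """Hypothetical helper mirroring the default described in question 5.

    Assumption: the default src/<module_name> is <project-name> with
    hyphens replaced by underscores, as the prompt text suggests.
    """
    return project_name.strip().replace("-", "_")


# The default project name from question 4 gives the default of question 5.
print(default_module_name("masterdata-example"))  # -> masterdata_example
```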
- The files generated by cruft are created under ./<project-name>. To move the whole structure one level up, run:
  python <project-name>/move_generated_files.py
- After filling in the questionnaire and running the move_generated_files.py script, you will have a new repository with:
masterdata-<name>/
├── LICENSE (MIT)
├── README.md
├── pyproject.toml
├── src/
│   └── masterdata-<name>/
│       # if Python is selected as an option
│       ├── __init__.py
│       ├── object_types.py
│       ├── vocabulary_types.py
│       # if Excel is selected as an option
│       ├── masterdata.xlsx
└── tests/
    ├── __init__.py
    └── test.py
- If you are using Python, you can install the package and its dependencies by creating a virtual environment (either with conda or venv) and running:
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install uv
uv pip install -e '.[dev]'
Note
The commands above use venv to create the virtual environment on Ubuntu. Please adapt them if you use conda or are installing on Windows or macOS.
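As a rough guide for adapting the activation step to another platform, the sketch below returns the command that activates a venv environment. The function name and its arguments are hypothetical. The paths follow the standard venv layout (bin/activate on Linux and macOS, Scripts\activate.bat on Windows cmd). conda users would use conda activate instead, which is not covered here.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath
from typing import Optional


def activation_command(env_dir: str = ".venv", system: Optional[str] = None) -> str:
    """Hypothetical helper: return the command that activates a venv environment.

    `system` uses the values returned by platform.system()
    ("Linux", "Darwin", "Windows"); it defaults to the current platform.
    """
    system = system or platform.system()
    if system == "Windows":
        # cmd.exe runs the batch script directly; PowerShell uses Activate.ps1 instead.
        return str(PureWindowsPath(env_dir) / "Scripts" / "activate.bat")
    # Linux and macOS shells source the activate script.
    return f"source {PurePosixPath(env_dir) / 'bin' / 'activate'}"


print(activation_command(".venv", "Linux"))    # -> source .venv/bin/activate
print(activation_command(".venv", "Windows"))  # -> .venv\Scripts\activate.bat
```

The remaining commands (pip install --upgrade pip, pip install uv, uv pip install -e '.[dev]') are run inside the activated environment.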
Once you have the skeleton of the Masterdata repository, you can define new classes in the corresponding file under src/<module-name>/.
For Python, these are separate modules for Object Types and Vocabularies. For Excel or RDF/XML, this is a single file named masterdata.xlsx or masterdata.owl, respectively. In all cases, a dummy entity called ExampleObjectType is created to serve as an example.
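Before editing these files, it can help to confirm that the skeleton is complete. The following is a minimal sketch, not part of the template: the function names are hypothetical, and the file list is taken from the repository layout shown earlier. The module folder name is passed in explicitly, since it depends on your answer to question 5.

```python
from pathlib import Path
from typing import List


def expected_files(module_name: str, python: bool = True, excel: bool = True) -> List[str]:
    """Hypothetical helper: list the files the layout above says should exist."""
    files = ["LICENSE", "README.md", "pyproject.toml", "tests/__init__.py", "tests/test.py"]
    src = f"src/{module_name}"
    if python:  # only generated when Python schema files were selected
        files += [f"{src}/__init__.py", f"{src}/object_types.py", f"{src}/vocabulary_types.py"]
    if excel:  # only generated when Excel schema files were selected
        files.append(f"{src}/masterdata.xlsx")
    return files


def missing_files(root: str, module_name: str, python: bool = True, excel: bool = True) -> List[str]:
    """Return the expected files that are not present under `root`."""
    base = Path(root)
    return [f for f in expected_files(module_name, python, excel) if not (base / f).is_file()]


# Example: check the current directory, assuming the default module name.
for name in missing_files(".", "masterdata_example"):
    print(f"missing: {name}")
```

An empty result means every file from the layout was found. Anything printed points to a skipped step or to an option that was not selected in the questionnaire.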
You can export from Python to the other formats using the bam-masterdata CLI. You can also generate Python code from Excel or RDF/XML with the same CLI. Read more in the bam-masterdata documentation.
A set of GitHub Actions is defined in .github/workflows/. These will:
- Ensure typing and formatting consistency in Python code following PEP-8 standards, using the mypy and ruff rules defined in pyproject.toml.
- Check the consistency of the Masterdata definitions using the bam-masterdata checker CLI functionalities.
If the pipeline passes, it means your Masterdata definitions are formatted properly and do not clash with the current bam-masterdata definitions.
Important
In bam-masterdata, there is a set of Masterdata entities already defined. These correspond to the general definitions (to be used across MSE sub-communities). If you want to know how to define new Masterdata entities using inheritance from the general ones, read the corresponding documentation.
The action .github/workflows/publish.yml already implements the publishing of your repository to PyPI. When you are ready, you need to:
- Define a repository secret named PYPI_API_TOKEN containing an API token from pypi.org.
- Create a Release from the GitHub page of your repository.
Read more here: GitHub PyPI documentation.
