The Språkbanken Text Metadata API is a RESTful web service that provides access to metadata for various resources maintained by Språkbanken Text, including corpora, lexicons, models, analyses, and utilities. The metadata is stored in YAML files in a separate metadata repository.
For more technical details please refer to the developer documentation.
Available API calls (note that the URL contains the API version, e.g. /v3 or /dev):
| Endpoint | Description |
|---|---|
| 📁 / | List all resources |
| 📁 /?resource-type=[resource-type] | List all resources of a specific type. Available types: corpus, lexicon, model, analysis, utility, collection |
| 📁 /list-ids | List all existing resource IDs |
| 🔍 /?resource=saldo | Retrieve a specific resource and its description (if available) |
| 🔍 /bibtex?resource=[resource-id] | Return BibTeX citation for the specified resource |
| 🔍 /check-id-availability?id=[resource-id] | Check if a given resource ID is available |
| 🔧 /renew-cache | Update all metadata files from git, re-process JSON, and update cache. |
| 🔧 /renew-cache?resource-paths=[resource-type]/[resource-id] | Update cache for specific resources, e.g. resource-paths=corpus/attasidor,lexicon/saldo |
| 📘 /schema | Return JSON schema for resources |
| 📘 /openapi.json | Serve API documentation as JSON |
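The endpoints in the table can be called from any HTTP client. The following is a minimal sketch using only the Python standard library. The base URL is an assumption (the local test address with no version prefix); a deployed instance will have its own host and a version segment such as /v3. The helper names `build_url` and `fetch_json` are hypothetical, and `fetch_json` assumes the endpoint responds with JSON, which the table does not state for every call.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: replace with the address (and version prefix) of your deployment.
BASE_URL = "http://127.0.0.1:8000"


def build_url(endpoint: str = "/", **params: str) -> str:
    """Build a request URL for one of the endpoints in the table."""
    # Parameter names such as resource-type contain hyphens, which are not
    # valid in Python keyword names, so underscores are converted to hyphens.
    query = urlencode(
        {key.replace("_", "-"): value for key, value in params.items()},
        safe="/,",  # keep resource paths like corpus/attasidor,lexicon/saldo readable
    )
    url = BASE_URL.rstrip("/") + "/" + endpoint.lstrip("/")
    return f"{url}?{query}" if query else url


def fetch_json(url: str, timeout: float = 10.0):
    """Fetch a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URLs; no request is sent.
    print(build_url("/", resource_type="corpus"))
    print(build_url("/bibtex", resource="saldo"))
    print(build_url("/renew-cache", resource_paths="corpus/attasidor,lexicon/saldo"))
```

With a running instance, `fetch_json(build_url("/", resource="saldo"))` would retrieve the metadata for the saldo resource.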
- Python 3.11 or newer
- Redis (used for Celery background tasks)
- Memcached (for optional caching, check caching.md for more info)
To install the dependencies, we recommend using uv.
- Install uv if you don't have it already.
- While in the metadata-api directory, run:

  uv sync --no-install-project

  This will create a virtual environment in the .venv directory and install the dependencies listed in pyproject.toml.
Alternatively, you can set up a virtual environment manually using Python's built-in venv module and install the
dependencies using pip:
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

The default configuration is specified in metadata_api/settings.py. You can override these settings using environment variables or by creating a local .env file in the project's root directory. Common configuration options include:

- LOG_LEVEL (default: INFO)
- LOG_TO_FILE (default: True): Logs always go to stdout; if True, they are also saved to logs/metadata_api_<DATE>.log.
- ROOT_PATH: The root path for the API, e.g., "/metadata-api" if served from a subpath.
- METADATA_DIR: Absolute path to the directory containing the metadata YAML files.
- CELERY_BROKER_URL: URL for the Celery broker used for background tasks.
- MEMCACHED_SERVER: Host and port of the Memcached server, or path to the socket file.
- SLACK_WEBHOOK: URL to a Slack webhook for error notifications (optional).
Example .env file:
LOG_LEVEL=DEBUG
LOG_TO_FILE=False
ROOT_PATH="/metadata-api"
METADATA_DIR="/path-to-metadata-dir"
CELERY_BROKER_URL="redis://localhost:6379/1"
MEMCACHED_SERVER="localhost:11211" # Set to None to disable caching
SLACK_WEBHOOK="https://hooks.slack.com/services/..."

For testing purposes, you can run the app using the following script (with an activated virtual environment, or by prefixing with uv run). The default settings when using run.py are:

- Host/port: 127.0.0.1:8000
- ENV=development
- LOG_LEVEL=DEBUG
- LOG_TO_FILE=False (logs to console only)
- reload=True (auto-restart on code changes)

python run.py [--host HOST] [--port PORT] [--log-level LOG_LEVEL]

If you prefer to run the app with uvicorn, you can use the following command:
uvicorn metadata_api.main:app

You also need to have a running Celery worker for background tasks. You can start a worker with:

celery -A metadata_api.tasks worker --loglevel=INFO

Please note that you need to have a running Redis server for Celery to work.
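Before starting the worker, it can be useful to confirm that something is listening at the broker address. The sketch below is not part of the project. The function names are hypothetical, it only handles redis:// URLs with a host and optional port (not socket paths), and it checks only that a TCP connection can be opened, not that the server is actually Redis.

```python
import socket
from urllib.parse import urlparse


def broker_address(broker_url: str) -> tuple[str, int]:
    """Extract host and port from a redis:// URL, defaulting to port 6379."""
    parsed = urlparse(broker_url)
    return parsed.hostname or "localhost", parsed.port or 6379


def broker_reachable(broker_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the broker address succeeds."""
    host, port = broker_address(broker_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `broker_reachable("redis://localhost:6379/1")` with the value of CELERY_BROKER_URL gives a quick yes/no answer before launching Celery.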