| 1 | +# LLM/AI Model Schema Artifact Types |
| 2 | + |
| 3 | +This example demonstrates how to extend Apicurio Registry with support for AI/ML-related artifact types |
| 4 | +using the JavaScript/TypeScript custom artifact type system. The example implements two artifact types |
| 5 | +for managing LLM schemas and prompt templates. |
| 6 | + |
| 7 | +## Overview |
| 8 | + |
| 9 | +Apicurio Registry can be extended to support custom artifact types by: |
| 10 | +1. Implementing artifact type functions in JavaScript/TypeScript |
| 11 | +2. Configuring the registry to load your custom artifact types |
| 12 | +3. Deploying the registry with the custom configuration |
| 13 | + |
| 14 | +This example includes: |
| 15 | +- **MODEL_SCHEMA**: AI/ML model input/output schema definitions and metadata |
| 16 | +- **PROMPT_TEMPLATE**: Version-controlled prompt templates with variable schemas |
| 17 | + |
| 18 | +## Artifact Types |
| 19 | + |
| 20 | +### MODEL_SCHEMA |
| 21 | + |
| 22 | +Defines and validates AI/ML model input/output schemas and metadata. |
| 23 | + |
| 24 | +**Content Types**: `application/json`, `application/x-yaml` |
| 25 | + |
| 26 | +**Key Features**: |
| 27 | +- Auto-detection from content structure (`modelId` + `input`/`output` fields) |
| 28 | +- JSON Schema validation for input/output definitions |
| 29 | +- Backward compatibility checking (required fields cannot be removed and field types cannot be changed) |
| 30 | +- Canonicalization for consistent comparisons |
| 31 | +- Reference resolution for `$ref` schemas |
| 32 | + |
| 33 | +**Example**: |
| 34 | +```json |
| 35 | +{ |
| 36 | + "$schema": "https://apicur.io/schemas/model-schema/v1", |
| 37 | + "modelId": "gpt-4-turbo", |
| 38 | + "provider": "openai", |
| 39 | + "version": "2024-01", |
| 40 | + "input": { |
| 41 | + "type": "object", |
| 42 | + "properties": { |
| 43 | + "messages": { "type": "array" }, |
| 44 | + "temperature": { "type": "number", "minimum": 0, "maximum": 2 } |
| 45 | + }, |
| 46 | + "required": ["messages"] |
| 47 | + }, |
| 48 | + "output": { |
| 49 | + "type": "object", |
| 50 | + "properties": { |
| 51 | + "choices": { "type": "array" }, |
| 52 | + "usage": { "type": "object" } |
| 53 | + } |
| 54 | + }, |
| 55 | + "metadata": { |
| 56 | + "contextWindow": 128000, |
| 57 | + "capabilities": ["chat", "function_calling", "vision"] |
| 58 | + } |
| 59 | +} |
| 60 | +``` |
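The auto-detection feature listed above could be approximated as follows. This is a minimal sketch, not the example's actual content accepter: the function name `looksLikeModelSchema` is hypothetical, it handles JSON only, and it assumes that a `modelId` string plus at least one of `input` or `output` is enough to claim the content.

```typescript
// Hypothetical helper for illustration; not the example's actual content accepter.
function looksLikeModelSchema(content: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch (err) {
    // Not JSON. YAML content would need a YAML parser, which this sketch omits.
    return false;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return false;
  }
  const doc = parsed as { [key: string]: unknown };
  const hasModelId = typeof doc["modelId"] === "string";
  // Assumption: either an input or an output definition is sufficient.
  const hasInputOrOutput = "input" in doc || "output" in doc;
  return hasModelId && hasInputOrOutput;
}
```

Requiring both `input` and `output` would be a stricter variant; which one the example uses should be checked against `src/ModelSchemaArtifactType.ts`.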
| 61 | + |
| 62 | +### PROMPT_TEMPLATE |
| 63 | + |
| 64 | +Version-controlled prompt templates with variable schemas for LLMOps. |
| 65 | + |
| 66 | +**Content Types**: `application/x-yaml`, `application/json`, `text/x-prompt-template` |
| 67 | + |
| 68 | +**Key Features**: |
| 69 | +- Template variable extraction and validation (`{{variable}}` syntax) |
| 70 | +- Variable schema definitions with types, constraints, and defaults |
| 71 | +- Backward compatibility checking (variables used in the template cannot be removed and variable types cannot be changed) |
| 72 | +- Output schema specification |
| 73 | +- Metadata for model recommendations and token estimation |
| 74 | + |
| 75 | +**Example**: |
| 76 | +```yaml |
| 77 | +$schema: https://apicur.io/schemas/prompt-template/v1 |
| 78 | +templateId: summarization-v1 |
| 79 | +name: Document Summarization |
| 80 | +version: "1.0" |
| 81 | + |
| 82 | +template: | |
| 83 | + Style: {{style}} |
| 84 | + Maximum length: {{max_words}} words |
| 85 | + |
| 86 | + Document: {{document}} |
| 87 | + |
| 88 | + Please provide a {{style}} summary. |
| 89 | + |
| 90 | +variables: |
| 91 | + style: |
| 92 | + type: string |
| 93 | + enum: [concise, detailed, bullet-points] |
| 94 | + default: concise |
| 95 | + max_words: |
| 96 | + type: integer |
| 97 | + minimum: 50 |
| 98 | + maximum: 1000 |
| 99 | + default: 200 |
| 100 | + document: |
| 101 | + type: string |
| 102 | + required: true |
| 103 | + |
| 104 | +metadata: |
| 105 | + recommendedModels: [gpt-4-turbo, claude-3-opus] |
| 106 | +``` |
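Template variable extraction for the `{{variable}}` syntax could look roughly like the sketch below. The helper names `extractTemplateVariables` and `findUndeclaredVariables` are hypothetical, and the regular expression assumes variable names made of letters, digits, and underscores; the example's real implementation may accept a different syntax.

```typescript
// Hypothetical helpers for illustration; names and accepted syntax are assumptions.

// Returns the distinct variable names referenced in the template, in first-use order.
function extractTemplateVariables(template: string): string[] {
  const pattern = /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
  const names: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(template);
  while (match !== null) {
    const name = match[1];
    if (name !== undefined && names.indexOf(name) === -1) {
      names.push(name);
    }
    match = pattern.exec(template);
  }
  return names;
}

// Returns variables that the template uses but the `variables` section does not declare.
function findUndeclaredVariables(template: string, declared: string[]): string[] {
  return extractTemplateVariables(template).filter(
    (name) => declared.indexOf(name) === -1
  );
}
```

For the summarization template above, extraction would yield `style`, `max_words`, and `document`, all of which are declared under `variables`.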
| 107 | + |
| 108 | +## Prerequisites |
| 109 | + |
| 110 | +- Node.js 18+ and npm (for building the TypeScript implementation) |
| 111 | +- Docker and Docker Compose |
| 112 | +- curl and jq (for running the demo script) |
| 113 | + |
| 114 | +## Quick Start |
| 115 | + |
| 116 | +### 1. Build the Custom Artifact Types |
| 117 | + |
| 118 | +First, build the JavaScript libraries from the TypeScript source: |
| 119 | + |
| 120 | +```bash |
| 121 | +npm install |
| 122 | +npm run build |
| 123 | +``` |
| 124 | + |
| 125 | +This creates: |
| 126 | +- `dist/model-schema-artifact-type.js` |
| 127 | +- `dist/prompt-template-artifact-type.js` |
| 128 | + |
| 129 | +### 2. Start the Registry |
| 130 | + |
| 131 | +Start Apicurio Registry with the custom artifact types: |
| 132 | + |
| 133 | +```bash |
| 134 | +docker compose up |
| 135 | +``` |
| 136 | + |
| 137 | +Wait for the services to be ready (check with `docker compose logs -f`). |
| 138 | + |
| 139 | +### 3. Run the Demo |
| 140 | + |
| 141 | +Execute the demo script to see the custom artifact types in action: |
| 142 | + |
| 143 | +```bash |
| 144 | +./demo.sh |
| 145 | +``` |
| 146 | + |
| 147 | +The demo walks through: |
| 148 | +- Listing available artifact types (MODEL_SCHEMA, PROMPT_TEMPLATE) |
| 149 | +- Creating artifacts with explicit types |
| 150 | +- Content auto-detection |
| 151 | +- Backward compatibility checking |
| 152 | +- Content validation |
| 153 | + |
| 154 | +### 4. Explore the Registry |
| 155 | + |
| 156 | +- **Web UI**: http://localhost:8888 |
| 157 | +- **REST API**: http://localhost:8080/apis/registry/v3 |
| 158 | + |
| 159 | +## Configuration |
| 160 | + |
| 161 | +### artifact-types-config.json |
| 162 | + |
| 163 | +Configures the custom artifact types: |
| 164 | + |
| 165 | +```json |
| 166 | +{ |
| 167 | + "includeStandardArtifactTypes": true, |
| 168 | + "artifactTypes": [ |
| 169 | + { |
| 170 | + "artifactType": "MODEL_SCHEMA", |
| 171 | + "name": "Model Schema", |
| 172 | + "description": "AI/ML model input/output schema definitions", |
| 173 | + "contentTypes": ["application/json", "application/x-yaml"], |
| 174 | + "scriptLocation": "/custom-artifact-types/dist/model-schema-artifact-type.js", |
| 175 | + "contentAccepter": { "type": "script" }, |
| 176 | + "contentValidator": { "type": "script" }, |
| 177 | + "compatibilityChecker": { "type": "script" }, |
| 178 | + "contentCanonicalizer": { "type": "script" }, |
| 179 | + "contentDereferencer": { "type": "script" }, |
| 180 | + "referenceFinder": { "type": "script" } |
| 181 | + } |
| 182 | + ] |
| 183 | +} |
| 184 | +``` |
| 185 | + |
| 186 | +### Docker Compose |
| 187 | + |
| 188 | +The Docker Compose file mounts the configuration and JavaScript libraries: |
| 189 | + |
| 190 | +```yaml |
| 191 | +volumes: |
| 192 | + - ./artifact-types-config.json:/custom-artifact-types/artifact-types-config.json:ro |
| 193 | + - ./dist:/custom-artifact-types/dist:ro |
| 194 | +environment: |
| 195 | + APICURIO_ARTIFACT_TYPES_CONFIG_FILE: "/custom-artifact-types/artifact-types-config.json" |
| 196 | +``` |
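Each `scriptLocation` in the configuration is a path inside the container, so it has to fall under one of the mounted volume targets. A quick sanity check along those lines might look like this sketch; the function `findUnmountedScripts` and the trimmed-down config types are hypothetical and cover only the fields needed here.

```typescript
// Hypothetical check; the types mirror only the config fields used below.
interface ArtifactTypeEntry {
  artifactType: string;
  scriptLocation?: string;
}

interface ArtifactTypesConfig {
  artifactTypes: ArtifactTypeEntry[];
}

// Returns the artifact types whose script path is not under any mount target.
function findUnmountedScripts(
  config: ArtifactTypesConfig,
  mountTargets: string[]
): string[] {
  const unmounted: string[] = [];
  for (const entry of config.artifactTypes) {
    const scriptPath = entry.scriptLocation;
    if (scriptPath === undefined) {
      continue; // nothing to load for this entry
    }
    const covered = mountTargets.some(
      (target) => scriptPath === target || scriptPath.indexOf(target + "/") === 0
    );
    if (!covered) {
      unmounted.push(entry.artifactType);
    }
  }
  return unmounted;
}
```

With the mount target `/custom-artifact-types/dist` from the Compose snippet, the `MODEL_SCHEMA` entry shown earlier would pass this check.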
| 197 | + |
| 198 | +## Compatibility Rules |
| 199 | + |
| 200 | +### MODEL_SCHEMA Compatibility |
| 201 | + |
| 202 | +When backward compatibility is enabled: |
| 203 | + |
| 204 | +| Change | Allowed | |
| 205 | +|--------|---------| |
| 206 | +| Add optional input property | Yes | |
| 207 | +| Add required input property | No | |
| 208 | +| Remove input property | No | |
| 209 | +| Change input property type | No | |
| 210 | +| Add output property | Yes | |
| 211 | +| Remove output property | No | |
| 212 | +| Change output property type | No | |
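The input rules in the table above can be expressed as a small comparison between two schema versions. The following is a sketch under simplifying assumptions: it looks only at top-level `properties`, `type`, and `required`, and the names `ObjectSchema` and `checkInputCompatibility` are hypothetical rather than taken from the example's source.

```typescript
// Hypothetical sketch of the input rules; only top-level properties are compared.
interface ObjectSchema {
  properties?: { [name: string]: { type?: string } };
  required?: string[];
}

// Returns one message per backward-incompatible change; an empty array means compatible.
function checkInputCompatibility(
  existing: ObjectSchema,
  proposed: ObjectSchema
): string[] {
  const problems: string[] = [];
  const oldProps = existing.properties || {};
  const newProps = proposed.properties || {};
  const newRequired = proposed.required || [];

  for (const name of Object.keys(oldProps)) {
    const prev = oldProps[name];
    const next = newProps[name];
    if (!next) {
      problems.push("input property '" + name + "' was removed");
    } else if (prev && prev.type !== next.type) {
      problems.push("input property '" + name + "' changed type");
    }
  }

  for (const name of Object.keys(newProps)) {
    const isNew = !(name in oldProps);
    if (isNew && newRequired.indexOf(name) !== -1) {
      problems.push("new input property '" + name + "' is required");
    }
  }
  return problems;
}
```

Adding an optional property produces no message, which matches the first row of the table. The output rules could be checked the same way, minus the `required` test.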
| 213 | + |
| 214 | +### PROMPT_TEMPLATE Compatibility |
| 215 | + |
| 216 | +When backward compatibility is enabled: |
| 217 | + |
| 218 | +| Change | Allowed | |
| 219 | +|--------|---------| |
| 220 | +| Add optional variable | Yes | |
| 221 | +| Remove unused variable | Yes | |
| 222 | +| Remove used variable | No | |
| 223 | +| Change variable type | No | |
| 224 | +| Make optional variable required | No | |
| 225 | +| Narrow enum values | No | |
| 226 | +| Change template text (same variables) | Yes | |
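Three of the rows above (type changes, optional-to-required changes, and narrowed enums) can be checked by comparing the `variables` sections of two versions. This is a minimal sketch with hypothetical names (`VariableDef`, `checkVariableChanges`); it deliberately skips variable removal, because deciding whether a variable counts as "used" depends on how the real checker reads the template text.

```typescript
// Hypothetical sketch; covers type, required, and enum changes only.
interface VariableDef {
  type?: string;
  required?: boolean;
  enum?: string[];
}

type VariableMap = { [name: string]: VariableDef };

function checkVariableChanges(
  existing: VariableMap,
  proposed: VariableMap
): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(proposed)) {
    const prev = existing[name];
    const next = proposed[name];
    if (!prev || !next) {
      continue; // newly added variables are outside this sketch
    }
    if (prev.type !== next.type) {
      problems.push("variable '" + name + "' changed type");
    }
    if (!prev.required && next.required) {
      problems.push("variable '" + name + "' became required");
    }
    const prevEnum = prev.enum;
    const nextEnum = next.enum;
    if (prevEnum && nextEnum) {
      // Values accepted before but rejected now mean the enum was narrowed.
      const dropped = prevEnum.filter((value) => nextEnum.indexOf(value) === -1);
      if (dropped.length > 0) {
        problems.push("variable '" + name + "' dropped enum values: " + dropped.join(", "));
      }
    }
  }
  return problems;
}
```

Widening an enum, for example adding a fourth `style` value, yields no message in this sketch.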
| 227 | + |
| 228 | +## Use Cases |
| 229 | + |
| 230 | +### LLMOps / Model Governance |
| 231 | + |
| 232 | +- Track model input/output schemas across versions |
| 233 | +- Ensure backward compatibility when updating models |
| 234 | +- Document model capabilities and limitations |
| 235 | +- Manage pricing and metadata |
| 236 | + |
| 237 | +### Prompt Engineering |
| 238 | + |
| 239 | +- Version-controlled prompt templates |
| 240 | +- Variable validation ensures consistent usage |
| 241 | +- Track prompt evolution over time |
| 242 | +- Team collaboration on prompts |
| 243 | + |
| 244 | +### RAG Pipelines |
| 245 | + |
| 246 | +- Store embedding model configurations |
| 247 | +- Manage vector store settings |
| 248 | +- Link prompts to recommended models |
| 249 | + |
| 250 | +## Sample Schemas |
| 251 | + |
| 252 | +The `sample-schemas/` directory contains example artifacts: |
| 253 | + |
| 254 | +- `gpt4-model-schema.json` - OpenAI GPT-4 Turbo model schema |
| 255 | +- `claude-model-schema.json` - Anthropic Claude 3 Opus model schema |
| 256 | +- `summarization-prompt.yaml` - Document summarization prompt template |
| 257 | +- `qa-prompt.yaml` - Question & Answer RAG prompt template |
| 258 | + |
| 259 | +## Cleanup |
| 260 | + |
| 261 | +To stop and remove the containers: |
| 262 | + |
| 263 | +```bash |
| 264 | +docker compose down |
| 265 | +``` |
| 266 | + |
| 267 | +## Development |
| 268 | + |
| 269 | +### Project Structure |
| 270 | + |
| 271 | +``` |
| 272 | +llm-artifact-types/ |
| 273 | +├── package.json |
| 274 | +├── tsconfig.json |
| 275 | +├── tsconfig-build.json |
| 276 | +├── src/ |
| 277 | +│ ├── ModelSchemaArtifactType.ts |
| 278 | +│ └── PromptTemplateArtifactType.ts |
| 279 | +├── dist/ # Built JavaScript files |
| 280 | +├── artifact-types-config.json |
| 281 | +├── docker-compose.yml |
| 282 | +├── demo.sh |
| 283 | +├── README.md |
| 284 | +└── sample-schemas/ |
| 285 | + ├── gpt4-model-schema.json |
| 286 | + ├── claude-model-schema.json |
| 287 | + ├── summarization-prompt.yaml |
| 288 | + └── qa-prompt.yaml |
| 289 | +``` |
| 290 | + |
| 291 | +### Building |
| 292 | + |
| 293 | +```bash |
| 294 | +npm run clean # Remove dist directory |
| 295 | +npm run build # Compile TypeScript and bundle |
| 296 | +``` |
| 297 | + |
| 298 | +### TypeScript Types |
| 299 | + |
| 300 | +The implementation uses `@apicurio/artifact-type-builtins` for type definitions: |
| 301 | + |
| 302 | +```typescript |
| 303 | +import type { |
| 304 | + ContentAccepterRequest, |
| 305 | + ContentValidatorRequest, |
| 306 | + ContentValidatorResponse, |
| 307 | + CompatibilityCheckerRequest, |
| 308 | + CompatibilityCheckerResponse |
| 309 | +} from '@apicurio/artifact-type-builtins'; |
| 310 | +``` |
| 311 | + |
| 312 | +## References |
| 313 | + |
| 314 | +- [Apicurio Registry Documentation](https://www.apicur.io/registry/) |
| 315 | +- [Custom Artifact Types Blog Post](https://www.apicur.io/blog/2025/10/27/custom-artifact-types) |
| 316 | +- [TOML Example Implementation](../custom-artifact-type/) |
| 317 | +- [Model Cards for Model Reporting](https://arxiv.org/abs/1810.03993) |
| 318 | +- [Prompt Engineering Best Practices](https://platform.openai.com/docs/guides/prompt-engineering) |
| 319 | + |
| 320 | +## Future Enhancements |
| 321 | + |
| 322 | +- **RAG_CONFIG**: Configuration for RAG pipelines (embedding models, vector stores) |
| 323 | +- **Cross-references**: Link PROMPT_TEMPLATE to MODEL_SCHEMA |
| 324 | +- **UI Enhancements**: Custom icons and views for AI artifacts |
| 325 | +- **SDK Support**: Java SDK helpers for working with AI schemas |