Deploy a complete, pre-configured AI inference stack with both a chat UI and an OpenAI-compatible API endpoint in minutes on Linode GPU instances.
Get started quickly by deploying to a clean Linode GPU instance. See the Quick Start Guide for step-by-step instructions.
- One-Click Deployment: Fully automated setup via cloud-init
- Complete AI Stack: Includes both a web-based chat interface and an OpenAI-compatible API
- Pre-Configured: NVIDIA drivers, Docker, and all dependencies pre-installed
- Fast Time-to-Value: From instance boot to working AI in under 5 minutes
- Pre-Configured Model: Defaults to Mistral 7B Instruct
- OpenAI-Compatible API: Drop-in replacement for OpenAI endpoints; just change your `BASE_URL`
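To illustrate the drop-in swap, here is a sketch of a chat-completion call against your own endpoint. The IP, port, and model name below are placeholder assumptions, not values from this project; the command is echoed rather than executed so the sketch is safe to run anywhere.

```shell
#!/usr/bin/env sh
# Placeholder values: substitute your instance's public IP, the port the
# API listens on, and the model name your deployment reports.
BASE_URL="http://203.0.113.10:8000/v1"

# A standard OpenAI-style chat completion request; only BASE_URL differs
# from a call to api.openai.com. Remove the leading 'echo' to send it.
echo curl -s "$BASE_URL/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "mistral-7b-instruct", "messages": [{"role": "user", "content": "Hello"}]}'
```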
AI Quickstart - Mistral LLM consists of two containerized services working together to provide a complete AI inference stack. See the Architecture Documentation for detailed information.
- A Linode account with GPU access enabled
- Note: GPU instance types only appear once GPU access is enabled on your account. If you don't see them, please contact Linode Support to enable GPU access.
- Local system requirements:
  - `bash` (version 4.0+)
  - `curl` (for API calls)
  - `jq` (for JSON parsing)
  - `ssh` (for instance access)
  - `netcat` (`nc`) (for connectivity checks)
- Authentication (one of the following):
  - Linode CLI configured: `pip install linode-cli && linode-cli configure`
  - OAuth authentication (handled automatically by the deployment script)
Try Mistral-7B in a chat interface without writing code or paying per-token API fees.
Get a stable, OpenAI-compatible API endpoint. Point your existing application to your own endpoint by simply changing the `BASE_URL`.
Use the chat UI to experiment with prompts, then use the same underlying API in your application for consistent results.
Deploy everything in one command: `./scripts/deploy.sh`

The script will guide you through:
- API authentication (linode-cli or OAuth)
- GPU availability (fetched dynamically from the API)
- Region selection
- GPU instance type selection
- Instance labeling
- Root password configuration
- SSH key configuration
- Deployment confirmation
- Instance creation
- Automated deployment monitoring and health checks
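The final health-check step can be approximated by hand with `nc` from the prerequisites list. A minimal sketch, assuming the API listens on port 8000 (replace `127.0.0.1` with your instance's public IP); the deploy script's own checks may differ:

```shell
#!/usr/bin/env sh
# Poll until the API port answers, up to RETRIES attempts.
HOST="127.0.0.1"   # placeholder: use your instance's public IP
PORT="8000"        # assumed API port
RETRIES=3          # kept small for illustration

reachable=no
for attempt in $(seq 1 "$RETRIES"); do
  if nc -z -w 2 "$HOST" "$PORT" 2>/dev/null; then
    reachable=yes
    break
  fi
  echo "attempt $attempt: $HOST:$PORT not reachable yet"
  sleep 1
done
echo "reachable: $reachable"
```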
See the Quick Start Guide for step-by-step instructions. The guide covers:
- Prerequisites and setup
- Deploying to a clean Linode GPU instance
- Accessing your services after deployment
- Troubleshooting common issues
See Scripts Documentation for detailed script usage and options.
See the Security Guide for detailed firewall setup instructions and security best practices.
Common maintenance tasks including updating services, changing models, viewing logs, and troubleshooting are covered in the Maintenance Guide.
- No automatic API authentication (use firewall)
- No user accounts for the UI (open by default)
- No automatic HTTPS/SSL
- Inference only (no fine-tuning support)
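Because the API and UI ship without authentication, restricting access at the firewall is the primary control. Below is a minimal sketch using `ufw`; the port numbers and client IP are assumptions, not values from this project, so adapt them to your deployment and see the Security Guide for the full setup. The rules are printed for review rather than applied:

```shell
#!/usr/bin/env sh
# Placeholder values: adapt before applying on the instance.
CLIENT_IP="198.51.100.7"   # your workstation's IP (placeholder)
UI_PORT=3000               # assumed chat UI port
API_PORT=8000              # assumed API port

# Print the rules instead of applying them; review, then run them
# as root on the instance.
cat <<EOF
ufw default deny incoming
ufw allow 22/tcp
ufw allow from $CLIENT_IP to any port $UI_PORT proto tcp
ufw allow from $CLIENT_IP to any port $API_PORT proto tcp
ufw enable
EOF
```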
- Quick Start Guide - Get started with deployment
- Scripts Documentation - Deployment script usage and options
- Architecture - System architecture and technical details
- API Usage - API reference and integration examples
- Security Guide - Security best practices and firewall setup
- Maintenance Guide - Updating services, changing models, troubleshooting
- Non-Interactive Mode - Design for CI/CD automation
For issues or feature requests, please open an issue in this repository.
This project is licensed under the MIT License. See the LICENSE file for details.
Status: Draft v1.0