---
page_type: sample
languages:
products:
urlFragment: enterprise-azureai
name: Azure OpenAI Service as a central capability with Azure API Management
description: Unleash the power of Azure OpenAI in your company in a secure and manageable way with Azure API Management and Azure Developer CLI
---

Unleash the power of Azure OpenAI in your company in a secure & manageable way with Azure API Management and Azure Developer CLI (`azd`).
This repository provides guidance and tools for organizations looking to implement Azure OpenAI in a production environment with an emphasis on cost control, secure access, and usage monitoring. The aim is to enable organizations to effectively manage expenses while ensuring that the consuming application or team is accountable for the costs incurred.
Note

This repository uses an AI Proxy to load-balance and log the traffic between Azure API Management and Azure OpenAI Service. In May 2024 Microsoft announced new features in Azure API Management policies for integrating with Azure OpenAI Service that overlap with the AI Proxy. We recommend using the new Azure API Management policy features for new deployments, but if you need to implement customizations or additional features in the proxy, the AI Proxy is still very relevant. For `azd` implementation guidance on the new Azure API Management policy features, see here.
- Infrastructure-as-code: Bicep templates for provisioning and deploying the resources.
- CI/CD pipeline: GitHub Actions and Azure DevOps Pipelines for continuous deployment of the resources to Azure.
- Secure Access Management: Best practices and configurations for managing secure access to Azure OpenAI Services.
- Usage Monitoring & Cost Control: Solutions for tracking the usage of Azure OpenAI Services to facilitate accurate cost allocation and team charge-back.
- Load balancing: Utilize and load-balance the capacity of Azure OpenAI across regions or provisioned throughput (PTU)
- Streaming requests: Support for streaming requests to Azure OpenAI, with all features (e.g. additional logging and charge-back) still available
- End-to-end sample: Including Sample ChatApp, Azure Dashboards, content filters and policies
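To give a rough feel for the load-balancing idea behind the proxy, here is a minimal sketch in Python. It is not the actual .NET proxy implementation, and the backend URLs are hypothetical; it only illustrates rotating across Azure OpenAI endpoints and skipping ones that are throttled.

```python
from itertools import cycle

# Hypothetical backends: two Azure OpenAI deployments in different regions.
BACKENDS = [
    "https://aoai-eastus.openai.azure.com",
    "https://aoai-westeurope.openai.azure.com",
]

class RoundRobinBalancer:
    """Rotate across backends, skipping any marked as throttled (HTTP 429)."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._cycle = cycle(self._backends)
        self._unavailable = set()

    def next_backend(self):
        # Walk the rotation until we find a backend that isn't throttled.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._unavailable:
                return candidate
        raise RuntimeError("all backends are throttled (HTTP 429)")

    def mark_throttled(self, backend):
        self._unavailable.add(backend)
```

In the real setup, a backend would be marked throttled when Azure OpenAI returns a 429, and re-enabled after the `Retry-After` interval; a PTU deployment would typically be preferred before pay-as-you-go regions.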
Read more: Architecture in detail
- Infrastructure-as-code (IaC) Bicep files under the `infra` folder that demonstrate how to provision resources and set up resource tagging for azd.
- A dev container configuration file under the `.devcontainer` directory that installs infrastructure tooling by default. This can be readily used to create cloud-hosted developer environments such as GitHub Codespaces or a local environment via a VS Code Dev Container.
- Continuous deployment workflows for CI providers such as GitHub Actions under the `.github` directory, and Azure Pipelines under the `.azdo` directory, that work for most use cases.
- The .NET 8.0 chargeback proxy application under the `src` folder.
- The Node.js Sample ChatApp application under the `src` folder.
- Azure Developer CLI
- Azure CLI
- .NET 8.0 SDK
- Docker Desktop
- Node.js v18.17 or higher
- jq required on Mac and Linux
```shell
azd init -t Azure/enterprise-azureai
```
If you already cloned this repository to your local machine, or run from a Dev Container or GitHub Codespaces, you can run the following command from the root folder.

```shell
azd init
```

It will prompt you to provide a name that will later be used in the names of the deployed resources. If you're not logged in to Azure, it will also prompt you to log in first.

```shell
azd auth login
```
This repository uses environment variables to configure the deployment, which can be used to enable optional features. You can set these variables with the `azd env set` command. Learn more about all optional features here.

```shell
azd env set USE_REDIS_CACHE_APIM '<true-or-false>'
azd env set SECONDARY_OPENAI_LOCATION '<your-secondary-openai-location>'
```
In the azd template, we automatically set an environment variable for your current IP address. During deployment, this allows traffic from your local machine to the Azure Container Registry for deploying the containerized application.
Note

To determine your IPv4 address, the service icanhazip.com is used. To control the IPv4 address directly (without the service), edit the MY_IP_ADDRESS field in the `.azure/<name>/.env` file. This file is created after `azd init`. Without a properly configured IP address, `azd up` will fail.
```shell
azd up
```

It will prompt you to log in, pick a subscription, and provide a location (like "eastus"). We've added an extra conditional parameter to deploy the Sample ChatApp for demo purposes.
Read more: Sample ChatApp
Then it will provision the resources in your account and deploy the latest code.
Note
Because Azure OpenAI isn't available in all regions, you might get an error when you deploy the resources. You can find more information about the availability of Azure OpenAI here.
For more details on the deployed services, see additional details below.
Note
Sometimes the DNS zones for the private endpoints aren't created correctly or in time. If you get an error when you deploy the resources, you can try deploying them again.
You can enable Azure Redis Cache to improve the performance of Azure API Management. To enable this feature, set the `USE_REDIS_CACHE_APIM` environment variable to `true`.

```shell
azd env set USE_REDIS_CACHE_APIM 'true'
```
Note
Deployment of Azure Redis Cache can take up to 30 minutes.
You can enable a secondary Azure OpenAI location to improve the availability of Azure OpenAI. To enable this feature, set the `SECONDARY_OPENAI_LOCATION` environment variable to the location of your choice.

```shell
azd env set SECONDARY_OPENAI_LOCATION '<your-secondary-openai-location>'
```
This project includes a GitHub workflow and an Azure DevOps pipeline for deploying the resources to Azure on every push to main. The workflow requires several Azure-related authentication secrets to be stored as GitHub Actions secrets. To set that up, run:

```shell
azd pipeline config
```
You can configure `azd` to provision and deploy resources to your deployment environments using standard commands such as `azd up` or `azd provision`. When `platform.type` is set to `devcenter`, all `azd` remote environment state and provisioning uses dev center components. `azd` uses one of the infrastructure templates defined in your dev center catalog for resource provisioning. In this configuration, the `infra` folder in your local templates isn't used.

```shell
azd config set platform.type devcenter
```
The Sample ChatApp is a simple Node.js application that uses the API Management endpoints exposing the Azure OpenAI Service, so you can test the deployment and see how the Azure OpenAI Service works. In the ChatApp you can configure which API Management subscription you want to use and which deployment model, creating an end-to-end experience.
The deployed resources include a Log Analytics workspace with an Application Insights based dashboard to measure metrics like server response time and failed requests. We also included some custom visuals in the dashboard to visualize the token usage per consumer of the Azure OpenAI Service.
To open that dashboard, run this command once you've deployed:

```shell
azd monitor --overview
```
To clean up all the resources you've created and purge the soft deletes, simply run:

```shell
azd down --purge --force
```

The resource group and all the resources will be deleted, and you'll be prompted whether you want the soft deletes to be purged.
A tests.http file with relevant tests you can perform is included, to check whether your deployment is successful. You need the two subscription keys for Marketing and Finance, created in API Management, in order to test the API. You can find more information about how to create subscription keys here.
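Calls like the ones in tests.http can also be scripted. The sketch below only assembles the request pieces; the hostname, deployment name, and API version are placeholders you'd replace with values from your own deployment.

```python
import json

def build_chat_request(apim_host, deployment, subscription_key, messages,
                       api_version="2024-02-01"):
    """Assemble the URL, headers, and body for a chat completion call routed
    through API Management. All concrete names here are placeholders."""
    url = (f"https://{apim_host}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    headers = {
        # API Management subscription key (e.g. the Marketing or Finance key)
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"messages": messages})
    return url, headers, body
```

You could pass the result to any HTTP client; the `Ocp-Apim-Subscription-Key` header is how API Management attributes each call to a consumer such as Marketing or Finance.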
After forking this repo, you can use this GitHub Action to enable CI/CD for your fork. Just adjust the README in your fork to point to your own GitHub repo.
| GitHub Action | Status |
|---|---|
| azd Deploy | |
The following section examines different concepts that help tie in application and infrastructure.
This repository illustrates how to integrate Azure OpenAI as a central capability within an organization using Azure API Management and Azure Container Apps. Azure OpenAI offers AI models for generating text, images, etc., trained on extensive data. Azure API Management facilitates secure and managed exposure of APIs to the external environment. Azure Container Apps allows running containerized applications in Azure without infrastructure management. The repository includes a .NET 8.0 proxy application to allocate Azure OpenAI Service costs to the consuming application, aiding in cost control. The proxy supports load balancing and horizontal scaling of Azure OpenAI instances. A chargeback report in the Azure Dashboard visualizes Azure OpenAI Service costs, making it a centralized capability within the organization.
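To make the chargeback idea concrete, here is a minimal sketch (in Python, not the actual .NET proxy) of turning per-request token counts into per-consumer costs. The prices and record shape are invented for illustration; real prices depend on the model and region.

```python
from collections import defaultdict

# Illustrative per-1,000-token prices; not actual Azure OpenAI pricing.
PRICE_PER_1K = {"prompt": 0.001, "completion": 0.002}

def chargeback(records):
    """Aggregate token usage per consumer (e.g. an APIM subscription such as
    Marketing or Finance) into a cost, the way a chargeback report would."""
    totals = defaultdict(float)
    for r in records:
        totals[r["consumer"]] += (
            r["prompt_tokens"] / 1000 * PRICE_PER_1K["prompt"]
            + r["completion_tokens"] / 1000 * PRICE_PER_1K["completion"]
        )
    return dict(totals)
```

In the deployed solution the equivalent aggregation happens over the proxy's logs in Log Analytics, and the result is visualized in the Azure Dashboard's chargeback report.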
We've used the Azure Developer CLI Bicep Starter template to create this repository. With `azd` you can create a new repository with a fully functional CI/CD pipeline in minutes. You can find more information about `azd` here.
One of the key benefits of `azd` templates is that we can implement best practices together with our solution when it comes to security, network isolation, monitoring, etc. Users are free to define their own best practices for their dev teams and organization, so all deployments follow the same standards.
The best practices we've followed for this architecture come from the Azure Integration Services Landing Zone Accelerator, and for Azure OpenAI we've used the blog post Azure OpenAI Landing Zone reference architecture. For the chargeback proxy we've used the setup from the Azure Container Apps Landing Zone Accelerator.
When it comes to security, the accelerators above include recommendations for securing your Azure API Management instance: for example, using Front Door or Application Gateway (see this repository) to provide Layer 7 protection and WAF capabilities, and implementing OAuth authentication on the API Management instance (see this repository).
We're also using Azure Monitor Private Link Scope. This allows you to define the boundaries of your monitoring network and only allow traffic from within that network to your Log Analytics workspace. This is a great way to secure your monitoring network.
In order to provide an end-to-end experience and enable users to demo from a GUI, we've included a Sample ChatApp. This is a simple Node.js application based on the Azure Chat Solution Accelerator. It uses Azure Cosmos DB to store the chat messages and leverages Azure Key Vault to store the secrets used in the application.
Azure API Management is a fully managed service that enables customers to publish, secure, transform, maintain, and monitor APIs. It is a great way to expose your APIs to the outside world in a secure and manageable way.
Azure OpenAI is a service that provides AI models that are trained on a large amount of data. You can use these models to generate text, images, and more.
Managed identities allow you to secure communication between services, without requiring you to manage any credentials yourself.
Azure Virtual Network allows you to create a private network in Azure. You can use this to secure communication between services.
Azure Private DNS Zone allows you to create a private DNS zone in Azure. You can use this to resolve hostnames in your private network.
Application Insights allows you to monitor your application. You can use this to monitor the performance of your application.
Log Analytics allows you to collect and analyze telemetry data from your application. You can use this to monitor the performance of your application.
Azure Monitor Private Link Scope allows you to define the boundaries of your monitoring network, and only allow traffic from within that network to your Log Analytics workspace. This is a great way to secure your monitoring network.
Azure Private Endpoint allows you to connect privately to a service powered by Azure Private Link. Private Endpoint uses a private IP address from your VNet, effectively bringing the service into your VNet.
Azure Container Apps allows you to run containerized applications in Azure without having to manage any infrastructure.
Azure Container Registry allows you to store and manage container images and artifacts in a private registry for all types of container deployments.
Azure Redis Cache allows you to use a secure open source Redis cache.
Azure Container Apps environment provides a secure boundary around one or more container apps, sharing the same virtual network and logging configuration.
Azure Cosmos DB allows you to use a fully managed NoSQL database for modern app development.
Azure Key Vault allows you to safeguard cryptographic keys and other secrets used by cloud apps and services.