Commit e0ade1f

Merge pull request #672 from YashPandit4u/mie-api-payload-approach-samples
Multiple Inference Endpoints Samples and Guide
2 parents 9f31f49 + 52d3915 commit e0ade1f

5 files changed: +338 -0 lines changed

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
# Multiple Inference Endpoints (MIE) - API Approach Samples

## Overview

Oracle Cloud Infrastructure (OCI) Data Science now supports **Multiple Inference Endpoints (MIE)**, which let you expose a model deployment at custom endpoints in addition to the standard `/predict` endpoint. With this feature you can:

- **Use OpenAI-compatible API specifications**
- **Define custom endpoints** tailored to your specific use cases


## Prerequisites

Before you begin, ensure you have:

1. **OCI SDK for Python** installed (MIE is currently supported in a preview SDK version)
   ```bash
   pip install --trusted-host=artifactory.oci.oraclecorp.com -i https://artifactory.oci.oraclecorp.com/api/pypi/global-dev-pypi/simple -U oci==2.160.2+preview.1.278
   ```

2. **OCI Configuration** set up
   - A valid OCI config file at `~/.oci/config`
   - Required credentials: `tenancy`, `user`, `fingerprint`, `key_file`, and optionally `pass_phrase`
   - For security token authentication: `security_token_file` configured

3. **Additional dependencies** (for streaming inference)
   ```bash
   pip install sseclient-py requests
   ```

4. **Required OCIDs**:
   - Project OCID
   - Compartment OCID
   - Model OCID (if creating a new deployment)

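Before running any of the samples, it can help to confirm that the config file actually contains the keys listed above. The following is a minimal sketch using only the Python standard library; the helper name `missing_config_keys` is hypothetical and not part of the OCI SDK. It assumes that API-key signing needs the four credentials named in the list, and that security token signing needs `key_file` and `security_token_file`, which are the two keys the sample scripts read in that mode.

```python
import configparser
from pathlib import Path

# Keys the Prerequisites list names as required for API-key signing.
API_KEY_REQUIRED = ("tenancy", "user", "fingerprint", "key_file")
# Keys the sample scripts read when building a SecurityTokenSigner (assumption).
TOKEN_REQUIRED = ("key_file", "security_token_file")


def missing_config_keys(config_path, profile="DEFAULT", use_security_token=False):
    """Return the required keys that are absent or empty in the given profile."""
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(Path(config_path).expanduser()):
        raise FileNotFoundError(f"cannot read OCI config at {config_path}")
    if profile != "DEFAULT" and not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found in {config_path}")
    values = parser[profile]
    required = TOKEN_REQUIRED if use_security_token else API_KEY_REQUIRED
    return [key for key in required if not values.get(key)]
```

Calling `missing_config_keys("~/.oci/config")` returns an empty list when every key for the chosen authentication mode is present.
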
## Quick Start

### Step 1: Create a Model Deployment with Multiple Endpoints

Use `create_mie_md.py` to create a model deployment with custom inference endpoints:

```bash
python create_mie_md.py
```

**Key Configuration Options:**

- **`predict_api_specification`**: Set to `"openai"` for OpenAI-compatible endpoints
- **`custom_http_endpoints`**: Define custom endpoint URI suffixes and supported HTTP methods
  ```python
  custom_http_endpoints=[
      oci.data_science.models.InferenceHttpEndpoint(
          endpoint_uri_suffix="/v1/custom/completions",
          http_methods=[HTTPMethod.GET, HTTPMethod.POST]
      )
  ]
  ```

**Before running**, update the following in `create_mie_md.py`:
- `project_id`: Your Data Science project OCID
- `compartment_id`: Your compartment OCID
- `model_id`: Your model OCID
- `image` and `image_digest`: Your container image details
- `display_name` and `description`: Descriptive names for your deployment

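A quick sanity check on the OCIDs can catch copy-paste mistakes before the create call is sent. The sketch below is illustrative only: `check_ocids` is a hypothetical helper, and the expected resource types are assumptions read off the sample values in `create_mie_md.py` (which uses a tenancy OCID as the compartment), not an authoritative list.

```python
# Resource types expected in the second OCID segment (assumed from the samples).
EXPECTED_RESOURCE_TYPES = {
    "project_id": {"datascienceproject"},
    "compartment_id": {"compartment", "tenancy"},
    "model_id": {"datasciencemodel"},
}


def check_ocids(values):
    """Return a list of problems; an empty list means every OCID looks plausible."""
    problems = []
    for name, allowed in EXPECTED_RESOURCE_TYPES.items():
        ocid = values.get(name, "")
        parts = ocid.split(".")
        # An OCID has at least five dot-separated segments and starts with "ocid1".
        if len(parts) < 5 or parts[0] != "ocid1":
            problems.append(f"{name}: not an OCID: {ocid!r}")
        elif parts[1] not in allowed:
            problems.append(f"{name}: unexpected resource type {parts[1]!r}")
    return problems
```

For example, passing a model OCID as `project_id` produces one entry that names the offending parameter and the resource type that was found.
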
### Step 2: Update an Existing Model Deployment

Use `update_mie_md.py` to modify an existing model deployment's endpoints:

```bash
python update_mie_md.py
```

**Before running**, update:
- `model_deployment_id`: The OCID of your existing model deployment
- Any configuration parameters you wish to modify (endpoints, API specification, etc.)

### Step 3: Make Inference Calls

#### Standard (Non-Streaming) Inference

Use `inference_mie.py` for standard inference calls:

```bash
python inference_mie.py
```

**Before running**, update:
- `endpoint`: Your model deployment endpoint URL (e.g., `https://<deployment-id>.modeldeployment.<region>.oci.customer-oci.com/v1/your/endpoint`)
- `body`: Your inference request payload

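The `endpoint` value is the deployment's base URL followed by one of the suffixes configured in `custom_http_endpoints`. A minimal sketch of joining the two safely is shown here; `build_endpoint_url` is a hypothetical helper, and the base URL in the usage note is the placeholder pattern from the example above, not a real deployment.

```python
def build_endpoint_url(base_url, endpoint_uri_suffix):
    """Join a deployment base URL and an endpoint suffix with exactly one slash."""
    if not base_url.startswith("https://"):
        raise ValueError("base_url must start with https://")
    if not endpoint_uri_suffix.strip("/"):
        raise ValueError("endpoint_uri_suffix must not be empty")
    return base_url.rstrip("/") + "/" + endpoint_uri_suffix.lstrip("/")
```

With the placeholder base `https://<deployment-id>.modeldeployment.<region>.oci.customer-oci.com/` and the suffix `/v1/custom/completions`, the helper returns a URL with a single slash between host and path, whether or not either side carries its own slash.
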
#### Streaming Inference

Use `inference_mie_streaming.py` for Server-Sent Events (SSE) streaming inference:

```bash
python inference_mie_streaming.py
```
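
If you prefer not to depend on `sseclient-py`, the stream can be decoded by hand from `response.iter_lines()`. The sketch below rests on two assumptions that the samples do not confirm: each event carries one JSON object on a single `data:` line, and the serving container ends the stream with the OpenAI-style `[DONE]` sentinel. `iter_sse_json` is a hypothetical helper name.

```python
import json


def iter_sse_json(lines, done_marker="[DONE]"):
    """Yield decoded JSON payloads from an iterable of raw SSE lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == done_marker:
            break  # assumed end-of-stream sentinel
        yield json.loads(payload)
```

In the streaming script this would be used as `for chunk in iter_sse_json(response.iter_lines()): print(chunk)`, in place of the `sseclient.SSEClient` loop.
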
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
# This is an automatically generated code sample.
# To make this code sample work in your Oracle Cloud tenancy,
# please replace the values for any parameters whose current values do not fit
# your use case (such as resource IDs, strings containing ‘EXAMPLE’ or ‘unique_id’, and
# boolean, number, and enum parameters with values not fitting your use case).

from http import HTTPMethod  # http.HTTPMethod requires Python 3.11+
import oci
import sys

print(oci.__version__)
# sys.exit()

# Create a default config using DEFAULT profile in default location
# Refer to
# https://docs.cloud.oracle.com/en-us/iaas/Content/API/Concepts/sdkconfig.htm#SDK_and_CLI_Configuration_File
# for more info
# config = oci.config.from_file()

# Custom signer
config = oci.config.from_file(profile_name='default')
token_file = config['security_token_file']
token = None
with open(token_file, 'r') as f:
    token = f.read()
private_key = oci.signer.load_private_key_from_file(config['key_file'])

signer = oci.auth.signers.SecurityTokenSigner(token, private_key)


# Initialize service client with default config file
data_science_client = oci.data_science.DataScienceClient(config, signer=signer)


# Send the request to service, some parameters are not required, see API
# doc for more info
create_model_deployment_response = data_science_client.create_model_deployment(
    create_model_deployment_details=oci.data_science.models.CreateModelDeploymentDetails(
        project_id="ocid1.datascienceproject.oc1.iad.amaaaaaav66vvniaqsyu2nufljutkn4rzth2nz4q3zqslirct7eayl5ojpma",
        compartment_id="ocid1.tenancy.oc1..aaaaaaaahzy3x4boh7ipxyft2rowu2xeglvanlfewudbnueugsieyuojkldq",
        model_deployment_configuration_details=oci.data_science.models.SingleModelDeploymentConfigurationDetails(
            deployment_type="SINGLE_MODEL",
            model_configuration_details=oci.data_science.models.ModelConfigurationDetails(
                model_id="ocid1.datasciencemodel.oc1.iad.amaaaaaav66vvnia36wjp3kb542uicbssflwid55wqf6zzzbxk5ekhhja4eq",
                instance_configuration=oci.data_science.models.InstanceConfiguration(
                    instance_shape_name="VM.Standard.E4.Flex",
                    model_deployment_instance_shape_config_details=oci.data_science.models.ModelDeploymentInstanceShapeConfigDetails(
                        ocpus=1,
                        memory_in_gbs=16,
                        cpu_baseline="BASELINE_1_1"),
                ),
                scaling_policy=oci.data_science.models.FixedSizeScalingPolicy(
                    policy_type="FIXED_SIZE",
                    instance_count=1
                ),
                bandwidth_mbps=10,
            ),
            environment_configuration_details=oci.data_science.models.OcirModelDeploymentEnvironmentConfigurationDetails(
                environment_configuration_type="OCIR_CONTAINER",
                server_port=8000,
                health_check_port=8000,
                predict_api_specification="openai",
                custom_http_endpoints=[
                    oci.data_science.models.InferenceHttpEndpoint(
                        endpoint_uri_suffix="/v1/custom/completions",
                        http_methods=[
                            HTTPMethod.GET,
                            HTTPMethod.POST
                        ],
                    )
                ],
                image='iad.ocir.io/ociodscdev/dsmc/inferencing/odsc-vllm-serving:query-path-test-1',
                image_digest='sha256:6522a4728d8030f97a686b46c416797f7106ccef7e2687d6fa9b594e41809200',
                entrypoint=[],
                environment_variables={
                    'EXAMPLE_KEY_eSrrI': 'EXAMPLE_VALUE_K27qWX7kLrt6ZTDIGtdJ'})),
        display_name="MIE-Python-SDK-Text",
        description="Testing preview python sdk for MIE",
    ),
)

# Get the data from response
print(create_model_deployment_response.data)
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# The OCI SDK must be installed for this example to function properly.
# Installation instructions can be found here: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/pythonsdk.htm

import requests
import oci
from oci.signer import Signer
from oci.config import from_file

config = from_file('~/.oci/config')
auth = Signer(
    tenancy=config['tenancy'],
    user=config['user'],
    fingerprint=config['fingerprint'],
    private_key_file_location=config['key_file'],
    pass_phrase=config['pass_phrase']
)

# For security token based authentication
# token_file = config['security_token_file']
# token = None
# with open(token_file, 'r') as f:
#     token = f.read()
# private_key = oci.signer.load_private_key_from_file(config['key_file'])
# auth = oci.auth.signers.SecurityTokenSigner(token, private_key)

endpoint = "<your-endpoint>"
# Use your appropriate body here
body = {
    "model": "odsc-llm",
    "prompt": "Who invented Internet",
    "max_tokens": 5,
    "stream": False
}

headers = {'Content-Type': 'application/json', 'Accept': 'application/json'}
response = requests.post(endpoint, json=body, auth=auth, headers=headers)

print(response.headers)
print(response.json())
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
# The OCI SDK must be installed for this example to function properly.
# Installation instructions can be found here: https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/pythonsdk.htm

import requests
import oci
from oci.signer import Signer
from oci.config import from_file
import sseclient  # pip install sseclient-py

config = from_file('~/.oci/config')
auth = Signer(
    tenancy=config['tenancy'],
    user=config['user'],
    fingerprint=config['fingerprint'],
    private_key_file_location=config['key_file'],
    pass_phrase=config['pass_phrase']
)

# For security token based authentication
# token_file = config['security_token_file']
# token = None
# with open(token_file, 'r') as f:
#     token = f.read()
# private_key = oci.signer.load_private_key_from_file(config['key_file'])
# auth = oci.auth.signers.SecurityTokenSigner(token, private_key)

endpoint = "<your-endpoint>"
# Use your appropriate body here
body = {
    "model": "odsc-llm",
    "prompt": "Who invented Internet",
    "max_tokens": 5,
    "stream": True
}

headers = {'Content-Type': 'application/json', 'Accept': 'text/event-stream'}
response = requests.post(endpoint, json=body, auth=auth, stream=True, headers=headers)

print(response.headers)

client = sseclient.SSEClient(response)
for event in client.events():
    print(event.data)

# Alternatively, we can use the below code to print the response.
# for line in response.iter_lines():
#     if line:
#         print(line)
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
# This is an automatically generated code sample.
# To make this code sample work in your Oracle Cloud tenancy,
# please replace the values for any parameters whose current values do not fit
# your use case (such as resource IDs, strings containing ‘EXAMPLE’ or ‘unique_id’, and
# boolean, number, and enum parameters with values not fitting your use case).

import oci
from http import HTTPMethod  # http.HTTPMethod requires Python 3.11+

print(oci.__version__)
# sys.exit()

# Create a default config using DEFAULT profile in default location
# Refer to
# https://docs.cloud.oracle.com/en-us/iaas/Content/API/Concepts/sdkconfig.htm#SDK_and_CLI_Configuration_File
# for more info
# config = oci.config.from_file()

# Custom signer
config = oci.config.from_file(profile_name='default')
token_file = config['security_token_file']
token = None
with open(token_file, 'r') as f:
    token = f.read()
private_key = oci.signer.load_private_key_from_file(config['key_file'])

signer = oci.auth.signers.SecurityTokenSigner(token, private_key)


# Initialize service client with default config file
data_science_client = oci.data_science.DataScienceClient(config, signer=signer)


# Send the request to service, some parameters are not required, see API
# doc for more info
update_model_deployment_response = data_science_client.update_model_deployment(
    model_deployment_id="ocid1.datasciencemodeldeployment.oc1.iad.amaaaaaav66vvniawqof2ppnqeocdyrkqto2ntkem2cbrm2fqebfx7cqmysq",
    update_model_deployment_details=oci.data_science.models.UpdateModelDeploymentDetails(
        display_name="MIE-Python-SDK-Text-Update",
        description="Sample update for python sdk for MIE",
        model_deployment_configuration_details=oci.data_science.models.UpdateSingleModelDeploymentConfigurationDetails(
            deployment_type="SINGLE_MODEL",
            environment_configuration_details=oci.data_science.models.UpdateOcirModelDeploymentEnvironmentConfigurationDetails(
                environment_configuration_type="OCIR_CONTAINER",
                server_port=8000,
                health_check_port=8000,
                predict_api_specification="openai",
                custom_http_endpoints=[
                    oci.data_science.models.InferenceHttpEndpoint(
                        endpoint_uri_suffix="/v1/custom/completions",
                        http_methods=[
                            HTTPMethod.GET,
                            HTTPMethod.POST
                        ],
                    )
                ],
                image='iad.ocir.io/ociodscdev/dsmc/inferencing/odsc-vllm-serving:query-path-test-1',
                image_digest='sha256:6522a4728d8030f97a686b46c416797f7106ccef7e2687d6fa9b594e41809200',
                entrypoint=[],
                environment_variables={
                    'EXAMPLE_KEY_eSrrI': 'EXAMPLE_VALUE_K27qWX7kLrt6ZTDIGtdJ'}
            )
        ),
    )
)

# Print the response headers
print(update_model_deployment_response.headers)
