[DOCS] KubeRay APIServer V2 document #3594

Open: machichima wants to merge 27 commits into master

Contributor

@machichima machichima commented May 13, 2025

  • Add a README describing when to use the APIServer, which also serves as an entry point to all other docs
  • Add an installation guide for installing and starting the APIServer SDK with Helm (currently a placeholder, as there is no Helm chart for it yet)
  • Add examples to create/delete/modify RayCluster, RayJob, and RayService

Why are these changes needed?

Add documentation and examples that users can easily follow when using the KubeRay APIServer SDK.

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

Comment on lines 44 to 46
curl -X POST 'localhost:31888/apis/ray.io/v1/namespaces/default/rayclusters' \
--header 'Content-Type: application/json' \
--data @docs/api-example/raycluster.json
Contributor Author

@machichima machichima May 15, 2025

Should I directly use a link to the JSON file instead of the path here (see below), like the KubeRay docs do?

However, that would make this example impossible to execute for now, as the file link does not exist yet.

curl -s https://raw.githubusercontent.com/ray-project/kuberay/v1.4.0/apiserversdk/docs/api-example/raycluster.json | \
curl -X POST 'localhost:31888/apis/ray.io/v1/namespaces/default/rayclusters' \
    --header 'Content-Type: application/json' \
    --data-binary @-

Contributor

as the file link does not exist yet

You can use a commit link?

Contributor

Should I directly use a link to the JSON file instead of the path here (see below), like the KubeRay docs do?

Current command looks good to me

Contributor

I'd like to avoid maintaining new JSON files if possible.

Can we just use the existing yamls https://github.com/ray-project/kuberay/tree/master/ray-operator/config/samples?

curl -s https://raw.githubusercontent.com/ray-project/kuberay/master/ray-operator/config/samples/....yaml | \
curl -X POST localhost:31888/apis/ray.io/v1/namespaces/default/rayclusters \
  -H "Content-Type: application/yaml" \
  --data-binary @-

Contributor Author

Hi @rueian

Sure! However, for RayJob, the YAML contains both a ConfigMap and a RayJob, so we need awk to split them. The commands would look like this:

# ConfigMap: the second YAML document (d==0, after the first '---'), applied with kubectl
curl -s https://raw.githubusercontent.com/ray-project/kuberay/v1.3.0/ray-operator/config/samples/ray-job.sample.yaml \
    | awk 'BEGIN{d=-1} /^---/{d++; next} d==0' \
    | kubectl apply -f -

# RayJob: the first YAML document (d==-1, before any '---'), posted to the APIServer
curl -s https://raw.githubusercontent.com/ray-project/kuberay/v1.3.0/ray-operator/config/samples/ray-job.sample.yaml \
  | awk 'BEGIN{d=-1} /^---/{d++; next} d==-1' \
  | curl -X POST http://localhost:31888/apis/ray.io/v1/namespaces/default/rayjobs \
    -H "Content-Type: application/yaml" \
    --data-binary @-

This makes the command more complex. Do you think this is ok?
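
For readers following the awk discussion, here is a minimal offline sketch of how that filter picks documents out of a multi-document YAML stream. The `sample` variable is an illustrative stand-in, not the real `ray-job.sample.yaml`, and the sketch assumes the file does not begin with a leading `---` (if it did, every index would shift by one).

```sh
#!/bin/sh
# Illustrative stand-in for a two-document YAML file (NOT the real sample).
sample='apiVersion: ray.io/v1
kind: RayJob
---
apiVersion: v1
kind: ConfigMap'

# d counts the '---' separators seen so far, starting at -1.
# Lines before the first separator have d==-1 (first document);
# lines after it have d==0 (second document). Separator lines are skipped.
first_doc=$(printf '%s\n' "$sample" | awk 'BEGIN{d=-1} /^---/{d++; next} d==-1')
second_doc=$(printf '%s\n' "$sample" | awk 'BEGIN{d=-1} /^---/{d++; next} d==0')

printf 'first document:\n%s\n\n' "$first_doc"
printf 'second document:\n%s\n' "$second_doc"
```

The same filter with `d==1` would select a third document, so the approach extends to longer files at the cost of readability.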

Comment on lines 21 to 24
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
# Install KubeRay APIServer SDK
helm install kuberay-apiserver-sdk kuberay/kuberay-apiserver-sdk --version 1.0.0
Contributor Author

This does not work yet because we do not have a kuberay-apiserver-sdk Helm chart. We can update it afterwards.

Contributor

we will use the same chart and the same container image. see #3603 (review)

Contributor Author

Got it! I will use kuberay-apiserver directly then.
Thanks for the information!

@machichima machichima marked this pull request as ready for review May 15, 2025 14:31
@machichima
Contributor Author

Hi @dentiny ,

Would you mind taking a look at this? Thanks!

Contributor

@dentiny dentiny left a comment

quick first iteration, will continue later today.

# KubeRay APIServer SDK

The KubeRay APIServer SDK is the HTTP proxy to the Kubernetes API server with the same
interface. User can directly use Kubernetes OpenAPI Spec and CRD for create, delete, and
Contributor

mention query as well? People usually say "CRUD" together

interface. User can directly use Kubernetes OpenAPI Spec and CRD for create, delete, and
update Ray resources. It contains following highlight features:

1. Enable creating ComputeTemplate, which support setting default values that can be used
Contributor

curious do we support compute template in v2 for now? I thought it's in stage-2

Contributor

No, we don't. Can we remove this statement for now?

Contributor Author

Sure! I will remove this


## When to use APIServer SDK

KubeRay APIServer SDK featured in simplify Ray resources management by hiding
Contributor

features

Contributor

simplifying

Contributor

I think "hiding Kubernetes-specific details" isn't correct. Can we just remove it and only keep "You can consider using APIServer SDK if"

Contributor Author

No problem!

- You want to interact with Ray clusters via HTTP/REST (e.g., from a UI, SDK, or CLI).
- Your team prefers a simplified, non-Kubernetes-specific API surface to manage resources
lifecycles.
- You want to create templates or defulat values to simplify the configuration setup.
Contributor

default

## When to use APIServer SDK

KubeRay APIServer SDK featured in simplify Ray resources management by hiding
Kubernetes-specific details. You can considering using APIServer SDK if:

- You want to interact with Ray clusters via HTTP/REST (e.g., from a UI, SDK, or CLI).
- Your team prefers a simplified, non-Kubernetes-specific API surface to manage resources
Contributor

Suggested change
- Your team prefers a simplified, non-Kubernetes-specific API surface to manage resources

Contributor

This is incorrect too.

For custom resources defined by CRDs (e.g. `RayCluster`, `RayJob`, etc.), the endpoint format is:

```sh
<baseURL>/apis/<group>/<version>/namespaces/<namespace>/<resourceType>/<resourceName>
```
Contributor

Suggested change
- <baseURL>/apis/<group>/<version>/namespaces/<namespace>/<resourceType>/<resourceName>
+ <baseURL>/apis/ray.io/v1/namespaces/<namespace>/<resourceType>/<resourceName>

Comment on lines 55 to 56
- `group` = `ray.io`
- `version` = `v1`
Contributor

Suggested change
- `group` = `ray.io`
- `version` = `v1`

Contributor

Just make it simpler by embedding the group and the version in the URL.
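
As a rough illustration of the simplified endpoint format, with `ray.io` and `v1` embedded in the path, the sketch below builds both collection and single-resource URLs. `ray_endpoint` and `BASE_URL` are hypothetical names introduced here, not part of KubeRay; the default address assumes the `localhost:31888` used elsewhere in this thread.

```sh
#!/bin/sh
# Hypothetical helper (not part of KubeRay): build an APIServer endpoint for a
# Ray custom resource, with the group (ray.io) and version (v1) fixed in the path.
# BASE_URL is an assumption matching the localhost:31888 address in this PR.
BASE_URL="${BASE_URL:-http://localhost:31888}"

ray_endpoint() {
  namespace="$1"         # target namespace, e.g. default
  resource_type="$2"     # plural resource type, e.g. rayclusters, rayjobs
  resource_name="${3:-}" # optional; omit to address the whole collection
  url="$BASE_URL/apis/ray.io/v1/namespaces/$namespace/$resource_type"
  if [ -n "$resource_name" ]; then
    url="$url/$resource_name"
  fi
  printf '%s\n' "$url"
}

# Collection endpoint (used for create and list requests)
ray_endpoint default rayclusters
# Single-resource endpoint (used for get, update, and delete requests)
ray_endpoint default rayclusters raycluster-kuberay
```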

Comment on lines 61 to 71
#### Core Kubernetes Resources

For built-in Kubernetes resources (e.g. `ConfigMap`), the endpoint format is:

```sh
<baseURL>/api/v1/namespaces/<namespace>/<resourceType>/<resourceName>
```

- `namespace`: The target namespace
- `resourceType`: Core resource type (e.g. `pods`, `configmaps`, `services`)
- `resourceName`: Name of the resource
Contributor

Suggested change
#### Core Kubernetes Resources
For built-in Kubernetes resources (e.g. `ConfigMap`), the endpoint format is:
```sh
<baseURL>/api/v1/namespaces/<namespace>/<resourceType>/<resourceName>
```
- `namespace`: The target namespace
- `resourceType`: Core resource type (e.g. `pods`, `configmaps`, `services`)
- `resourceName`: Name of the resource

We don't provide access to built-in Kubernetes resources by default.

Contributor Author

Got it! Will remove this

@@ -0,0 +1,223 @@
# RayCluster QuickStart

RayService consists of a RayCluster and Ray Serve deployment graphs. It offers
Contributor

may I ask what's a "deployment graph"?

Contributor Author

I think it's a DAG that describes the deployments. I grabbed the information from here:
https://docs.ray.io/en/latest/cluster/kubernetes/getting-started.html#custom-resource-definitions-crds

Contributor

@dentiny dentiny left a comment

A question on this PR's scope: this PR is titled "apiserver v2 doc", so I expected to see only "apiserversdk"-related changes. Why do we have RayJob/RayService/RayCluster changes here?
I thought they were part of your other doc improvement PR: #3564

Signed-off-by: machichima <[email protected]>
@machichima machichima requested a review from dentiny May 18, 2025 09:53
@rueian
Contributor

rueian commented May 18, 2025

But for RayJob we need to use awk, because the sample contains both a RayJob and a ConfigMap in a single YAML file

Hi @machichima,

Can we use https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-job.interactive-mode.yaml, which has no ConfigMap?

Comment on lines 46 to 62
```sh
kubectl get rayclusters
# NAME                 DESIRED WORKERS   AVAILABLE WORKERS   CPUS   MEMORY   GPUS   STATUS   AGE
# raycluster-kuberay   1                                     2      3G       0               89s
```

The KubeRay operator detects the RayCluster object and starts your Ray cluster by creating head and worker pods. To view
Ray cluster’s pods, run the following command:

```sh
# View the pods in the RayCluster named "raycluster-kuberay"
kubectl get pods --selector=ray.io/cluster=raycluster-kuberay

# NAME                                          READY   STATUS    RESTARTS   AGE
# raycluster-kuberay-head-k7rlq                 1/1     Running   0          56s
# raycluster-kuberay-workergroup-worker-65zl8   1/1     Running   0          56s
```
Contributor

I wonder if we really need these kubectl details in this document. I think a curl http://localhost:31888/apis/ray.io/v1/namespaces/default/rayclusters/raycluster-kuberay example for checking the cluster status is enough.

For other kubectl operations, can we just refer them to the ray-operator document?
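
A curl-only replacement for the kubectl steps might look like the sketch below. `raycluster_curl` and `BASE_URL` are hypothetical names, and the helper only prints the command so it can be inspected without a running cluster. The `get` form is the one suggested above; `list` and `delete` assume the APIServer forwards those requests the same way the Kubernetes API does.

```sh
#!/bin/sh
# Hypothetical dry-run helper (not part of KubeRay): print the curl command
# that stands in for a kubectl operation on a RayCluster.
# BASE_URL assumes the localhost:31888 address used in this PR.
BASE_URL="${BASE_URL:-http://localhost:31888}"

raycluster_curl() {
  verb="$1"          # get, list, or delete
  name="${2:-}"      # RayCluster name (unused for list)
  ns="${3:-default}" # namespace, defaulting to "default"
  collection="$BASE_URL/apis/ray.io/v1/namespaces/$ns/rayclusters"
  case "$verb" in
    get)    echo "curl -s $collection/$name" ;;
    list)   echo "curl -s $collection" ;;
    delete) echo "curl -s -X DELETE $collection/$name" ;;
    *)      echo "unsupported verb: $verb" >&2; return 1 ;;
  esac
}

# Print the status-check command for the cluster discussed above
raycluster_curl get raycluster-kuberay
```

Pasting the printed command into a shell with the port-forward active would then issue the actual request.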

Contributor Author

Sure! That seems better

I copied some of the content from the KubeRay operator docs to make the example more complete. I will make it minimal and focus only on the APIServer curl commands.

Contributor Author

Hi @rueian
I updated the docs to use only curl commands and avoid kubectl. Please take a look and see if it looks good to you.
Thanks!

@dentiny
Contributor

dentiny commented May 21, 2025

@machichima Hmm for these doc related changes, maybe next time we could get copilot / gpt to do a sanity check on spelling and wording.

@machichima
Contributor Author

@machichima Hmm for these doc related changes, maybe next time we could get copilot / gpt to do a sanity check on spelling and wording.

Thanks for the suggestion! I just proofread the docs with Copilot to fix the wording issues.
They should be fine now.

Contributor

@dentiny dentiny left a comment

Thanks!

Contributor

@rueian rueian left a comment

LGTM

@dentiny
Contributor

dentiny commented May 23, 2025

@kevin85421 for a final look. The failed CI seems to be a flaky test and has nothing to do with this PR (which makes no functional changes).

@kevin85421
Member

kevin85421 commented May 24, 2025

In today's sync, I will review this after @troychiu's PR to support API server V2 is merged.

@machichima
Contributor Author

Added the following to help users follow the example without confusion:

  • Guidance in Installation.md on installing the APIServer without the security proxy
  • A troubleshooting page showing how to resolve "Unauthorized" errors when sending requests to the APIServer

Use the following command for port-forwarding to access the APIServer through port 31888:

```sh
kubectl port-forward service/kuberay-apiserver-service 31888:8888
```
Contributor Author

Added the port-forward here, which requires the changes in PR #3708

4 participants