This repository was archived by the owner on Mar 13, 2025. It is now read-only.

serve status fails on the head pod after model is deployed #141

@viirya

Description


I've deployed RayLLM on K8s with the amazon--LightGPT model. I can query it from my local machine using curl.

But when I run serve status on the head pod, I get the following error:

serve config
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/serve", line 8, in <module>
    sys.exit(cli())
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/scripts.py", line 554, in config
    **ServeSubmissionClient(address).get_serve_details()
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/dashboard/modules/serve/sdk.py", line 61, in __init__
    self._check_connection_and_version_with_url(
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/dashboard/modules/dashboard_sdk.py", line 267, in _check_connection_and_version_with_url
    r.raise_for_status()
  File "/home/ray/anaconda3/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: http://localhost:8265/api/ray/version

However, the same URL responds successfully when I fetch it manually:

wget http://localhost:8265/api/ray/version
--2024-03-11 16:01:42--  http://localhost:8265/api/ray/version
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:8265... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:8265... connected.
HTTP request sent, awaiting response... 200 OK

Is there anything I'm missing?
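To compare against what the CLI sees, the same endpoint can also be probed from Python on the head pod. This is only a minimal diagnostic sketch using the standard library: the helper name `probe` is hypothetical, and the second call, which ignores any proxy settings from the environment, is just one difference between clients worth ruling out, not a confirmed cause of the 503.

```python
import urllib.error
import urllib.request

# URL taken from the traceback above.
URL = "http://localhost:8265/api/ray/version"


def probe(url, use_env_proxies=True, timeout=5.0):
    """Return the HTTP status code of a GET on url (hypothetical helper)."""
    if use_env_proxies:
        # Default opener: honors http_proxy / https_proxy / no_proxy if set.
        opener = urllib.request.build_opener()
    else:
        # An empty mapping disables proxy autodetection from the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report the code instead.
        return err.code


if __name__ == "__main__":
    for use_env in (True, False):
        try:
            print(f"use_env_proxies={use_env}: HTTP {probe(URL, use_env)}")
        except OSError as err:
            # Connection refused, timeouts, DNS failures, and similar.
            print(f"use_env_proxies={use_env}: request failed: {err}")
```

If the two calls return different status codes, the environment of the shell running `serve` is a likely place to look next; if both return 200, the difference lies elsewhere.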
