HTTP 500 Error: NoSuchElementException When Adhoc Cluster is Busy #447

Open
alaturqua opened this issue Aug 27, 2024 · 2 comments

alaturqua (Contributor) commented Aug 27, 2024

Description:

We are encountering an issue with the Trino Gateway setup when querying multiple clusters. Below are the details of our current configuration and the problem:

Configuration:

  • Clusters:
    • 1x Cluster adhoc
    • 1x Cluster trino-etl
    • 2x Cluster looker
  • Routing Groups: Configured for each cluster as listed above.
  • Connection Method: JDBC Connection

Issue:
When the adhoc cluster becomes busy, the JDBC queries for cluster stats time out, and the Trino Gateway becomes unreachable.
The same thing happens if we deactivate the adhoc cluster, or during a redeployment or restart of the Trino cluster.

This results in the following error message:

HTTP ERROR 500 java.util.NoSuchElementException: No value present
URI: /v1/statement
STATUS: 500
MESSAGE: java.util.NoSuchElementException: No value present
SERVLET: trinoRouter
CAUSED BY: java.util.NoSuchElementException: No value present

Stack Trace:

java.util.NoSuchElementException: No value present
    at java.base/java.util.Optional.orElseThrow(Optional.java:377)
    at io.trino.gateway.ha.router.QueryCountBasedRouter.provideAdhocBackend(QueryCountBasedRouter.java:227)
    at io.trino.gateway.ha.handler.QueryIdCachingProxyHandler.getBackendFromRoutingGroup(QueryIdCachingProxyHandler.java:345)
    at io.trino.gateway.ha.handler.QueryIdCachingProxyHandler.rewriteTarget(QueryIdCachingProxyHandler.java:313)
    at io.trino.gateway.proxyserver.ProxyServletImpl.rewriteTarget(ProxyServletImpl.java:92)
    at org.eclipse.jetty.proxy.ProxyServlet.service(ProxyServlet.java:51)
    at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
    at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1665)
    at io.trino.gateway.proxyserver.RequestFilter.doFilter(RequestFilter.java:40)
    at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
    at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
    at org.eclipse.jetty.proxy.ConnectHandler.handle(ConnectHandler.java:203)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
    at org.eclipse.jetty.server.Server.handle(Server.java:563)
    at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
    at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
    at java.base/java.lang.Thread.run(Thread.java:1583)

Steps to Reproduce:

  1. Configure Trino Gateway with the above-mentioned clusters and routing groups.
  2. Use JDBC to query the adhoc cluster.
  3. Observe the error when the adhoc cluster is busy.

Expected Behavior:
The Trino Gateway should handle busy clusters gracefully without causing a 500 error.

Actual Behavior:
The gateway becomes unreachable with a 500 error when the adhoc cluster is busy.

Environment:

  • Trino Gateway Version: 9
  • Jetty Version: 11.0.19

mosabua (Member) commented Aug 27, 2024

What do you mean by "handle busy clusters gracefully"? There is no queue or anything similar in Trino Gateway; it just routes traffic to clusters. In this case, if adhoc is busy and no other cluster is available for routing, what should the Trino Gateway do?

Chaho12 (Member) commented Aug 27, 2024

Hmm, I don't think we need a queue in Trino Gateway for now.
I shared this in Slack once, but I think we should improve how the gateway reports a routing failure (whatever the reason).

As of now, it returns a 500 error page, which does not tell the user what actually went wrong.

  • For example, when I run a query in the Trino CLI and intentionally block it at the routing manager for a certain routing group:
trino> select 'isblocked?';
Error running command: Error starting query at http://localhost:8080/v1/statement returned an invalid response: JsonResponse{statusCode=500, statusMessage=Server Error, headers={cache-control=[must-revalidate,no-cache,no-store], content-length=[372], content-type=[text/html;charset=iso-8859-1], date=[Tue, 06 Aug 2024 12:57:05 GMT]}, hasValue=false} [Error: <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 Request failed.</title>
</head>
<body>
<h2>HTTP ERROR 500 Request failed.</h2>
<table>
<tr><th>URI:</th><td>http://localhost:8080/v1/statement</td></tr>
<tr><th>STATUS:</th><td>500</td></tr>
<tr><th>MESSAGE:</th><td>Request failed.</td></tr>
</table>

</body>
</html>
]
trino> 
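
One possible shape for a more descriptive routing failure is sketched below. It is only an illustration of the idea, not how the gateway behaves today. The `Response` holder, the `routingFailure` and `route` helpers, the `ROUTING_FAILED` code, the JSON layout, and the choice of status 503 are all assumptions, and how Trino clients would react to such a response is not covered here.

```java
import java.util.Optional;

class RoutingFailureSketch {
    // Minimal response holder used only for this illustration.
    static final class Response {
        final int status;
        final String contentType;
        final String body;

        Response(int status, String contentType, String body) {
            this.status = status;
            this.contentType = contentType;
            this.body = body;
        }
    }

    // Escapes backslashes and double quotes so the text can sit inside a JSON string.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a response that names the routing group that could not be served.
    static Response routingFailure(String routingGroup) {
        String message = "No healthy backend available for routing group '" + routingGroup + "'";
        String body = "{\"error\":\"ROUTING_FAILED\",\"message\":\"" + jsonEscape(message) + "\"}";
        return new Response(503, "application/json", body);
    }

    // Falls back to the descriptive failure when no backend URL was selected.
    static Response route(String routingGroup, Optional<String> backendUrl) {
        return backendUrl
                .map(url -> new Response(200, "text/plain", "proxy to " + url))
                .orElseGet(() -> routingFailure(routingGroup));
    }

    public static void main(String[] args) {
        Response r = route("adhoc", Optional.empty());
        System.out.println(r.status + " " + r.contentType);
        System.out.println(r.body);
    }
}
```

The point of the sketch is that the response states the reason and the affected routing group, instead of a generic HTML error page.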
