Description
Describe the bug
When CREATE and DELETE HTTP requests on catalog and principal entities run in parallel with requests that list catalogs or principals, the list requests intermittently fail with a NullPointerException.
Polaris logs show the following:
INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Handling runtimeException null
INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Full RuntimeException: java.lang.NullPointerException
ERROR [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Unhandled exception returning INTERNAL_SERVER_ERROR: java.lang.NullPointerException
The returned HTTP status code is 500. For the catalog entity, the response body contains the following:
{
"error":
{
"message":null,
"type":"NullPointerException",
"code":500
}
}
For the principal entity, the body contains:
{
"error":
{
"message":"Cannot invoke \"org.apache.polaris.core.entity.PolarisBaseEntity.getCatalogId()\" because \"sourceEntity\" is null",
"type":"NullPointerException",
"code":500
}
}
Stack trace:
ERROR [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-52) Unhandled exception returning INTERNAL_SERVER_ERROR: java.lang.NullPointerException: Cannot invoke "org.apache.polaris.core.entity.PolarisBaseEntity.getCatalogId()" because "sourceEntity" is null
at org.apache.polaris.core.entity.PolarisEntity.<init>(PolarisEntity.java:187)
at org.apache.polaris.core.entity.PrincipalEntity.<init>(PrincipalEntity.java:26)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
at org.apache.polaris.service.admin.PolarisServiceImpl.listPrincipals(PolarisServiceImpl.java:256)
at org.apache.polaris.service.admin.PolarisServiceImpl_ClientProxy.listPrincipals(Unknown Source)
at org.apache.polaris.service.admin.api.PolarisPrincipalsApi.listPrincipals(PolarisPrincipalsApi.java:232)
at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass.listPrincipals$$superforward(Unknown Source)
at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass$$function$$6.apply(Unknown Source)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
at io.quarkus.micrometer.runtime.MicrometerTimedInterceptor.timedMethod(MicrometerTimedInterceptor.java:79)
at io.quarkus.micrometer.runtime.MicrometerTimedInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass.listPrincipals(Unknown Source)
at org.apache.polaris.service.admin.api.PolarisPrincipalsApi$quarkusrestinvoker$listPrincipals_8247ae723efb90ecd5dc9ca10b28b13ed5c10c1d.invoke(Unknown Source)
at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
at io.quarkus.vertx.core.runtime.VertxCoreRecorder$15.runWith(VertxCoreRecorder.java:638)
at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2675)
at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2654)
at org.jboss.threads.EnhancedQueueExecutor.runThreadBody(EnhancedQueueExecutor.java:1627)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1594)
at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:1583)
To Reproduce
Set up a test with two threads:
- Thread A - This thread performs the following operations repeatedly:
- Creates a new catalog or principal
- Immediately deletes the catalog or principal after creation
- Thread B - This thread repeatedly invokes the list operation for catalogs or principals.
Run both threads simultaneously for approximately 10 seconds.
During this time, the listing operations occasionally fail: the GET /catalogs or GET /principals endpoints sometimes throw a NullPointerException. However, the issue is not consistently reproducible and follows no specific pattern.
These are the endpoints I used for the test, with the host and resource being http://localhost:8181/api/management/v1:
- Create Catalog: POST /catalogs
- Delete Catalog: DELETE /catalogs/{catalogId}
- List Catalogs: GET /catalogs
- Create Principal: POST /principals
- Delete Principal: DELETE /principals/{principalId}
- List Principals: GET /principals
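The two-thread setup above can be sketched roughly as follows, using only the JDK HTTP client. This is a minimal sketch and not the exact test I ran: the class name, the `POLARIS_TOKEN` environment variable, the JSON payload for creating a principal, and the use of the principal name as the `{principalId}` path segment are all assumptions and may need adjusting for your deployment.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.atomic.AtomicInteger;

class ListRaceRepro {
    // Base URL from the report; the token source is an assumption.
    static final String BASE = "http://localhost:8181/api/management/v1";
    static final String TOKEN = System.getenv().getOrDefault("POLARIS_TOKEN", "unset");

    // Builds a management API request; a null body means no request body.
    static HttpRequest request(String method, String path, String body) {
        HttpRequest.BodyPublisher publisher = body == null
                ? HttpRequest.BodyPublishers.noBody()
                : HttpRequest.BodyPublishers.ofString(body);
        return HttpRequest.newBuilder(URI.create(BASE + path))
                .header("Authorization", "Bearer " + TOKEN)
                .header("Content-Type", "application/json")
                .method(method, publisher)
                .build();
    }

    // Sends the request and returns only the HTTP status code.
    static int send(HttpClient client, HttpRequest req) throws Exception {
        return client.send(req, HttpResponse.BodyHandlers.discarding()).statusCode();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        long deadline = System.currentTimeMillis() + 10_000; // run for about 10 seconds
        AtomicInteger serverErrors = new AtomicInteger();

        // Thread A: create a principal, then delete it immediately.
        Thread writer = new Thread(() -> {
            for (int i = 0; System.currentTimeMillis() < deadline; i++) {
                String name = "race_" + i;
                // Payload shape is an assumption about the create-principal request.
                String payload = "{\"principal\":{\"name\":\"" + name + "\"}}";
                try {
                    send(client, request("POST", "/principals", payload));
                    send(client, request("DELETE", "/principals/" + name, null));
                } catch (Exception e) {
                    System.err.println("writer stopped: " + e);
                    return;
                }
            }
        });

        // Thread B: list principals repeatedly and count 500 responses.
        Thread reader = new Thread(() -> {
            while (System.currentTimeMillis() < deadline) {
                try {
                    if (send(client, request("GET", "/principals", null)) == 500) {
                        serverErrors.incrementAndGet();
                    }
                } catch (Exception e) {
                    System.err.println("reader stopped: " + e);
                    return;
                }
            }
        });

        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println("GET /principals returned 500 " + serverErrors.get() + " time(s)");
    }
}
```

Swapping `/principals` for `/catalogs` (with a suitable catalog payload) should exercise the catalog variant of the same race. Because the failure is intermittent, a run that reports zero errors does not rule the issue out.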
Actual Behavior
The list operation sometimes returns a 500 error, with a NullPointerException in the logs.
It is unclear to me why this is happening, but based on the stack trace, it seems to be related to the list operation trying to access a catalog or principal entity that is in the process of being created or deleted by another thread.
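To illustrate the suspected failure mode, here is a small self-contained sketch. It is not the Polaris code: `Entity`, `Principal`, and both methods are hypothetical stand-ins. It assumes the listing step ends up with a `null` entry for an entity that was deleted between enumeration and loading, and that a wrapping constructor then dereferences it, which is what the `PolarisEntity.<init>` frame in the stack trace suggests.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class NullSafeListing {
    // Hypothetical stand-in for a loaded base entity.
    record Entity(String name) {}

    // Hypothetical wrapper whose constructor dereferences its source,
    // mirroring the pattern seen in the stack trace.
    record Principal(String name) {
        Principal(Entity source) {
            this(source.name()); // throws NullPointerException if source is null
        }
    }

    // Fails with a NullPointerException when any loaded entry is null.
    static List<Principal> unsafe(List<Entity> loaded) {
        return loaded.stream().map(e -> new Principal(e)).toList();
    }

    // Skips entries that disappeared before they could be loaded.
    static List<Principal> skippingMissing(List<Entity> loaded) {
        return loaded.stream()
                .filter(Objects::nonNull)
                .map(e -> new Principal(e))
                .toList();
    }

    public static void main(String[] args) {
        // The null simulates an entity deleted by the other thread.
        List<Entity> loaded = Arrays.asList(new Entity("a"), null, new Entity("b"));
        System.out.println(skippingMissing(loaded).size());
    }
}
```

Whether dropping missing entries is the right fix, as opposed to reading a consistent snapshot in the store, is a design question for the maintainers; the sketch only shows why a single null entry is enough to fail the whole list call.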
Expected Behavior
I would expect the list operation to return the current state of catalogs or principals, even while concurrent operations are in progress. The list operation should handle concurrent modifications gracefully.
Additional context
Test environment used the in-memory store.
System information
OS: MacOS Sonoma v15.4.1
Polaris Catalog Version: 1.0.0-incubating-SNAPSHOT
Object Storage: FILE