NullPointerException During Concurrent Create/Delete and List Operations #1388

Open
@BenjaminSSL

Describe the bug

When I send concurrent CREATE and DELETE HTTP requests for catalog and principal entities while also listing catalogs or principals, the list call intermittently fails with a NullPointerException.

Polaris logs show the following:

INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Handling runtimeException null
INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Full RuntimeException: java.lang.NullPointerException
ERROR [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-54) Unhandled exception returning INTERNAL_SERVER_ERROR: java.lang.NullPointerException

The returned HTTP status code is 500. For the catalog entity, the response body contains the following:

{
   "error":
   {
      "message":null,
      "type":"NullPointerException",
      "code":500
   }
}

For the principal entity, the body contains:

{
   "error":
   {
      "message":"Cannot invoke \"org.apache.polaris.core.entity.PolarisBaseEntity.getCatalogId()\" because \"sourceEntity\" is null",
      "type":"NullPointerException",
      "code":500
   }
}

Stack trace:

ERROR [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-52) Unhandled exception returning INTERNAL_SERVER_ERROR: java.lang.NullPointerException: Cannot invoke "org.apache.polaris.core.entity.PolarisBaseEntity.getCatalogId()" because "sourceEntity" is null
	at org.apache.polaris.core.entity.PolarisEntity.<init>(PolarisEntity.java:187)
	at org.apache.polaris.core.entity.PrincipalEntity.<init>(PrincipalEntity.java:26)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
	at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
	at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
	at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
	at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
	at org.apache.polaris.service.admin.PolarisServiceImpl.listPrincipals(PolarisServiceImpl.java:256)
	at org.apache.polaris.service.admin.PolarisServiceImpl_ClientProxy.listPrincipals(Unknown Source)
	at org.apache.polaris.service.admin.api.PolarisPrincipalsApi.listPrincipals(PolarisPrincipalsApi.java:232)
	at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass.listPrincipals$$superforward(Unknown Source)
	at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass$$function$$6.apply(Unknown Source)
	at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
	at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
	at io.quarkus.micrometer.runtime.MicrometerTimedInterceptor.timedMethod(MicrometerTimedInterceptor.java:79)
	at io.quarkus.micrometer.runtime.MicrometerTimedInterceptor_Bean.intercept(Unknown Source)
	at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
	at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
	at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
	at org.apache.polaris.service.admin.api.PolarisPrincipalsApi_Subclass.listPrincipals(Unknown Source)
	at org.apache.polaris.service.admin.api.PolarisPrincipalsApi$quarkusrestinvoker$listPrincipals_8247ae723efb90ecd5dc9ca10b28b13ed5c10c1d.invoke(Unknown Source)
	at org.jboss.resteasy.reactive.server.handlers.InvocationHandler.handle(InvocationHandler.java:29)
	at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:141)
	at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
	at io.quarkus.vertx.core.runtime.VertxCoreRecorder$15.runWith(VertxCoreRecorder.java:638)
	at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2675)
	at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2654)
	at org.jboss.threads.EnhancedQueueExecutor.runThreadBody(EnhancedQueueExecutor.java:1627)
	at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1594)
	at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
	at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1583)

To Reproduce

Set up a test with two threads:

  1. Thread A - This thread performs the following operations repeatedly:
    • Creates a new catalog or principal
    • Immediately deletes the catalog or principal after creation
  2. Thread B - This thread repeatedly invokes the list operation for catalogs or principals.

Run both threads simultaneously for approximately 10 seconds.
During this time, you may observe occasional failures in the listing operations. Specifically, the GET /catalogs or GET /principals endpoints sometimes throw a NullPointerException. However, the issue is not consistently reproducible, and follows no specific pattern.

These are the endpoints I used for the test, with the host and resource being http://localhost:8181/api/management/v1:

  • Create Catalog: POST /catalogs
  • Delete Catalog: DELETE /catalogs/{catalogId}
  • List Catalogs: GET /catalogs
  • Create Principal: POST /principals
  • Delete Principal: DELETE /principals/{principalId}
  • List Principals: GET /principals
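The two-thread setup above can be sketched roughly as follows for the principal endpoints. This is only an illustration, not the exact test I ran: the class name `ConcurrentListRepro`, the `POLARIS_TOKEN` environment variable, the bearer-token header, and the JSON request body are assumptions and may need adjusting for your deployment.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

public class ConcurrentListRepro {

  // Runs `mutate` (thread A) and `listStatus` (thread B) in parallel until the
  // deadline and returns how many list calls reported an HTTP 5xx status.
  static int run(Runnable mutate, IntSupplier listStatus, Duration duration) {
    long deadline = System.nanoTime() + duration.toNanos();
    AtomicInteger serverErrors = new AtomicInteger();

    Thread threadA = new Thread(() -> {
      while (System.nanoTime() < deadline) {
        mutate.run();
      }
    });
    Thread threadB = new Thread(() -> {
      while (System.nanoTime() < deadline) {
        if (listStatus.getAsInt() >= 500) {
          serverErrors.incrementAndGet();
        }
      }
    });

    threadA.start();
    threadB.start();
    try {
      threadA.join();
      threadB.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return serverErrors.get();
  }

  // Sends the request and returns the status code, or -1 if the call failed.
  static int send(HttpClient client, HttpRequest request) {
    try {
      return client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
    } catch (Exception e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    String base = "http://localhost:8181/api/management/v1";
    String auth = "Bearer " + System.getenv("POLARIS_TOKEN"); // assumed auth scheme
    HttpClient client = HttpClient.newHttpClient();
    AtomicInteger counter = new AtomicInteger();

    // Thread A: create a principal, then delete it immediately.
    Runnable createThenDelete = () -> {
      String name = "repro_" + counter.incrementAndGet();
      String body = "{\"principal\":{\"name\":\"" + name + "\"}}"; // assumed request shape
      send(client, HttpRequest.newBuilder(URI.create(base + "/principals"))
          .header("Authorization", auth)
          .header("Content-Type", "application/json")
          .POST(HttpRequest.BodyPublishers.ofString(body))
          .build());
      send(client, HttpRequest.newBuilder(URI.create(base + "/principals/" + name))
          .header("Authorization", auth)
          .DELETE()
          .build());
    };

    // Thread B: list principals and report the status code.
    IntSupplier list = () -> send(client,
        HttpRequest.newBuilder(URI.create(base + "/principals"))
            .header("Authorization", auth)
            .GET()
            .build());

    int errors = run(createThenDelete, list, Duration.ofSeconds(10));
    System.out.println("List calls that returned 5xx: " + errors);
  }
}
```

Because the failure is intermittent, a run that reports zero 5xx responses does not rule the bug out; several runs may be needed.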

Actual Behavior

The list operation sometimes returns a 500 error, with a NullPointerException in the logs.

It is unclear to me why this is happening, but based on the stack trace, it seems to be related to the list operation trying to access a catalog or principal entity that is in the process of being created or deleted by another thread.

Expected Behavior

I would expect the list operation to return the current state of catalogs or principals even while concurrent operations are in progress. It should handle concurrent modifications gracefully instead of failing with a 500.
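To illustrate the kind of handling I have in mind: the stack trace shows the NPE coming from a stream that maps each loaded entity through the `PrincipalEntity` constructor with a null `sourceEntity`. The sketch below uses hypothetical stand-in types (`BaseEntity`, `Principal`, `NullSafeListing`), not the real Polaris classes, and simply skips entries that resolved to null before mapping. I am not claiming this is the right fix for Polaris, only one possible shape of it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class NullSafeListing {

  // Hypothetical stand-in for the loaded base entity.
  record BaseEntity(long catalogId, String name) {}

  // Hypothetical stand-in for the typed wrapper built from a base entity.
  record Principal(long catalogId, String name) {
    Principal(BaseEntity source) {
      // Throws NullPointerException when source is null, as in the stack trace.
      this(source.catalogId(), source.name());
    }
  }

  // Drops entries that were deleted between listing and loading (seen here as
  // nulls) before wrapping the remaining ones.
  static List<Principal> toPrincipals(List<BaseEntity> loaded) {
    return loaded.stream()
        .filter(Objects::nonNull)
        .map(Principal::new)
        .toList();
  }

  public static void main(String[] args) {
    List<BaseEntity> loaded =
        Arrays.asList(new BaseEntity(0, "alice"), null, new BaseEntity(0, "bob"));
    System.out.println(toPrincipals(loaded).size()); // prints 2
  }
}
```

Filtering treats an entity that vanished mid-listing as already deleted, which matches the expectation that the list reflects the current state.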

Additional context

Test environment used the in-memory store.

System information

OS: macOS Sonoma v15.4.1
Polaris Catalog Version: 1.0.0-incubating-SNAPSHOT
Object Storage: FILE

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working), good first issue (Good for newcomers)
