Skip to content

feat(plugin): add ingestion plugin framework and example implementation #855

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

umeshdangat
Copy link
Member

  • Introduced AbstractIngestionPlugin base class to support ingestion plugins
  • Defined IngestionPlugin interface with start/stop lifecycle methods
  • Updated NrtsearchServer to initialize ingestion plugin state
  • Implemented ExamplePlugin as an ingestion + analysis plugin
  • Added test coverage for ingestion using Awaitility for async validation
  • Registered example plugin in build and settings
  • Bumped version to 1.0.0-beta.11

- Introduced AbstractIngestionPlugin base class to support ingestion plugins
- Defined IngestionPlugin interface with start/stop lifecycle methods
- Updated NrtsearchServer to initialize ingestion plugin state
- Implemented ExamplePlugin as an ingestion + analysis plugin
- Added test coverage for ingestion using Awaitility for async validation
- Registered example plugin in build and settings
- Bumped version to 1.0.0-beta.11
@umeshdangat umeshdangat requested a review from sarthakn7 April 30, 2025 17:12
…ugin-managed threading

- Updated AbstractIngestionPlugin to launch startIngestion asynchronously using plugin-provided ExecutorService
- Added getIngestionExecutor() abstract method for plugin-controlled threading
- Removed IngestionPlugin interface (now redundant)
- Updated ExamplePlugin to implement new lifecycle (NOTE: getIngestionExecutor must return a valid executor)
- Updated ExamplePluginTest to use initializeAndStartIngestion and Awaitility for async ingestion
- Updated NrtsearchServer to initialize ingestion plugins after startup
- Added getIngestionPluginConfigs() to NrtsearchConfig for plugin-specific config support
- Added unit test for ingestion plugin config parsing
@umeshdangat umeshdangat requested a review from sarthakn7 May 2, 2025 22:46
@@ -158,6 +159,7 @@ public void start() throws IOException {
GlobalState globalState = serverImpl.getGlobalState();

registerMetrics(globalState);
initIngestionPlugin(globalState, plugins);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the best place to do the initialization is https://github.com/Yelp/nrtsearch/blob/main/src/main/java/com/yelp/nrtsearch/server/grpc/NrtsearchServer.java#L387 after the global state is constructed. Doing this in LuceneServerImpl means it will work with the test server classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

* implementations: 1. Must implement startIngestion/stopIngestion to handle source-specific logic
* 2. Should use addDocuments/commit methods to index data during ingestion
*/
public abstract class AbstractIngestionPlugin extends Plugin {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advise against making the plugin an abstract class. The Plugin is designed to potentially provide a lot of different functionality. Forcing it to contain the ingestion implementation would be odd.

I would instead suggest something more like:

interface IngestionPlugin extends Plugin {
  AbstractIngestor getIngestor()
}

And putting the abstract class in com.yelp.nrtsearch.server.ingestion

You could also use an Ingestor interface instead, and make AbstractIngestor implement that.

Copy link
Member Author

@umeshdangat umeshdangat May 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would advise against making the plugin an abstract class.

Not sure I understand. I have not changed anything in the existing Plugin class?

Forcing it to contain the ingestion implementation would be odd.

I created a new AbstractIngestionPlugin that extends the existing Plugin class. This is similar to what you suggesting?
interface IngestionPlugin extends Plugin
vs
abstract class AbstractIngestionPlugin extends Plugin ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an interface makes code organization cleaner. Otherwise, the ingestion implementation is forced to reside in the main plugin class.

This is also a pattern that cannot be repeated for other plugin types, since a class can implement many interfaces, but extend only one base class. Using abstract classes would unnecessarily limit what functionality could be colocated in a single plugin. For flexibility, having all plugins types being interfaces is better.

Copy link
Member Author

@umeshdangat umeshdangat May 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plugin class is already an abstract class NOT an interface: https://github.com/Yelp/nrtsearch/blob/main/src/main/java/com/yelp/nrtsearch/server/plugins/Plugin.java#L22

To do what you suggest

interface IngestionPlugin extends Plugin {      **//interface cannot extend abstract class**
  AbstractIngestor getIngestor()
}

I am going to have to change existing Plugin abstract class to an interface with something like

public interface Plugin extends Closeable {
  @Override
  default void close() throws IOException {}
}

This then means I have to change current implementations of Plugins (including test plugins)

from extends Plugin to implements Plugin (places of usage ) mainly because they already implement an interface (so its mostly syntactic only) e.g.

-  public static class TestTokenFilterDuplicateLucenePlugin extends Plugin
-      implements AnalysisPlugin {
+  public static class TestTokenFilterDuplicateLucenePlugin
+      implements Plugin, AnalysisPlugin {

Let me know if this is what you suggesting? It should be search and replace but will make files changed in this PR a lot more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I can just not change any existing Plugin behavior and do something like so...

public interface IngestionPlugin {
  Ingestor getIngestor();
}

public abstract class AbstractIngestor implements Ingestor

public class ExampleIngestionPlugin extends Plugin implements IngestionPlugin

By doing the above I think we still manage to keep Plugin class untouched and introduce ingestion functionality in com.yelp.nrtsearch.server.ingestion

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the above ^.

Comment on lines 441 to 450
Object raw = configReader.get("pluginConfigs.ingestion", obj -> obj);
if (raw instanceof Map<?, ?> outerMap) {
Map<String, Map<String, Object>> result = new HashMap<>();
for (Map.Entry<?, ?> entry : outerMap.entrySet()) {
if (entry.getKey() instanceof String pluginName
&& entry.getValue() instanceof Map<?, ?> pluginConfig) {
result.put(pluginName, (Map<String, Object>) pluginConfig);
}
}
return result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simply this code a bit using a jackson annotated class. You can load it like https://github.com/Yelp/nrtsearch/blob/main/src/main/java/com/yelp/nrtsearch/server/config/ThreadPoolConfiguration.java#L173

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -37,41 +40,80 @@
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.junit.After;
import org.junit.Before;
import org.junit.ClassRule;
import org.junit.Test;

public class ExamplePluginTest extends ServerTestCase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you are putting the tests in the example-plugin instead of the main testing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. Added relevant tests IngestionPluginUtilsTest, AbstractIngestorTest

…ugin changes

- Introduced ingestion plugin framework:
  - Added AbstractIngestor, Ingestor interface, and IngestionPluginUtils
  - Updated NrtsearchServer to initialize ingestion plugins
  - Added unit tests for ingestion plugin utilities

- Reverted changes to example-plugin:
  - Restored ExamplePlugin.java and ExamplePluginTest.java to original state from v1.0.0-beta.12
  - Removed async ingestion logic and executor service usage
  - Prevents dependency issues in Jenkins due to plugin build order

- Updated build.gradle and settings.gradle to support ingestion plugin framework
@umeshdangat umeshdangat force-pushed the umesh_add_support_for_ingestion_plugin branch from a29d47d to 0d6412c Compare May 14, 2025 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants