
NUTCH-3115 Extend access to all filter plugin constructor call args to user's POJO in Arbitrary Indexer #856


Merged: 4 commits, Jul 15, 2025


CatChullain
Contributor

This updates the Arbitrary Indexing Filter plugin so that it can optionally pass all of its own constructor args (NutchDocument, Parse, Text (from Hadoop), CrawlDatum, and Inlinks) to the user's POJO constructor for custom processing. It also corrects a small mistake in a logging statement, which printed a class name where it claimed to print a method name, included here at no extra cost! :)
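To make the idea concrete, here is a minimal sketch of how a filter might optionally hand its own arguments to a user POJO's constructor via reflection. It is not the plugin's actual code: the class `PojoConstructorSketch`, the `passAllArgs` flag, the `instantiate` helper, both constructor signatures, and the two example POJOs are all assumptions made for illustration, and the nested `NutchDocument`, `Parse`, `Text`, `CrawlDatum`, and `Inlinks` classes are empty stand-ins so the example compiles without Nutch or Hadoop on the classpath.

```java
import java.lang.reflect.Constructor;

public class PojoConstructorSketch {

    // Hypothetical stand-ins for the real Nutch/Hadoop types named in the PR description.
    public static class NutchDocument {}
    public static class Parse {}
    public static class CrawlDatum {}
    public static class Inlinks {}
    public static class Text {
        private final String value;
        public Text(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    // Hypothetical user POJO in the older style: it sees only its configured String args.
    public static class PlainPojo {
        private final String label;
        public PlainPojo(String[] args) { this.label = args[0]; }
        public String describe() { return label; }
    }

    // Hypothetical user POJO that opts in to receiving the filter's own arguments as well.
    public static class ContextPojo {
        private final String label;
        private final Text url;
        public ContextPojo(String[] args, NutchDocument doc, Parse parse, Text url,
                           CrawlDatum datum, Inlinks inlinks) {
            this.label = args[0];
            this.url = url;
        }
        public String describe() { return label + ":" + url; }
    }

    // Assumed helper: picks the wide or narrow constructor depending on the flag.
    public static Object instantiate(Class<?> pojoClass, String[] userArgs, boolean passAllArgs,
                                     NutchDocument doc, Parse parse, Text url,
                                     CrawlDatum datum, Inlinks inlinks) {
        try {
            if (passAllArgs) {
                Constructor<?> wide = pojoClass.getDeclaredConstructor(
                        String[].class, NutchDocument.class, Parse.class,
                        Text.class, CrawlDatum.class, Inlinks.class);
                return wide.newInstance(userArgs, doc, parse, url, datum, inlinks);
            }
            Constructor<?> narrow = pojoClass.getDeclaredConstructor(String[].class);
            // Cast so the String[] is passed as one argument, not spread as varargs.
            return narrow.newInstance((Object) userArgs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not construct " + pojoClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        Text url = new Text("http://example.org/");
        Object plain = instantiate(PlainPojo.class, new String[] {"plain"}, false,
                null, null, null, null, null);
        Object ctx = instantiate(ContextPojo.class, new String[] {"ctx"}, true,
                new NutchDocument(), new Parse(), url, new CrawlDatum(), new Inlinks());
        System.out.println(((PlainPojo) plain).describe());
        System.out.println(((ContextPojo) ctx).describe());
    }
}
```

Keeping the wide constructor behind an explicit flag means POJOs written against the narrow signature keep working unchanged, which is the backward-compatibility concern raised later in this conversation.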

@sebastian-nagel changed the title from "Nutch 3115 Extend access to all filter plugin constructor call args to user's POJO in Arbitrary Indexer" to "NUTCH-3115 Extend access to all filter plugin constructor call args to user's POJO in Arbitrary Indexer" on Jul 12, 2025
Contributor

@sebastian-nagel left a comment

+1 lgtm. Thanks, @CatChullain!

@CatChullain
Contributor Author

Still banging on this today. If we still have time, I'd like to do a bit more work so the new flag for all-fields access can be applied to each field/class individually. That'll make it easier to update an existing nutch-site to start using it without interfering with previous index-arbitrary defs.

Contributor

@sebastian-nagel left a comment

Thanks, @CatChullain! I'm going to merge the PR...

@sebastian-nagel merged commit e850012 into apache:master on Jul 15, 2025
4 checks passed