NeoLuke: The next generation of Luke for Lucene.NET #1210
paulirwin started this conversation in Show and tell
Replies: 2 comments
That is awesome, not tested yet, but it looks great.
Already started using it, great!
I did a thing over the weekend; you can check it out here: https://github.com/paulirwin/neoluke
(Note: this is a personal side project on my personal GitHub, and not an official artifact of Apache Lucene.NET.)
I have been a user of the Luke toolkit app for Lucene for many years. I often use it to ensure compatibility between Lucene and Lucene.NET, and perhaps most often to kick the tires on different analyzers and see what tokens they emit, to debug issues that either I or other people on GitHub are running into. Unfortunately, to use a Lucene 4.8 index, you have to use a pretty old version of Luke, and its old Java Swing UI is pretty rough nowadays on modern OSes with high-resolution displays.
I have tried to create my own version of "Luke.NET" no fewer than three times, but never got around to finishing the project any of those times. (You can see one failed start here from 12 years ago.) Thankfully, this time I have Claude Code to help do a lot of the grunt work and save my wrists. So, this weekend I decided to just get it done!
Before I get into the details, I want to give a shout-out to two other .NET ports of Luke:
Both of those use WinForms, which won't work on my macOS laptop, and both target older versions of Lucene.NET. So while they could possibly be updated to use some cross-platform GUI framework on modern .NET and target Lucene.NET 4.8, I decided that would likely be too much work and just started from scratch.
NeoLuke is the next generation of a Lucene.NET port of Luke. It is fully cross-platform, and tested to work on macOS, Linux (Ubuntu, at least), and Windows. It uses Avalonia for a WPF-like GUI. It even supports dark mode! I mostly tracked the Luke UI from the latest Lucene trunk (v11.0), although I deviated in a few places where I felt the original was awkward and could be improved. It is built using the .NET 10 RC SDK, which will be GA in just a few short weeks.
It also has a decent bit of unit and integration test coverage. The integration tests use Avalonia's headless testing mode, so they also run in the GitHub CI pipelines on all three OSes. The repo also includes a demo index generator that uses Bogus to create random data. This is handy if you don't already have an index to test it with. I might add some additional generators to pull publicly available data too.
Almost all of the Luke v11 functionality is implemented, where possible with Lucene.NET 4.8. Some things like adding documents and More Like This are still a little rough around the edges in the UI, but browsing the index term/document data, searching, check index, optimize index, export terms, viewing segment files, and using the analysis tab all work great.
In particular, the feature I use most in Luke (and now NeoLuke) is Analysis, as I like seeing how terms get stemmed or ignored by the analyzer. NeoLuke allows you to easily select from 45 (!) analyzers included in Lucene.NET for either searching or token analysis! (It's fun playing around with them and seeing that e.g. the SpanishAnalyzer emits just "dos" for "¿Por qué no los dos?" as input.) This is also handy on the Search tab to see how the Query Parser + Analyzer come together to parse your query text.
The following analyzers are not yet supported because their constructors are parameterized and I need to add support for that:
Additionally, some analyzers like StopAnalyzer currently load and show in the list but do not work; I'll be working on fixing those. I also hope to add support for building a custom analyzer from built-in components, like you can in Luke, and maybe even allow dynamic external assembly loading so you can load your own custom analyzer and/or components from a DLL.
Also, a natural warning: don't use this on production indexes without taking a backup first! This app hasn't been thoroughly tested in production scenarios, especially for index modification/optimization.
It is possible that NeoLuke could form the basis (or even the direct implementation) of a .NET port of Luke for a future version of Lucene.NET. Lucene added Luke to their repo in Lucene 8.1, so once we catch up to that version, it would be good for us to have a port, and I'm open to NeoLuke becoming that port if the community is interested in that. In the meantime, this can serve as an external, experimental tool to determine whether that has value.
I welcome feedback and any issues for bug reports or feature requests you might have. Thanks, and enjoy!