-
Notifications
You must be signed in to change notification settings - Fork 497
Adding the complete architecture for search benchmarking #2740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
test/search/real_search.jl
Outdated
end | ||
|
||
# Run the wrapper | ||
result = read(`node -e $wrapper_js`, String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we already have a file, can't we just do node file.js
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we're replacing the placeholders with their values here
wrapper_js = replace(wrapper_js, "__SEARCH_INDEX__" => JSON.json(search_index_data))
wrapper_js = replace(wrapper_js, "__QUERY__" => "\"" * query * "\"")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of thoughts:
- We could add a
make search-benchmarks
command: https://github.com/JuliaDocs/Documenter.jl/blob/master/Makefile - We could add a simple CI job that runs these benchmarks on CI. It should be fine to just copy-paste a thing here for now: https://github.com/JuliaDocs/Documenter.jl/blob/master/.github/workflows/CI.yml
- It would be good to have a more realistic reference manual / search index we test against. Options: Documenter's manual or Julia manual. As we discussed on the call, let's start with Documenter.
test/search/wrapper.js
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This duplicates the search.js
, so it would be good to somehow avoid the duplication here by reusing the search.js
in the tests. It would likely require some changes on how we handle the search.js
file.
Documenter.jl/assets/html/js/search.js
Lines 189 to 216 in 022c013
let index = new MiniSearch({ | |
fields: ["title", "text"], // fields to index for full-text search | |
storeFields: ["location", "title", "text", "category", "page"], // fields to return with results | |
processTerm: (term) => { | |
let word = stopWords.has(term) ? null : term; | |
if (word) { | |
// custom trimmer that doesn't strip @ and !, which are used in julia macro and function names | |
word = word | |
.replace(/^[^a-zA-Z0-9@!]+/, "") | |
.replace(/[^a-zA-Z0-9@!]+$/, ""); | |
word = word.toLowerCase(); | |
} | |
return word ?? null; | |
}, | |
// add . as a separator, because otherwise "title": "Documenter.Anchors.add!", would not | |
// find anything if searching for "add!", only for the entire qualification | |
tokenize: (string) => string.split(/[\s\-\.]+/), | |
// options which will be applied during the search | |
searchOptions: { | |
prefix: true, | |
boost: { title: 100 }, | |
fuzzy: 2, | |
}, | |
}); | |
index.addAll(data); |
…jl and ensured that the Node.js process runs in the correct directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would be happy to merge this as the first iteration if we could get the duplication & version number sorted as well!
test/search/real_search.jl
Outdated
write(io, wrapper_js) | ||
close(io) | ||
cd(@__DIR__) do | ||
result = read(`node $path`, String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make this use the Node JLL? This means that the user wouldn't have to install Node just to run the benchmarks -- we can pull it in via the Julia package manager.
There's some example usage here for example: https://github.com/JuliaComputing/MultiDocumenter.jl/blob/e112da8b744f3393d037a7380e544797c2a41953/src/search/pagefind.jl#L31-L55
.github/workflows/CI.yml
Outdated
- uses: actions/setup-node@v4 | ||
with: | ||
node-version: '20.x' | ||
- name: Install Node.js dependencies | ||
run: npm install | ||
working-directory: test/search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we switch to the Node JLL, we can remove these.
- uses: actions/setup-node@v4 | |
with: | |
node-version: '20.x' | |
- name: Install Node.js dependencies | |
run: npm install | |
working-directory: test/search |
…aceholder with its value
how's this implementation @mortenpi ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Rahban1 LGTM! I think we should work a bit on getting more test cases in etc. But I link this infra, so let's get this merged!
this closes #2417, this still has a lot of work to do, I am just putting it out there so everybody can see and give their suggestions on it. this is nowhere completed. I am going to work on it, maybe end up changing a substantial amount but right now it is just a proof of concept. would love to get some feedback on it. what do you think could be improved and what do you guys have in mind???
Thank you