feat: Add proxy for meilisearch host when ran out of the cluster. #156
Thanks for the pull request, @angonz!
This repository is currently maintained by Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.
🔘 Get product approval
If you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide context
To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green build
If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.
Where can I find more information?
If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:
When can I expect my changes to be merged?
Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR.
What's the advantage of using a proxy server over just setting the Also, since Meilisearch is so light on resources, we've been recommending one Meilisearch instance per Open edX instance. I'm curious about your reasons for recommending one large "shared" Meilisearch instance? I know both options are supported, just looking for reasons one way or the other.
Sure! We have a couple of design principles:
We have ES as a service from AWS, but they don't offer Meilisearch, so we installed a stand-alone instance. As it is not a critical data store (indexes can be recreated if lost), we don't care about HA.
More modern search engines like Meilisearch and TypeSense are a newer generation than Elasticsearch, and they are highly optimized for speed and "search as you type" results. One of the ways they achieve that is by having the user's browser retrieve results directly from the search engine as they type each character of the search query, rather than routing the search request through a proxy.
With Elasticsearch we didn't have much choice, and it was necessary to proxy the user's search requests through the LMS server in order to enforce permissions (don't allow users to search courses that they don't have access to). This could actually place a significant load on the
With Meilisearch, we've been able to make the search engine itself enforce permissions, so that each user can connect directly to the search index and still only query course content that they have access to. (At least for the CMS; we have yet to develop a more sophisticated LMS search.) This means that we can follow the best practice of making the Meilisearch server public and avoid using the LMS/CMS as a proxy.
You do still need some kind of load balancer or proxy to handle HTTPS, because Meilisearch doesn't provide HTTPS, only HTTP. But it can be a very simple proxy like Caddy that just handles HTTPS; it doesn't need to do any other request filtering.
I would strongly recommend using a simple proxy like AWS ELB, Caddy, or nginx (see Meilisearch docs) rather than using the LMS as a proxy, because there is no need to tie up LMS resources / worker processes for simple proxying. I guess that's what this PR is doing though, right?
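For readers wondering how a public search server can still enforce per-user permissions: the comment above does not name the mechanism, so the following is only a sketch that assumes Meilisearch tenant tokens are used. A tenant token is a JWT signed with HS256 using an API key, and its payload carries `searchRules`, `apiKeyUid`, and an optional `exp`. The helper name `make_tenant_token`, the index name `courses`, and the filter attribute `org` are hypothetical and not taken from this PR.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_tenant_token(
    api_key: str,
    api_key_uid: str,
    search_rules: dict,
    ttl_seconds: int = 3600,
    now: Optional[float] = None,
) -> str:
    """Build an HS256-signed JWT restricting what the holder may search.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    issued = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "searchRules": search_rules,   # per-index restrictions, e.g. a filter
        "apiKeyUid": api_key_uid,      # uid of the API key used for signing
        "exp": int(issued) + ttl_seconds,
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        api_key.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + _b64url(signature)


if __name__ == "__main__":
    # Assumed index name and filter attribute, for illustration only.
    rules = {"courses": {"filter": "org = 'demo'"}}
    print(make_tenant_token("search-api-key", "key-uid", rules))
```

The backend would hand such a token to the browser, which then queries the search server directly; the server rejects anything outside the embedded rules, so no application-level proxy is needed for permission checks.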
Thanks for the explanation!! Looks like it was a good decision to move to Meilisearch.
I see your point, but if you run Meilisearch in a pod you will also use the ingress and Caddy to proxy. It's just the same, but with an external Meilisearch server. Anyway, I can do this in a separate plugin.
Yes, that's reasonable. What I was advising against is using the edx-platform/edxapp Django app as a proxy like this. It's totally fine to use Caddy as a proxy. And I'm actually fine with merging this PR as is. Could you maybe just add some more comments explaining how this setup differs from the default? And confirm you've tested this? I also wonder if we need to consider the case of people who use Meilisearch Cloud because they'll have
Sure. To summarize:
That's perfect - thanks! But can you move that info into the Configuration Reference section of the README as part of this PR? I don't think too many people will see it if it's just a comment on this PR. Then I'll merge it.
b1092b0 to 2e4ad9a
Thanks so much, especially for the nice documentation write-up!
Hi all,
I would like to propose this addition.
Multi-site K8s deployments might benefit from a single Meilisearch server installed outside the K8s cluster (which I would recommend).
However, the ingress will direct requests for MEILISEARCH_HOST to Caddy.
In this case, a proxy from each site to the external server might be useful.
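As a rough illustration of the idea, the sketch below renders one Caddy site block per public Meilisearch hostname, each forwarding to the same external server with Caddy's `reverse_proxy` directive. It is not the template this PR adds: the function names, the example hostnames, and the upstream address `meili.internal:7700` are all hypothetical, and the `http://` site addresses assume that TLS is terminated at the ingress before requests reach Caddy.

```python
from typing import Iterable


def render_caddy_proxy(public_host: str, upstream: str) -> str:
    """Render a single Caddy site block proxying public_host to upstream.

    Hypothetical helper; `upstream` is a host:port outside the cluster.
    """
    return (
        f"http://{public_host} {{\n"
        f"    reverse_proxy {upstream}\n"
        f"}}\n"
    )


def render_caddy_proxies(public_hosts: Iterable[str], upstream: str) -> str:
    """Render one site block per site, all sharing the same upstream."""
    return "\n".join(render_caddy_proxy(host, upstream) for host in public_hosts)


if __name__ == "__main__":
    # Assumed hostnames for two sites sharing one external Meilisearch server.
    sites = ["meilisearch.site-a.example.com", "meilisearch.site-b.example.com"]
    print(render_caddy_proxies(sites, "meili.internal:7700"))
```

An HTTPS upstream, such as a hosted Meilisearch service, may need additional Caddy options (for example rewriting the `Host` header), which this sketch does not cover.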