-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Dockerfile to Ubuntu 22.04 + integrate qlever script #1439
base: master
Are you sure you want to change the base?
Conversation
The old Dockerfile called `ServerMain` directly using a small selection of environment variables with outdated names. It was also outdated in other respects. The new Dockerfile installs the `qlever` script, so that it can be called from inside the container. Remaining questions / TODOs, feedback welcome: 1. Right now, the script is installed as part of the docker build via `pipx install qlever`. Is this the right way to do it? Alternatives would be to clone the GitHub repo and `pipx install -e .` from there, or include the the GitHub repo as a submodule of this repository. 2. How do we handle the QLever UI. We could just call `qlever ui` from inside the container. But that would pull the Docker image for the Qlever UI and run a Docker container inside of a Docker container. It's possible, but not the right way to do it. If both are needed, the container for the QLever engine and the contaner for the QLever UI should run side by side. 3. The`qlever setup-config` command should have options that overwrite the variables in the produced Qleverfile. In particular, there should be an option for setting `SYSTEM = native`. Otherwise it has to be stated explictly for each command, where that is relevant (in particular: `qlever index`, `qlever start`, `qlever example-queries`).
@ludovicm67 Can you try this and let me know your feedback? It addresses some of the issues that we have discussed. In particular, you can now use the If I understood you correctly, you would also like to run the QLever UI. This would also work here, but running a Docker container (for the QLever UI) inside of a Docker container is generally not a good idea. I don't have a solution yet, but here are some ideas:
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1439 +/- ##
=======================================
Coverage 89.30% 89.30%
=======================================
Files 373 373
Lines 35377 35377
Branches 3999 3999
=======================================
+ Hits 31594 31595 +1
+ Misses 2493 2492 -1
Partials 1290 1290 ☔ View full report in Codecov by Sentry. |
@hannahbast Thanks for the updates!
I think it's great like this.
I think it's fine to run the QLever UI in a separate container.
Yes, I'm currently experimenting with environment variables and a simple shell script that generates the Qleverfile from them. |
Yes, I will give it a try soon and let you know my feedback.
Fully agree with you. It is better to have the QLever UI as a separate container.
I would not recommend this approach. It may lead to security issues, and I don't think it is a good practice.
This is a good approach. It's also the easiest way to deploy the full stack (backend and UI) for users who want to use both ; a simple
The issue with this is that you will run multiple processes at the same time (the backend and the UI) in a single container which is not recommended. In the future, the best approach would be to have two separate containers: one for the backend and one for the UI. And two variants: a minimal one, and a complete one, which will results in the following images:
That way, For people that want to have autoconfiguration, work with a Qleverfile, or have a shell to debug things in an easier way, they can use the complete images. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
During the build of the image, the following warnings are shown:
#1 WARN: FromAsCasing: 'as' and 'FROM' keywords' casing do not match (line 1)
#1 WARN: FromAsCasing: 'as' and 'FROM' keywords' casing do not match (line 9)
#1 WARN: FromAsCasing: 'as' and 'FROM' keywords' casing do not match (line 21)
I did some suggestions in order to fix the issue mentioned in the warning message.
Example of logs where the warning is visible: https://github.com/ad-freiburg/qlever/actions/runs/10345045337/job/28631489324#step:8:175
@ludovicm67 Thank you for your comments. I change all the
|
Perfect 👍
I'm not sure to see the problem here, as it's what is expected, no?
In both ; one will call the The other one will call As in both containers, the Qleverfile will be generated on the fly using the environment variables (or can be mounted directly). The You can additionally release a |
@ludovicm67 Just for clarification: Item 1 in my previous comment was not a problem (on the contrary), but just the first item on the list leading up to the problem description. Let me think about what you wrote and then come back to you. Having the |
@hannahbast basically what we need to have for the UI container image:
By default, it ships with a basic I suggested using the But I remain completely open for any other solution. |
@ludovicm67 Thanks a lot and the exchange is much appreciated. I will think about it and come back to you. |
ENV CACHE_MAX_NUM_ENTRIES 1000 | ||
# Need the shell to get the INDEX_PREFIX environment variable | ||
ENTRYPOINT ["/bin/sh", "-c", "exec ServerMain -i \"/index/${INDEX_PREFIX}\" -j 8 -m ${MEMORY_FOR_QUERIES} -c ${CACHE_MAX_SIZE_GB} -e ${CACHE_MAX_SIZE_GB_SINGLE_ENTRY} -k ${CACHE_MAX_NUM_ENTRIES} -p 7001 \"$@\"", "--"] | ||
ENTRYPOINT ["bash"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest using the default entrypoint from the Ubuntu Docker image (preferred):
ENTRYPOINT ["bash"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or explicitly set it to an empty one:
ENTRYPOINT ["bash"] | |
ENTRYPOINT [""] |
USER qlever | ||
ENV PATH=/app/:$PATH | ||
RUN PIPX_HOME=/qlever/.local PIPX_BIN_DIR=/qlever/.local/bin PIPX_MAN_DIR=/qlever/.local/share pipx install qlever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that might be good to do is to lock a specific version of the qlever
CLI, to avoid unwanted breaking changes if the image gets rebuilt.
First define an ARG
like this at the top of the Dockerfile (so that it's easy to know what to change in case of upgrade):
ARG QLEVER_VERSION="0.5.3"
Then you might need to tell the layer to use the QLEVER_VERSION
arg, by just adding:
ARG QLEVER_VERSION
in the layer.
And then, you can use it that way, if I'm not mistaking on how to specify a package version with pipx (I guess it should behave like pip):
RUN PIPX_HOME=/qlever/.local PIPX_BIN_DIR=/qlever/.local/bin PIPX_MAN_DIR=/qlever/.local/share pipx install qlever | |
RUN PIPX_HOME=/qlever/.local PIPX_BIN_DIR=/qlever/.local/bin PIPX_MAN_DIR=/qlever/.local/share pipx install "qlever==${QLEVER_VERSION}" |
You can take a look here for an example: https://github.com/zazukoians/qlever-tests/blob/76ada0b53174beb79d11ed14662536689a165fff/docker/server.Dockerfile
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The image can be optimized for a smaller size. The 2 suggested changes are the low-hanging fruit and reduce the compressed image size by 40MB/15% (uncompressed: 100MB/12.5%). There is probably potential for further space savings. Further investigation is required to evaluate the cost/benefit of these.
It should be 24.04
in the PR title instead of 22.04.
.
Dockerfile
Outdated
ENV LC_CTYPE C.UTF-8 | ||
ENV LANG=C.UTF-8 | ||
ENV LC_ALL=C.UTF-8 | ||
ENV LC_CTYPE=C.UTF-8 | ||
ENV DEBIAN_FRONTEND=noninteractive | ||
RUN apt-get update && apt-get install -y software-properties-common && add-apt-repository -y ppa:mhier/libboost-latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer required. The used boost packages are from the official package repositories now.
RUN apt-get update && apt-get install -y software-properties-common && add-apt-repository -y ppa:mhier/libboost-latest |
ENV DEBIAN_FRONTEND=noninteractive | ||
RUN apt-get update && apt-get install -y wget python3-yaml unzip curl bzip2 pkg-config libicu-dev python3-icu libgomp1 uuid-runtime make lbzip2 libjemalloc-dev libzstd-dev libssl-dev libboost1.81-dev libboost-program-options1.81-dev libboost-iostreams1.81-dev libboost-url1.81-dev | ||
RUN apt-get update && apt-get install -y wget python3-yaml unzip curl bzip2 pkg-config libicu-dev python3-icu libgomp1 uuid-runtime make lbzip2 libjemalloc-dev libzstd-dev libssl-dev libboost1.83-dev libboost-program-options1.83-dev libboost-iostreams1.83-dev libboost-url1.83-dev pipx bash-completion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove apt
cache.
RUN apt-get update && apt-get install -y wget python3-yaml unzip curl bzip2 pkg-config libicu-dev python3-icu libgomp1 uuid-runtime make lbzip2 libjemalloc-dev libzstd-dev libssl-dev libboost1.83-dev libboost-program-options1.83-dev libboost-iostreams1.83-dev libboost-url1.83-dev pipx bash-completion | |
RUN apt-get update && apt-get install -y wget python3-yaml unzip curl bzip2 pkg-config libicu-dev python3-icu libgomp1 uuid-runtime make lbzip2 libjemalloc-dev libzstd-dev libssl-dev libboost1.83-dev libboost-program-options1.83-dev libboost-iostreams1.83-dev libboost-url1.83-dev pipx bash-completion && rm -rf /var/lib/apt/lists/* |
These two environment variables are useful when running the qlever CLI within a Docker container, introduced by ad-freiburg/qlever#1439 The effect of QLEVER_OVERRIDE_SYSTEM_NATIVE is that after `qlever setup-config <config name>`, the Qleverfile will contain the setting `SYSTEM = native` and not `SYSTEM = docker`. The effect of QLEVER_OVERRIDE_DISABLE_UI is that `qlever ui` is disabled, which makes sense when running inside a container. In both cases, meaningful log message are provided for the user.
With ad-freiburg/qlever#1439, the recommended way to run QLever inside a container is to use the https://github.com/ad-freiburg/qlever-control script. Inside of the container, the environment variable `QLEVER_IS_RUNNING_IN_CONTAINER` is set, with the following two effects: 1. After `qlever setup-config <config name>`, the Qleverfile will contain the setting `SYSTEM = native` and not `SYSTEM = docker` and a warning message will be displayed that that is the case 2. When running `qlever ui`, an error message will be displayed that this command is disabled (reason: `qlever ui` runs another container, and we don't want containers inside of containers).
Conformance check passed ✅No test result changes. |
Quality Gate passedIssues Measures |
The old Dockerfile called
ServerMain
directly using a small selection of environment variables with outdated names. It was also outdated in other respects.The new Dockerfile installs the
qlever
script, so that it can be called from inside the container.Remaining questions / TODOs, feedback welcome:
Right now, the script is installed as part of the docker build via
pipx install qlever
. Is this the right way to do it? Alternatives would be to clone the GitHub repo andpipx install -e .
from there, or include the the GitHub repo as a submodule of this repository.How do we handle the QLever UI. We could just call
qlever ui
from inside the container. But that would pull the Docker image for the Qlever UI and run a Docker container inside of a Docker container. It's possible, but not the right way to do it. If both are needed, the container for the QLever engine and the contaner for the QLever UI should run side by side.The
qlever setup-config
command should have options that overwrite the variables in the produced Qleverfile. In particular, there should be an option for settingSYSTEM = native
. Otherwise it has to be stated explictly for each command, where that is relevant (in particular:qlever index
,qlever start
,qlever example-queries
).