🔐 Reduce Docker size by half + improve security #434
backend.dockerfile
# Build stage
#############################

FROM node:18-slim AS builder
Is there any reason not to use 22, as was done on the frontend?
The backend currently uses Node 18, and I thought it might not be compatible with higher Node versions.
The frontend doesn't have a version set and uses, by default, the latest Node version. I set the latest LTS (20), but happy to change it to the latest if the Node version is not an issue.
I believe you can use at least the Node 20 stable LTS version on the backend. All of the project's dependencies seem to be compatible with the newest version. But the way you did it works fine. Try a small test and see if it works.
I tried to build the backend with Node 20 LTS, and it worked as expected. However, I get the error below in the browser console. I don't know whether it's related to using Node 20 or to the latest version of the code. http://localhost:3000/undefined/models 404 (Not Found)
Thanks. These improvements are necessary for a production-driven project.
Thanks @ngdbao! I already made the Helm Chart for Perplexica, but I needed lighter and more secure images before launching it on my Kubernetes clusters 😅
Converted it to a draft instead; you can mark this as open once it is completed.
Docker Compose mounts the volumes as root by default, so the node user can't write to the SQLite DB (it is read-only).
I created a new PR because I broke my branch with my last rebase.
The new PR is here: #465
Details
This PR optimizes the Dockerfiles (frontend + backend), significantly reducing the image size and improving security by running as a non-root user (node).
Here are the fundamental changes and explanations:
Multi-stage build: I used a two-stage build process. The first stage (builder) installs dependencies and builds the application. The second stage only copies the necessary files for running the application.
Removed ARG variables on the backend image: Since Docker Compose or Kubernetes will provide the environment variables, I removed them from the Dockerfile. You can set these variables in your Docker Compose file or Kubernetes deployment configuration.
Optimized copying and building: I first copy only the package.json and yarn.lock files, then install dependencies. This allows better caching of the dependency installation step.
Minimal production image: The final stage only copies the built assets, node_modules, and necessary files from the builder stage, resulting in a much smaller Docker image.
Use a more standard folder for the app: I replaced /home/perplexica with /app and updated the docker-compose.yaml file to this new path.
Use a non-root user (node): Instead of using the root user by default, I changed the container user to node (the default user for official Node Docker images). I had to set this user's permissions in the Dockerfiles and the docker-compose.yaml to avoid permission issues on the SQLite DB file.
Use an ARG variable for the backend image so that it uses the node user by default when running on Kubernetes and the root user when running with Docker Compose. The Docker Compose volumes are created with the root user, and the SQLite DB is accessible only as read-only when running as the node user.
Important
The downside to running the backend Docker image as a non-root user is that Docker Compose will need to run only with the --build argument, since the Docker images will be published with the node user by default.
We will not have this issue with the SQLite DB permissions on Kubernetes because the volumes are managed differently.
Moving to a Postgres DB will fix this issue and help scale the project later. The Docker Compose will be able to launch a Postgres image, and it will be the same for Kubernetes with a dedicated pod or managed database like AWS RDS.
You can keep the root user for the backend image if you think that is too much for simple use with the Docker Compose file, but using the non-root user (node) will be more secure.
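For readers who want to see how the changes listed above fit together, here is a minimal sketch of a two-stage backend Dockerfile. It is not the exact file from this PR: the APP_USER build argument name, the dist output folder, the dist/app.js entry point, and the yarn build script are all assumptions made for illustration.

```dockerfile
# Illustrative sketch only. APP_USER, dist/ and dist/app.js are
# hypothetical names, not taken from the PR.

#############################
# Build stage
#############################
FROM node:18-slim AS builder

WORKDIR /app

# Copy only the dependency manifests first, so this layer stays cached
# until package.json or yarn.lock changes.
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

# Copy the remaining sources and build (assumes a "build" script exists).
COPY . .
RUN yarn build

#############################
# Production stage
#############################
FROM node:18-slim

# Assumed build argument: defaults to the non-root "node" user that ships
# with the official Node images; a Compose build could override it with root.
ARG APP_USER=node

WORKDIR /app

# Copy only what is needed at runtime, owned by the selected user.
COPY --from=builder --chown=${APP_USER}:${APP_USER} /app/package.json ./
COPY --from=builder --chown=${APP_USER}:${APP_USER} /app/node_modules ./node_modules
COPY --from=builder --chown=${APP_USER}:${APP_USER} /app/dist ./dist

# Drop privileges for everything that runs in the container.
USER ${APP_USER}

CMD ["node", "dist/app.js"]
```

Under these assumptions, a plain docker build -f backend.dockerfile . would produce an image that runs as node, while adding --build-arg APP_USER=root would produce the root variant that Docker Compose needs for the SQLite volume.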