Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🔐 Reduce Docker size by half + improve security #434

Closed
wants to merge 17 commits into from

Conversation

timoa
Copy link

@timoa timoa commented Oct 24, 2024

Details

This PR optimized the Dockerfiles (frontend + backend), significantly reducing the image size and improving security by running as a non-root user (node).

Image Current size New size Improvement
perplexica-frontend 2.37GB 846MB -64%
perplexica-backend 1.84GB 640MB -65%

Here are the fundamental changes and explanations:

  1. Multi-stage build: I used a two-stage build process. The first stage (builder) installs dependencies and builds the application. The second stage only copies the necessary files for running the application.

  2. Removed ARG variables on the backend image: Since Docker Compose or Kubernetes will provide the environment variables, I removed them from the Dockerfile. We can set these variables in your Docker Compose file or Kubernetes deployment configuration.

  3. Optimized copying and building: I first copy only the package.json and yarn.lock files, then install dependencies. This allows better caching of the dependency installation step.

  4. Minimal production image: The final stage only copies the built assets, node_modules, and necessary files from the builder stage, resulting in a much smaller Docker image.

  5. Use a more standard folder for the app: I replaced the /home/perplexica with /app and updated the docker-compose.yaml file to this new path.

  6. Use a non-root user (node): Instead of using the root user by default, I changed the container user to node (the default user for official Node Docker images). I had to set this user's permissions on the Dockerfiles and the docker-compose.yaml to avoid permission issues on the SQLite DB file.

  7. Use an ARG variable for the backend image to use by default the node user when running on Kubernetes and the root user if running with Docker Compose. The Docker Compose volumes are created with the root user, and the SQLite DB is accessible only as read-only if running the node user.

Important

The downside to running the backend Docker image as a non-root user is that the Docker Compose will need to run only with the --build argument since the Docker images will be published with the node user by default.

We will not have this issue with the SQLite DB permissions on Kubernetes because the volumes are managed differently.

Moving to a Postgres DB will fix this issue and help scale the project later. The Docker Compose will be able to launch a Postgres image, and it will be the same for Kubernetes with a dedicated pod or managed database like AWS RDS.

You can keep the root user for the backend image if you think that is too much for simple use with the Docker Compose file, but using the non-root user (node) will be more secure.

@timoa timoa marked this pull request as ready for review October 24, 2024 14:53
@timoa timoa changed the title Reduce Docker size by half + improve security 🔐 Reduce Docker size by half + improve security Oct 25, 2024
# Build stage
#############################

FROM node:18-slim AS builder
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to not use 22 as was did on frontend?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backend currently uses Node 18, and I thought it might not be compatible with higher Node versions.
The frontend doesn't have a version set and uses, by default, the latest Node version. I set the latest LTS (20), but happy to change it to the latest if the Node version is not an issue.

Copy link

@rrfaria rrfaria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you may use at least node 20 stable LTS version on backend. All dependencies on project seem to be compatible with newest version. But the way you did it works fine. try to make a small test and see if it works

@timoa
Copy link
Author

timoa commented Oct 27, 2024

I tried to build the backend with the Node 20 LTS, but it worked as expected.

I get the error below in the browser console. I don't know if it's related to using Node 20 or the latest version of the code.
I will investigate more before updating the PR.

http://localhost:3000/undefined/models 404 (Not Found)

@ngdbao
Copy link

ngdbao commented Oct 29, 2024

Thanks. These improves are necessary for a production-driven project.

@timoa
Copy link
Author

timoa commented Oct 29, 2024

Thanks @ngdbao! I already made the Helm Chart for Perplexica, but I needed to have lighter and secure images before launching it my Kubernetes clusters 😅
I will continue to investigate the issue with the build and update this PR.

@ItzCrazyKns ItzCrazyKns marked this pull request as draft October 29, 2024 14:14
@ItzCrazyKns
Copy link
Owner

Converted it to a draft instead, you can mark this as open once it is completed.

@timoa timoa marked this pull request as ready for review November 14, 2024 22:30
@timoa timoa requested a review from rrfaria November 14, 2024 22:30
@timoa timoa marked this pull request as draft November 14, 2024 22:31
@timoa
Copy link
Author

timoa commented Nov 14, 2024

I created a new PR because I broke my branch with my last rebase

@timoa timoa closed this Nov 14, 2024
@timoa timoa deleted the fix/docker-size-and-security branch November 14, 2024 22:46
@timoa
Copy link
Author

timoa commented Nov 14, 2024

The new PR is here: #465

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants