Commit eba908c

Merge tag 'v0.24.0' into upgrade-0.24.1
2 parents bf7da4c + 392ec99 commit eba908c

1,006 files changed: +75,006 -52,469 lines changed

.github/copilot-instructions.md

Lines changed: 22 additions & 1 deletion
@@ -1 +1,22 @@
-Refer to [AGENTS.MD](../AGENTS.md) for all repo instructions.
+# Project instructions for Copilot
+
+## How to run (minimum)
+- Install:
+- python -m venv .venv && source .venv/bin/activate
+- pip install -r requirements.txt
+- Run:
+- (fill) e.g. uvicorn app.main:app --reload
+- Verify:
+- (fill) curl http://127.0.0.1:8000/health
+
+## Project layout (what matters)
+- app/: API entrypoints + routers
+- services/: business logic
+- configs/: config loading (.env)
+- docs/: documents
+- tests/: pytest
+
+## Conventions
+- Prefer small, incremental changes.
+- Add logging for new flows.
+- Add/adjust tests for behavior changes.

.github/workflows/tests.yml

Lines changed: 316 additions & 9 deletions
Large diffs are not rendered by default.

.gitignore

Lines changed: 14 additions & 1 deletion
@@ -44,13 +44,21 @@ cl100k_base.tiktoken
 chrome*
 huggingface.co/
 nltk_data/
+uv-x86_64*.tar.gz

 # Exclude hash-like temporary files like 9b5ad71b2ce5302211f9c61530b329a4922fc6a4
 *[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]*
 .lh/
 .venv
 docker/data

+# OceanBase data and conf
+docker/oceanbase/conf
+docker/oceanbase/data
+
+# SeekDB data and conf
+docker/seekdb
+

 #--------------------------------------------------#
 # The following was generated with gitignore.nvim: #
@@ -197,4 +205,9 @@ ragflow_cli.egg-info
 backup


-.hypothesis
+.hypothesis
+
+
+# Added by cargo
+
+/target
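
The hash-like ignore rule kept as context in the first hunk is broader than its comment suggests, so a minimal shell sketch of what it matches may help. The sketch assumes that a POSIX `case` glob behaves like the gitignore pattern when applied to a bare file name; `looks_hash_like` is a hypothetical helper introduced here for illustration and is not part of the repository.

```shell
#!/bin/sh
# Sketch only: approximates the .gitignore rule with a shell glob.
# Assumption: for a bare file name, `case` globbing matches the same
# names as the gitignore pattern.

# Force ASCII ranges so [a-f] cannot match uppercase letters.
LC_ALL=C
export LC_ALL

# looks_hash_like NAME
# Succeeds when NAME contains ten consecutive lowercase hex characters.
looks_hash_like() {
    case "$1" in
        *[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

for name in 9b5ad71b2ce5302211f9c61530b329a4922fc6a4 README.md report-2024010112.txt; do
    if looks_hash_like "$name"; then
        echo "ignored: $name"
    else
        echo "kept:    $name"
    fi
done
```

Under that assumption, any name with a run of ten digits, such as a timestamped report, is ignored along with the intended temporary files.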

CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on d
 - **Document Processing**: `deepdoc/` - PDF parsing, OCR, layout analysis
 - **LLM Integration**: `rag/llm/` - Model abstractions for chat, embedding, reranking
 - **RAG Pipeline**: `rag/flow/` - Chunking, parsing, tokenization
-- **Graph RAG**: `graphrag/` - Knowledge graph construction and querying
+- **Graph RAG**: `rag/graphrag/` - Knowledge graph construction and querying

 ### Agent System (`/agent/`)
 - **Components**: Modular workflow components (LLM, retrieval, categorize, etc.)
@@ -113,4 +113,4 @@ RAGFlow supports switching between Elasticsearch (default) and Infinity:
 - Node.js >=18.20.4
 - Docker & Docker Compose
 - uv package manager
-- 16GB+ RAM, 50GB+ disk space
+- 16GB+ RAM, 50GB+ disk space

Dockerfile

Lines changed: 25 additions & 16 deletions
@@ -19,17 +19,16 @@ RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/huggingface.co
 # This is the only way to run python-tika without internet access. Without this set, the default is to check the tika version and pull latest every time from Apache.
 RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
 cp -r /deps/nltk_data /root/ && \
-cp /deps/tika-server-standard-3.0.0.jar /deps/tika-server-standard-3.0.0.jar.md5 /ragflow/ && \
+cp /deps/tika-server-standard-3.2.3.jar /deps/tika-server-standard-3.2.3.jar.md5 /ragflow/ && \
 cp /deps/cl100k_base.tiktoken /ragflow/9b5ad71b2ce5302211f9c61530b329a4922fc6a4

-ENV TIKA_SERVER_JAR="file:///ragflow/tika-server-standard-3.0.0.jar"
+ENV TIKA_SERVER_JAR="file:///ragflow/tika-server-standard-3.2.3.jar"
 ENV DEBIAN_FRONTEND=noninteractive

 # Setup apt
 # Python package and implicit dependencies:
 # opencv-python: libglib2.0-0 libglx-mesa0 libgl1
-# aspose-slides: pkg-config libicu-dev libgdiplus libssl1.1_1.1.1f-1ubuntu2_amd64.deb
-# python-pptx: default-jdk tika-server-standard-3.0.0.jar
+# python-pptx: default-jdk tika-server-standard-3.2.3.jar
 # selenium: libatk-bridge2.0-0 chrome-linux64-121-0-6167-85
 # Building C extensions: libpython3-dev libgtk-4-1 libnss3 xdg-utils libgbm-dev
 RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
@@ -49,11 +48,21 @@ RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
 apt install -y libatk-bridge2.0-0 && \
 apt install -y libpython3-dev libgtk-4-1 libnss3 xdg-utils libgbm-dev && \
 apt install -y libjemalloc-dev && \
-apt install -y nginx unzip curl wget git vim less && \
+apt install -y gnupg unzip curl wget git vim less && \
 apt install -y ghostscript && \
 apt install -y pandoc && \
 apt install -y texlive && \
-apt install -y fonts-freefont-ttf fonts-noto-cjk
+apt install -y fonts-freefont-ttf fonts-noto-cjk && \
+apt install -y postgresql-client
+
+ARG NGINX_VERSION=1.29.5-1~noble
+RUN --mount=type=cache,id=ragflow_apt,target=/var/cache/apt,sharing=locked \
+mkdir -p /etc/apt/keyrings && \
+curl -fsSL https://nginx.org/keys/nginx_signing.key | gpg --dearmor -o /etc/apt/keyrings/nginx-archive-keyring.gpg && \
+echo "deb [signed-by=/etc/apt/keyrings/nginx-archive-keyring.gpg] https://nginx.org/packages/mainline/ubuntu/ noble nginx" > /etc/apt/sources.list.d/nginx.list && \
+apt update && \
+apt install -y nginx=${NGINX_VERSION} && \
+apt-mark hold nginx

 # Install uv
 RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
@@ -64,10 +73,12 @@ RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps
 echo 'url = "https://pypi.tuna.tsinghua.edu.cn/simple"' >> /etc/uv/uv.toml && \
 echo 'default = true' >> /etc/uv/uv.toml; \
 fi; \
-tar xzf /deps/uv-x86_64-unknown-linux-gnu.tar.gz \
-&& cp uv-x86_64-unknown-linux-gnu/* /usr/local/bin/ \
-&& rm -rf uv-x86_64-unknown-linux-gnu \
-&& uv python install 3.11
+arch="$(uname -m)"; \
+if [ "$arch" = "x86_64" ]; then uv_arch="x86_64"; else uv_arch="aarch64"; fi; \
+tar xzf "/deps/uv-${uv_arch}-unknown-linux-gnu.tar.gz" \
+&& cp "uv-${uv_arch}-unknown-linux-gnu/"* /usr/local/bin/ \
+&& rm -rf "uv-${uv_arch}-unknown-linux-gnu" \
+&& uv python install 3.12

 ENV PYTHONDONTWRITEBYTECODE=1 DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1
 ENV PATH=/root/.local/bin:$PATH
@@ -125,8 +136,6 @@ RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/chromedriver-l
 mv chromedriver /usr/local/bin/ && \
 rm -f /usr/bin/google-chrome

-# https://forum.aspose.com/t/aspose-slides-for-net-no-usable-version-of-libssl-found-with-linux-server/271344/13
-# aspose-slides on linux/arm64 is unavailable
 RUN --mount=type=bind,from=infiniflow/ragflow_deps:latest,source=/,target=/deps \
 if [ "$(uname -m)" = "x86_64" ]; then \
 dpkg -i /deps/libssl1.1_1.1.1f-1ubuntu2_amd64.deb; \
@@ -152,11 +161,14 @@ RUN --mount=type=cache,id=ragflow_uv,target=/root/.cache/uv,sharing=locked \
 else \
 sed -i 's|pypi.tuna.tsinghua.edu.cn|pypi.org|g' uv.lock; \
 fi; \
-uv sync --python 3.12 --frozen
+uv sync --python 3.12 --frozen && \
+# Ensure pip is available in the venv for runtime package installation (fixes #12651)
+.venv/bin/python3 -m ensurepip --upgrade

 COPY web web
 COPY docs docs
 RUN --mount=type=cache,id=ragflow_npm,target=/root/.npm,sharing=locked \
+export NODE_OPTIONS="--max-old-space-size=4096" && \
 cd web && npm install && npm run build

 COPY .git /ragflow/.git
@@ -186,11 +198,8 @@ COPY conf conf
 COPY deepdoc deepdoc
 COPY rag rag
 COPY agent agent
-COPY graphrag graphrag
-COPY agentic_reasoning agentic_reasoning
 COPY pyproject.toml uv.lock ./
 COPY mcp mcp
-COPY plugin plugin
 COPY common common
 COPY memory memory
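
The uv install step in the Dockerfile diff now picks the tarball by CPU architecture instead of hard-coding `x86_64`. A minimal sketch of that mapping as a standalone shell function follows. `uv_tarball_for` is a hypothetical name introduced for illustration; the fallback mirrors the Dockerfile, which treats every value other than `x86_64` as `aarch64`.

```shell
#!/bin/sh
# Sketch only: reproduces the architecture-to-tarball mapping from the
# Dockerfile's uv install step as a reusable function.

# uv_tarball_for MACHINE
# Prints the path of the uv tarball expected under /deps for MACHINE,
# where MACHINE is a value such as the output of `uname -m`.
uv_tarball_for() {
    case "$1" in
        x86_64) uv_arch="x86_64" ;;
        # Same fallback as the Dockerfile: anything else maps to aarch64.
        *)      uv_arch="aarch64" ;;
    esac
    printf '/deps/uv-%s-unknown-linux-gnu.tar.gz\n' "$uv_arch"
}

# Usage: resolve the tarball for the machine running this script.
uv_tarball_for "$(uname -m)"
```

Because of the catch-all branch, an unsupported architecture would silently select the `aarch64` tarball; a stricter variant could fail on unknown values instead.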

Dockerfile.deps

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 FROM scratch

 # Copy resources downloaded via download_deps.py
-COPY chromedriver-linux64-121-0-6167-85 chrome-linux64-121-0-6167-85 cl100k_base.tiktoken libssl1.1_1.1.1f-1ubuntu2_amd64.deb libssl1.1_1.1.1f-1ubuntu2_arm64.deb tika-server-standard-3.0.0.jar tika-server-standard-3.0.0.jar.md5 libssl*.deb uv-x86_64-unknown-linux-gnu.tar.gz /
+COPY chromedriver-linux64-121-0-6167-85 chrome-linux64-121-0-6167-85 cl100k_base.tiktoken libssl1.1_1.1.1f-1ubuntu2_amd64.deb libssl1.1_1.1.1f-1ubuntu2_arm64.deb tika-server-standard-3.2.3.jar tika-server-standard-3.2.3.jar.md5 libssl*.deb uv-x86_64-unknown-linux-gnu.tar.gz uv-aarch64-unknown-linux-gnu.tar.gz /

 COPY nltk_data /nltk_data

README.md

Lines changed: 6 additions & 6 deletions
@@ -22,7 +22,7 @@
 <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
 </a>
 <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.23.1">
+<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.24.0">
 </a>
 <a href="https://github.com/infiniflow/ragflow/releases/latest">
 <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -72,7 +72,7 @@

 ## 💡 What is RAGFlow?

-[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged context engine and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.
+[RAGFlow](https://ragflow.io/) is a leading open-source Retrieval-Augmented Generation ([RAG](https://ragflow.io/basics/what-is-rag)) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs. It offers a streamlined RAG workflow adaptable to enterprises of any scale. Powered by a converged [context engine](https://ragflow.io/basics/what-is-agent-context-engine) and pre-built agent templates, RAGFlow enables developers to transform complex data into high-fidelity, production-ready AI systems with exceptional efficiency and precision.

 ## 🎮 Demo

@@ -188,15 +188,15 @@ releases! 🌟
 > All Docker images are built for x86 platforms. We don't currently offer Docker images for ARM64.
 > If you are on an ARM64 platform, follow [this guide](https://ragflow.io/docs/dev/build_docker_image) to build a Docker image compatible with your system.

-> The command below downloads the `v0.23.1` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.23.1`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.
+> The command below downloads the `v0.24.0` edition of the RAGFlow Docker image. See the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.24.0`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server.

 ```bash
 $ cd ragflow/docker
-
-# git checkout v0.23.1
+
+# git checkout v0.24.0
 # Optional: use a stable tag (see releases: https://github.com/infiniflow/ragflow/releases)
 # This step ensures the **entrypoint.sh** file in the code matches the Docker image version.
-
+
 # Use CPU for DeepDoc tasks:
 $ docker compose -f docker-compose.yml up -d

README_id.md

Lines changed: 5 additions & 5 deletions
@@ -22,7 +22,7 @@
 <img alt="Lencana Daring" src="https://img.shields.io/badge/Online-Demo-4e6b99">
 </a>
 <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.23.1">
+<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.24.0">
 </a>
 <a href="https://github.com/infiniflow/ragflow/releases/latest">
 <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Rilis%20Terbaru" alt="Rilis Terbaru">
@@ -72,7 +72,7 @@

 ## 💡 Apa Itu RAGFlow?

-[RAGFlow](https://ragflow.io/) adalah mesin RAG (Retrieval-Augmented Generation) open-source terkemuka yang mengintegrasikan teknologi RAG mutakhir dengan kemampuan Agent untuk menciptakan lapisan kontekstual superior bagi LLM. Menyediakan alur kerja RAG yang efisien dan dapat diadaptasi untuk perusahaan segala skala. Didukung oleh mesin konteks terkonvergensi dan template Agent yang telah dipra-bangun, RAGFlow memungkinkan pengembang mengubah data kompleks menjadi sistem AI kesetiaan-tinggi dan siap-produksi dengan efisiensi dan presisi yang luar biasa.
+[RAGFlow](https://ragflow.io/) adalah mesin [RAG](https://ragflow.io/basics/what-is-rag) (Retrieval-Augmented Generation) open-source terkemuka yang mengintegrasikan teknologi RAG mutakhir dengan kemampuan Agent untuk menciptakan lapisan kontekstual superior bagi LLM. Menyediakan alur kerja RAG yang efisien dan dapat diadaptasi untuk perusahaan segala skala. Didukung oleh mesin konteks terkonvergensi dan template Agent yang telah dipra-bangun, RAGFlow memungkinkan pengembang mengubah data kompleks menjadi sistem AI kesetiaan-tinggi dan siap-produksi dengan efisiensi dan presisi yang luar biasa.

 ## 🎮 Demo

@@ -188,12 +188,12 @@ Coba demo kami di [https://demo.ragflow.io](https://demo.ragflow.io).
 > Semua gambar Docker dibangun untuk platform x86. Saat ini, kami tidak menawarkan gambar Docker untuk ARM64.
 > Jika Anda menggunakan platform ARM64, [silakan gunakan panduan ini untuk membangun gambar Docker yang kompatibel dengan sistem Anda](https://ragflow.io/docs/dev/build_docker_image).

-> Perintah di bawah ini mengunduh edisi v0.23.1 dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.23.1, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server.
+> Perintah di bawah ini mengunduh edisi v0.24.0 dari gambar Docker RAGFlow. Silakan merujuk ke tabel berikut untuk deskripsi berbagai edisi RAGFlow. Untuk mengunduh edisi RAGFlow yang berbeda dari v0.24.0, perbarui variabel RAGFLOW_IMAGE di docker/.env sebelum menggunakan docker compose untuk memulai server.

 ```bash
 $ cd ragflow/docker
-
-# git checkout v0.23.1
+
+# git checkout v0.24.0
 # Opsional: gunakan tag stabil (lihat releases: https://github.com/infiniflow/ragflow/releases)
 # This steps ensures the **entrypoint.sh** file in the code matches the Docker image version.

README_ja.md

Lines changed: 6 additions & 6 deletions
@@ -22,7 +22,7 @@
 <img alt="Static Badge" src="https://img.shields.io/badge/Online-Demo-4e6b99">
 </a>
 <a href="https://hub.docker.com/r/infiniflow/ragflow" target="_blank">
-<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.23.1">
+<img src="https://img.shields.io/docker/pulls/infiniflow/ragflow?label=Docker%20Pulls&color=0db7ed&logo=docker&logoColor=white&style=flat-square" alt="docker pull infiniflow/ragflow:v0.24.0">
 </a>
 <a href="https://github.com/infiniflow/ragflow/releases/latest">
 <img src="https://img.shields.io/github/v/release/infiniflow/ragflow?color=blue&label=Latest%20Release" alt="Latest Release">
@@ -53,7 +53,7 @@

 ## 💡 RAGFlow とは?

-[RAGFlow](https://ragflow.io/) は、先進的なRAG(Retrieval-Augmented Generation)技術と Agent 機能を融合し、大規模言語モデル(LLM)に優れたコンテキスト層を構築する最先端のオープンソース RAG エンジンです。あらゆる規模の企業に対応可能な合理化された RAG ワークフローを提供し、統合型コンテキストエンジンと事前構築されたAgentテンプレートにより、開発者が複雑なデータを驚異的な効率性と精度で高精細なプロダクションレディAIシステムへ変換することを可能にします。
+[RAGFlow](https://ragflow.io/) は、先進的な[RAG](https://ragflow.io/basics/what-is-rag)(Retrieval-Augmented Generation)技術と Agent 機能を融合し、大規模言語モデル(LLM)に優れたコンテキスト層を構築する最先端のオープンソース RAG エンジンです。あらゆる規模の企業に対応可能な合理化された RAG ワークフローを提供し、統合型[コンテキストエンジン](https://ragflow.io/basics/what-is-agent-context-engine)と事前構築されたAgentテンプレートにより、開発者が複雑なデータを驚異的な効率性と精度で高精細なプロダクションレディAIシステムへ変換することを可能にします。

 ## 🎮 Demo

@@ -168,12 +168,12 @@
 > 現在、公式に提供されているすべての Docker イメージは x86 アーキテクチャ向けにビルドされており、ARM64 用の Docker イメージは提供されていません。
 > ARM64 アーキテクチャのオペレーティングシステムを使用している場合は、[このドキュメント](https://ragflow.io/docs/dev/build_docker_image)を参照して Docker イメージを自分でビルドしてください。

-> 以下のコマンドは、RAGFlow Docker イメージの v0.23.1 エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.23.1 とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。
+> 以下のコマンドは、RAGFlow Docker イメージの v0.24.0 エディションをダウンロードします。異なる RAGFlow エディションの説明については、以下の表を参照してください。v0.24.0 とは異なるエディションをダウンロードするには、docker/.env ファイルの RAGFLOW_IMAGE 変数を適宜更新し、docker compose を使用してサーバーを起動してください。

 ```bash
 $ cd ragflow/docker

-# git checkout v0.23.1
+# git checkout v0.24.0
 # 任意: 安定版タグを利用 (一覧: https://github.com/infiniflow/ragflow/releases)
 # この手順は、コード内の entrypoint.sh ファイルが Docker イメージのバージョンと一致していることを確認します。

@@ -194,8 +194,8 @@

 > `v0.22.0` 以降、当プロジェクトでは slim エディションのみを提供し、イメージタグに **-slim** サフィックスを付けなくなりました。

-1. サーバーを立ち上げた後、サーバーの状態を確認する:
-
+1. サーバーを立ち上げた後、サーバーの状態を確認する:
+
 ```bash
 $ docker logs -f docker-ragflow-cpu-1
 ```
