Releases: tensorchord/VectorChord
0.3.0
Features
Native support for the maxsim operator, with efficient indexing inspired by the XTR-WARP project. This makes it possible to build ColBERT- or ColPali-style multi-vector retrieval applications entirely within PostgreSQL.
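For readers new to multi-vector retrieval, the following is a minimal sketch of the MaxSim (late-interaction) score as it is usually defined for ColBERT-style models: for each query vector, take the best dot product against any document vector, then sum those maxima. The function names here are hypothetical and the code is plain Python for illustration only. It does not show VectorChord's SQL operator, its index, or how the extension signs or scales the result.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def maxsim(query: Sequence[Vector], document: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best match among document vectors."""
    if not document:
        raise ValueError("document must contain at least one vector")
    return sum(max(dot(q, d) for d in document) for q in query)


# Two query token vectors scored against three document token vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
document = [[0.5, 0.5], [0.0, 2.0], [1.0, 0.0]]
score = maxsim(query, document)
print(score)
```

A brute-force scan like this costs one dot product per (query vector, document vector) pair, which is the cost an index on the operator is meant to avoid.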
Improvements
- More KMeans parameters can be configured
- Better progress report for internal KMeans build
What's Changed
- feat: check quals to skip rerank if rerank_in_table is enabled by @usamoi in #206
- feat: maxsim operator and indexing on maxsim operator by @usamoi in #197
- chore: update dependencies by @usamoi in #216
- feat: use mimalloc by @usamoi in #217
- feat: emit INFO when performing kmeans by @usamoi in #219
- feat: encode kmeans progress to phase name by @usamoi in #220
- readme: move docs to docs repo by @usamoi in #222
- chore: change default lists and probes to empty by @usamoi in #223
- feat: allows skipping rerank by @usamoi in #225
- delete cnpg image by @xieydd in #224
- refactor: remove allows_skipping_rerank by @usamoi in #226
- fix: apply epsilon for default search by @usamoi in #228
- chore: remove patch of crate half by @usamoi in #231
- fix: release env vars by @cutecutecat in #232
- chore: update 0.3.0 schema script by @usamoi in #230
Full Changelog: 0.2.2...0.3.0
0.2.2
What's Changed
- Support linux/arm64 for vchord-cnpg docker image by @xieydd in #184
- refactor: split quantized data vectors to two tapes by @usamoi in #196
- chore: add ghcr image by @cutecutecat in #200
- chore: update readme and scripts with url by @cutecutecat in #198
- fix: lint by ruff by @cutecutecat in #201
- fix: wrong operators for halfvec by @cutecutecat in #202
- doc: add new options by @usamoi in #207
- fix: do not leak memory in heap fetch and fix reading tuples in prewarm by @usamoi in #205
- fix: recycle pages in maintainance by @usamoi in #204
- chore: update v0.2.2 schema script by @usamoi in #210
- fix: correct logic of marking free pages by @usamoi in #211
- fix: remove heapify by @usamoi in #213
- fix error in cnpg amd64 Dockerfile by @xieydd in #214
- ci: fix release docker build by @usamoi in #215
Full Changelog: 0.2.1...0.2.2
0.2.1
Major Improvement
Building an index with external centroids is now about 30% faster. Building an index for 100M vectors takes about 30 hours on an i4i.xlarge instance with only 4 vCPUs.
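As a rough sanity check on those figures, the arithmetic below converts the stated build time into an average throughput. It only uses the numbers quoted above (100M vectors, about 30 hours, 4 vCPUs); the helper name is hypothetical, and real builds will not proceed at a constant rate.

```python
def build_throughput(vectors: int, hours: float, vcpus: int) -> tuple[float, float]:
    """Return (vectors per second overall, vectors per second per vCPU)."""
    per_second = vectors / (hours * 3600)
    return per_second, per_second / vcpus


per_second, per_vcpu = build_throughput(100_000_000, 30, 4)
print(f"{per_second:.0f} vectors/s overall, {per_vcpu:.0f} vectors/s per vCPU")
```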
What's Changed
- refactor: move algorithm to a crate by @usamoi in #172
- feat: pinning index in memory when building, second try by @usamoi in #181
- fix: use linked list of vectors to skip realloc by @usamoi in #182
- feat: use select algorithm to replace heap, if k in top-k is expected to be small by @usamoi in #183
- ci: install pg13 in docker image by @usamoi in #186
- ci: use less docker by @usamoi in #187
- feat: rerank by fetching vectors in heap table by @usamoi in #189
- ci: enable CI for pg13 by @usamoi in #185
- chore: update dependencies by @usamoi in #190
- refactor: remove meaningless target feature requirements by @usamoi in #192
- fix: test simd operations in emulator by @usamoi in #193
- feat: neon impl of u8::reduce_sum_of_x by @usamoi in #194
- chore: update 0.2.1 schema (upgrade) script by @usamoi in #195
Full Changelog: 0.2.0...0.2.1
0.2.0
VectorChord 0.2 Release Notes
We are thrilled to announce the release of VectorChord 0.2, advancing vector search capabilities within PostgreSQL.
🚀 New Features
Optimized Storage Layout
- Long Cross-Page Vector Support: The redesigned internal storage allows a vector to span multiple 8KB PostgreSQL pages, enabling support for vectors with more than 2000 dimensions, up to 16000 dimensions.
- Enhanced Storage Efficiency: Achieves higher storage density by minimizing wasted space, reducing index size by up to 50% compared to version 0.1.
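To see why 2000 dimensions is the natural single-page boundary, the sketch below estimates how many 8KB pages the raw float32 payload of one vector needs. It assumes 4 bytes per dimension and the default 8192-byte PostgreSQL page, and it ignores page headers, tuple headers, and any index metadata, so the result is a lower bound rather than VectorChord's actual layout. The function names are hypothetical.

```python
import math

PAGE_SIZE_BYTES = 8192  # default PostgreSQL page size (8KB)


def raw_vector_bytes(dim: int, bytes_per_element: int = 4) -> int:
    """Bytes needed for the vector elements alone (float32 by default)."""
    return dim * bytes_per_element


def min_pages(dim: int, bytes_per_element: int = 4,
              page_size: int = PAGE_SIZE_BYTES) -> int:
    """Lower bound on pages needed, ignoring all header overhead."""
    return math.ceil(raw_vector_bytes(dim, bytes_per_element) / page_size)


for dim in (2000, 4096, 16000):
    print(dim, raw_vector_bytes(dim), min_pages(dim))
```

A 2000-dimensional float32 vector already occupies 8000 of the 8192 bytes in a page, so anything much larger has to be split across pages.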
Additional Data Types
- Float16 Support: Introduces the Float16 data type, which halves the storage required for the original vectors at the cost of a slight decrease in recall. Note that Float16 does not reduce the size of the quantized vectors, which still use 1 bit per dimension of the original vector.
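The storage trade-off can be illustrated with simple arithmetic. The sketch below compares the per-vector size of float32 and float16 originals with a 1-bit-per-dimension quantized code. It assumes 4 and 2 bytes per element respectively and ignores any additional per-vector metadata the index may keep; the function name is hypothetical.

```python
def storage_bytes(dim: int) -> dict[str, int]:
    """Approximate per-vector payload sizes in bytes for a given dimension."""
    return {
        "float32": dim * 4,               # full-precision original
        "float16": dim * 2,               # half-precision original
        "quantized_1bit": (dim + 7) // 8,  # 1 bit per dimension, rounded up
    }


sizes = storage_bytes(1024)
print(sizes)
```

Switching from float32 to float16 halves the first figure, while the quantized size depends only on the dimension and stays the same.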
Architecture Enhancements
- ARM Architecture Support: Rewritten distance calculations and Fast Scan implementations using the Scalable Vector Extension (SVE) instruction set for optimal performance on ARM-based systems.
- AWS Graviton4 Compatibility: Leverage the latest i8g platform based on Graviton4 processors for improved performance at the same cost as i4i models.
⚡ Performance Improvements
- Reduced Index Size: Up to 50% reduction in index size compared to version 0.1.
🔧 Getting Started
- Comprehensive getting started guides will be available soon.
📝 Summary
VectorChord 0.2 introduces support for high-dimensional vectors, Float16 data type, ARM architecture optimizations, and a more compact storage layout. These enhancements collectively improve storage efficiency and query performance, providing a superior vector retrieval experience within PostgreSQL.
What's Changed
- fix: Unlock the conversation in CLA bot by @gaocegege in #86
- fix: CI env SEMVER by @kemingy in #87
- fix: format by @cutecutecat in #90
- feat: max_scan_tuples by @usamoi in #94
- test: fix push down tests by @cutecutecat in #95
- fix: set max dimension to 1600 in readme by @usamoi in #97
- fix: set max dimension to 1600 by @usamoi in #98
- docs: Update README by @VoVAllen in #103
- feat: support up to 60000 dimensions by @usamoi in #100
- chore: directory structure by @usamoi in #104
- feat: vchordrqfscan by @usamoi in #105
- chore: add CI to build the pgrx image by @kemingy in #107
- chore: fix the pgrx ci by @kemingy in #108
- fix: optimize insertions in building when lists = 1 by @usamoi in #106
- chore: build multiarch docker images by @kemingy in #112
- docs(readme): fix markdown style for docker run by @kemingy in #113
- feat: improve internal build by @usamoi in #115
- add enterprise image build step to ci by @xieydd in #114
- chore(README): Add some benchmark data by @gaocegege in #126
- fix: use stable toolchain by default by @usamoi in #128
- feat: scalar8 & indexing on halfvec by @usamoi in #131
- feat: Use dual license (AGPLv3 and ELv2) by @gaocegege in #130
- chore: update base by @usamoi in #137
- fix: remove sudo in dockerfile by @usamoi in #138
- fix: ci by @usamoi in #139
- fix: preprocess for halfvec by @usamoi in #140
- docs: update README for clarity and new features by @VoVAllen in #142
- fix: set an implicit root in external build if parents are not set by @usamoi in #147
- chore: type check and external test by @cutecutecat in #149
- chore: update dependencies by @usamoi in #151
- fix: update docker image in ci by @usamoi in #152
- Update enterprise dockerfile by @xieydd in #148
- feat: build multiarch pgrx image by @kemingy in #153
- chore: impl dereference traits for page guards by @usamoi in #156
- chore: fix the psql & release CI target, update readme by @kemingy in #158
- chore: add postgres sqllogicaltest for arm by @kemingy in #159
- chore: update link in readme by @VoVAllen in #160
- refactor: move pgvecto.rs base to this repo by @usamoi in #161
- fix: pick feat back to vchordrqfscan by @usamoi in #162
- chore: fix discord and x badge by @kemingy in #170
- feat: unify vchordrq and vchordrqfscan by @usamoi in #167
- fix: respect aliasing rule by not reading past of reference by @usamoi in #169
- fix: correct output of prewarm by @usamoi in #173
- fix: add magic number and version number to meta tuple by @usamoi in #174
- Release 0.2.0 by @cutecutecat in #177
- chore: install zip in tensorchord-pgrx by @usamoi in #178
- fix: release package name, version and licenses by @usamoi in #179
Full Changelog: 0.1.0...0.2.0
0.1.1-alpha.1
Highlights
- Support fp16 vec
- Support vector longer than 2000 dim
What's Changed
- fix: Unlock the conversation in CLA bot by @gaocegege in #86
- fix: CI env SEMVER by @kemingy in #87
- fix: format by @cutecutecat in #90
- feat: max_scan_tuples by @usamoi in #94
- test: fix push down tests by @cutecutecat in #95
- fix: set max dimension to 1600 in readme by @usamoi in #97
- fix: set max dimension to 1600 by @usamoi in #98
- docs: Update README by @VoVAllen in #103
- feat: support up to 60000 dimensions by @usamoi in #100
- chore: directory structure by @usamoi in #104
- feat: vchordrqfscan by @usamoi in #105
- chore: add CI to build the pgrx image by @kemingy in #107
- chore: fix the pgrx ci by @kemingy in #108
- fix: optimize insertions in building when lists = 1 by @usamoi in #106
- chore: build multiarch docker images by @kemingy in #112
- docs(readme): fix markdown style for docker run by @kemingy in #113
- feat: improve internal build by @usamoi in #115
- add enterprise image build step to ci by @xieydd in #114
- chore(README): Add some benchmark data by @gaocegege in #126
- fix: use stable toolchain by default by @usamoi in #128
- feat: scalar8 & indexing on halfvec by @usamoi in #131
- feat: Use dual license (AGPLv3 and ELv2) by @gaocegege in #130
- chore: update base by @usamoi in #137
- fix: remove sudo in dockerfile by @usamoi in #138
- fix: ci by @usamoi in #139
- fix: preprocess for halfvec by @usamoi in #140
Full Changelog: 0.1.0...0.1.1-alpha.1
0.1.0
chore: fix release ci (#85) Signed-off-by: Keming <[email protected]>