Releases: nomic-ai/gpt4all
Releases · nomic-ai/gpt4all
v2.8.0-pre1
What's New
- Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
- Context Menu: Hide Copy/Cut when nothing is selected (#2324)
- Improve speed of context switch after quickly switching between several chats (#2343)
- New Chat: Always switch to the new chat when the button is clicked (#2330)
- New Chat: Always scroll to the top of the list when the button is clicked (#2330)
- Update to latest llama.cpp as of May 9, 2024 (#2310)
- Add support for the llama.cpp CUDA backend (#2310)
- Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings
- When in use: Greatly improved prompt processing and generation speed on some devices
- When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
- Add support for InternLM models (#2310)
Fixes
- Do not allow sending a message while the LLM is responding (#2323)
- Fix poor quality of generated chat titles with many models (#2322)
- Set the window icon correctly on Windows (#2321)
- Fix a few memory leaks (#2328, #2348, #2310)
- Do not crash if a model file has no architecture key (#2346)
- Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
- New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
- macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
- Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)
Full Changelog: v2.7.5...v2.8.0-pre1
v2.7.5
What's New
Fixes
- Fix some issues with anonymous usage statistics (#2270, #2296)
- Default to GPU with most VRAM on Windows and Linux, not least (#2297)
- Fix initial failure to generate embeddings with Nomic Embed (#2284)
New Contributors
Full Changelog: v2.7.4...v2.7.5
v2.7.4
What's New
- Add a right-click menu to the chat (by @kryotek777 in #2108)
- Change the left sidebar to stay open (#2117)
- Limit the width of text in the chat (#2118)
- Move to llama.cpp's SBert implementation (#2086)
- Support models provided by the Mistral AI API (by @Olyxz16 in #2053)
- Models List: Add Ghost 7B v0.9.1 (by @lh0x00 in #2127)
- Add Documentation and FAQ links to the New Chat page (by @3Simplex in #2183)
- Models List: Simplify Mistral OpenOrca system prompt (#2220)
- Models List: Add Llama 3 Instruct (#2242)
- Models List: Add Phi-3 Mini Instruct (#2252)
- Improve accuracy of anonymous usage statistics (#2238)
Fixes
- Detect unsupported CPUs correctly on Windows (#2141)
- Fix the colors used by the server chat (#2150)
- Fix startup issues when encountering non-Latin characters in paths (#2162)
- Fix issues causing LocalDocs context links to not work sometimes (#2218)
- Fix incorrect display of certain code block syntax in the chat (#2232)
- Fix an issue causing unnecessary indexing of document collections on startup (#2236)
New Contributors
- @kryotek777 made their first contribution in #2108
- @xuzhen made their first contribution in #1928
- @Olyxz16 made their first contribution in #2053
- @bentleylong made their first contribution in #2138
- @Tim453 made their first contribution in #2136
- @lh0x00 made their first contribution in #2127
- @robinverduijn made their first contribution in #2180
- @3Simplex made their first contribution in #2183
Full Changelog: v2.7.3...v2.7.4
v2.7.3
What's New
- Groundwork for "removedIn" field of models3.json based on unused "deprecated" field (#2063)
- Implement warning dialog for old Mistral OpenOrca (#2034)
- Make deleting a chat significantly faster (#2081)
- New, smaller MPT model without duplicated tensor (#2006)
- Make API server port configurable by @danielmeloalencar in #1640
- Show settings for the currently selected model by default (by @chrisbarrera in #2099, e2f64f8)
- Keep installed models in the list when searching for models (5ed9aea)
Fixes
- Fix "Download models" button not appearing on some Linux systems (#2040)
- Fix undefined behavior in ChatLLM::resetContext (#2041)
- Fix ChatGPT restore from text after v2.7.1 (#2051 part 1)
- Fix ChatGPT using context from messages that are no longer in history (#2051 part 2)
- Fix TypeError warnings on exit (#2043)
- Fix startup speed regression from v2.7.2 (#2089, #2094)
- Do not show SBert and non-GUI models in choices for "Default model" (#2095)
- Do not list cloned models on downloads page (#2090)
- Do not attempt to show old, deleted models in downloads list (#2098)
- Fix inability to cancel model download (#2107)
New Contributors
- @danielmeloalencar made their first contribution in #1640
- @johannesploetner made their first contribution in #1979
Full Changelog: v2.7.2...v2.7.3
v2.7.2
What's New
- Model Discovery: Discover new LLMs from HuggingFace, right from GPT4All! (83c76be)
- Support GPU offload of Gemma's output tensor (#1997)
- Enable Kompute support for 10 more model architectures (#2005)
- These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM, MiniCPM, Orion, Qwen, and StarCoder.
- Expose min_p sampling parameter of llama.cpp by @chrisbarrera in #2014
- Default to a blank line between reply and next prompt for templates without
%2
(#1996) - Add Nous-Hermes-2-Mistral-7B-DPO to official models list by @ThiloteE in #2027
Fixes
- Fix compilation warnings on macOS (e7f2ff1)
- Fix crash when ChatGPT API key is set, and hide non-ChatGPT settings properly (#2003)
- Fix crash when adding/removing a clone - a regression in v2.7.1 (#2031)
- Fix layer norm epsilon value in BERT model (#1946)
- Fix clones being created with the wrong number of GPU layers (#2011)
New Contributors
- @TareHimself made their first contribution in #1897
Full Changelog: v2.7.1...v2.7.2
v2.7.1
What's Changed
- Completely revamp model loading to support explicit unload/reload (#1969)
- We no longer load a model by default on application start
- We no longer load a model by default on chat context switch
- Save and restore of window geometry across application starts (#1989)
- Update to latest llama.cpp as of 2/21/2024 and add CPU/GPU support for Gemma (#1992)
- Also enable Vulkan GPU support for Phi and Phi-2, Qwen2, and StableLM
Fixes
- Fix visual artifact in update reminder dialog (16927d9)
- Blacklist Intel GPUs as they are still not supported (a1471be, nomic-ai/llama.cpp#14)
- Improve chat save/load speed (excluding startup/shutdown with defaults) (6fdec80, nomic-ai/llama.cpp#15)
- Significantly improve handling of chat-style prompt templates, and reupload Mistral OpenOrca (#1970, #1993)
New Contributors
Full Changelog: v2.7.0...v2.7.1
v2.7.0
What's Changed
- Add 12 new model architectures for CPU and Metal inference (#1914)
These are Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, and StableLM.
We don't have official downloads for these yet, but TheBloke offers plenty of compatible GGUF quantizations. - Restore minimum window size of 720x480 (1b524c4)
- Use ChatML for Mistral OpenOrca to make its output formatting more reliable (#1935)
Bug Fixes
- Fix VRAM not being freed when CPU fallback occurs - this makes switching models more reliable (#1901)
- Disable offloading of Mixtral to GPU because we crash otherwise (#1931)
- Limit extensions scanned by LocalDocs to txt, pdf, md, rst - other formats were inserting useless binary data (#1912)
- Fix missing scrollbar for chat history (490404d)
- Accessibility improvements (4258bb1)
New Contributors
Full Changelog: v2.6.2...v2.7.0
v2.5.1
(The previous release of this version was mis-tagged - the release has been re-created and installers have been reuploaded.)
What's Changed
- Removed extraneous and shortened text in accessibility name and description fields by @vick08 in #1532
- Fix an issue on Windows where the UI wouldn't start, or stayed open in the background after closing it (#1556)
New Contributors
Full Changelog: v2.5.0...v2.5.1
v2.5.0
(The previous release of this version was mis-tagged - the release has been re-created and installers have been reuploaded.)
What's Changed
- Major new release supports GGUF models only!
- New models like Mistral Instruct, Replit 1.5, Rift Coder and more
- All previous version of ggml-based models are no longer supported
- Extensive changes to Vulkan support
- Prompt processing on the GPU
- chat: clearer CPU fallback messages (#1478)
- Restore state from text (#1493)
- Fix compatiblity with non-AVX2 CPUs (#1488)
- llmodel: print an error if the CPU does not support AVX (#1499)
- always save chats to disk, but save them as text by default (#1495)
New Contributors
- @kevinbazira made their first contribution in #1444
- @jphme made their first contribution in #1485
- @umarmnaq made their first contribution in #1446
- @lordofthejars made their first contribution in #1414
Full Changelog: v2.4.19...v2.5.0
v2.6.2
What's Changed
- Fix crash when deserializing chats with saved context from 2.5.x and earlier (#1859)
- New light mode and dark mode UI themes (#1876)
- Update to latest llama.cpp after merge of Nomic's Vulkan PR (#1819, #1883)
- Much faster prompt processing on Linux and Windows thanks to re-enabled GPU support in the Vulkan backend
- Support offloading only some layers of the model if you have less VRAM (#1890)
- Support Maxwell and Pascal Nvidia GPUs (#1895)
Fixes
- Don't show "retrieving localdocs" if there are no collections (#1874)
- Fix potential crash when loading fails due to insufficient VRAM (6db5307, Issue #1870)
- Fix VRAM leak when switching models (Issue #1840)
- Support Nomic Embed as LocalDocs embedding model via Atlas (d14b95f)
New Contributors
- @realKarthikNair made their first contribution in #1871
Full Changelog: v2.6.1...v2.6.2