
[BUG]: "QNN Engine is offline." when using a Snapdragon X Plus laptop #2962

Open
barealek opened this issue Jan 10, 2025 · 16 comments
Assignees: timothycarambat
Labels: core-team-only, Desktop, investigating (Core team or maintainer will or is currently looking into this issue), possible bug (Bug was reported but is not confirmed or is unable to be replicated)

Comments

@barealek

barealek commented Jan 10, 2025

How are you running AnythingLLM?

AnythingLLM desktop app

What happened?

When trying to run inference with any QNN model on a Snapdragon X Plus laptop, the error below occurs.
[screenshot]

The logs specify that the required CPU/NPU is not found:

{
  "level": "info",
  "message": "\u001b[36m[QnnNativeEmbedder]\u001b[0m QNN API server is not supported on this platform - no valid CPU/NPU found. {\"validCores\":[\"Snapdragon(R) X Elite\"],\"cores\":[\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\",\"Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz\"]}",
  "service": "backend"
}
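The check implied by this log can be sketched as follows; this is a hypothetical reconstruction for illustration, not AnythingLLM's actual code. The engine appears to compare each reported core model against a hard-coded allow-list, so the Plus's "X1P64100" cores never match the "Snapdragon(R) X Elite" entry:

```javascript
// Hypothetical sketch of the platform check suggested by the log above.
// The core strings and allow-list are taken from the logged JSON; the
// function name and shape are assumptions.
function hasValidCore(cores, validCores) {
  // The platform passes only if some core model contains an allow-listed name.
  return cores.some((model) => validCores.some((valid) => model.includes(valid)));
}

const validCores = ["Snapdragon(R) X Elite"];
const plusCores = Array(10).fill("Snapdragon(R) X 10-core X1P64100 @ 3.40 GHz");
console.log(hasValidCore(plusCores, validCores)); // false
```

A substring match like this would explain why Plus chipsets were rejected until Plus core names (or Elite-compiled model support) were added to the list.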

[screenshot]

After starting AnythingLLM and reproducing the error, the full log looks like this:

{"level":"info","message":"\u001b[36m[EncryptionManager]\u001b[0m Loaded existing key & salt for encrypting arbitrary data.","service":"backend"}
{"level":"info","message":"\u001b[32m[TELEMETRY ENABLED]\u001b[0m Anonymous Telemetry enabled. Telemetry helps Mintplex Labs Inc improve AnythingLLM.","service":"backend"}
{"level":"info","message":"prisma:info Starting a sqlite pool with 21 connections.","service":"backend"}
{"level":"info","message":"prisma:info Started query engine http server on http://127.0.0.1:51049","service":"backend"}
{"level":"info","message":"\u001b[32m[TELEMETRY SENT]\u001b[0m {\"event\":\"server_boot\",\"distinctId\":\"ea8cb903-cdc7-4dbc-898a-f0c70402eefb\",\"properties\":{\"runtime\":\"desktop\"}}","service":"backend"}
{"level":"info","message":"Skipping preloading of AnythingLLMOllama - LLM_PROVIDER is qnnengine.","service":"backend"}
{"level":"info","message":"Hot loading of QnnEngine - LLM_PROVIDER is qnnengine with model llama_v3_2_3b_chat_8k.","service":"backend"}
{"level":"info","message":"\u001b[36m[NativeEmbedder]\u001b[0m Initialized","service":"backend"}
{"level":"info","message":"\u001b[36m[QNN Engine]\u001b[0m Initialized with model: llama_v3_2_3b_chat_8k. Context window: 4096","service":"backend"}
{"level":"info","message":"\u001b[36m[CommunicationKey]\u001b[0m RSA key pair generated for signed payloads within AnythingLLM services.","service":"backend"}
{"level":"info","message":"\u001b[36m[EncryptionManager]\u001b[0m Loaded existing key & salt for encrypting arbitrary data.","service":"backend"}
{"level":"info","message":"[production] AnythingLLM Standalone Backend listening on port 3001. Network discovery is disabled. NPU Detected: false","service":"backend"}
{"level":"info","message":"\u001b[36m[BackgroundWorkerService]\u001b[0m Feature is not enabled and will not be started.","service":"backend"}
{"level":"info","message":"\u001b[36m[QNN Engine]\u001b[0m Boot failure for port 8080","service":"backend"}
{"level":"info","message":"\u001b[36m[NativeEmbedder]\u001b[0m Initialized","service":"backend"}
{"level":"info","message":"\u001b[36m[NativeEmbedder]\u001b[0m Initialized","service":"backend"}
{"level":"info","message":"\u001b[36m[QNN Engine]\u001b[0m Initialized with model: llama_v3_2_3b_chat_8k. Context window: 4096","service":"backend"}
{"level":"info","message":"\u001b[36m[QNN Engine]\u001b[0m Boot failure for port 8080","service":"backend"}
{"level":"error","message":"Error: QNN Engine is offline. Please reboot QNN Engine or AnythingLLM app.\n    at s.checkReady (C:\\Users\\aleks\\AppData\\Local\\Programs\\AnythingLLM\\resources\\backend\\server.js:31:1604)\n    at async s.streamGetChatCompletion (C:\\Users\\aleks\\AppData\\Local\\Programs\\AnythingLLM\\resources\\backend\\server.js:31:2741)\n    at async SL (C:\\Users\\aleks\\AppData\\Local\\Programs\\AnythingLLM\\resources\\backend\\server.js:236:2892)\n    at async C:\\Users\\aleks\\AppData\\Local\\Programs\\AnythingLLM\\resources\\backend\\server.js:236:4507","service":"backend"}
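The `checkReady` frame in the stack trace suggests the backend probes the engine's local HTTP port before each chat. A minimal sketch of such a probe (the `/health` path and exact behavior are assumptions, not AnythingLLM's real implementation) might look like:

```javascript
// Minimal sketch of a readiness probe like the checkReady in the trace above.
// The endpoint path is an assumption for illustration. Requires Node 18+ for
// the global fetch.
async function checkReady(port = 8080) {
  try {
    const res = await fetch(`http://127.0.0.1:${port}/health`);
    return res.ok;
  } catch {
    // Nothing listening on the port -> surface the error seen in the log.
    throw new Error("QNN Engine is offline. Please reboot QNN Engine or AnythingLLM app.");
  }
}
```

Under this reading, the "Boot failure for port 8080" lines mean the engine process never bound the port, so every probe falls into the catch branch.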

Are there known steps to reproduce?

No response

barealek added the possible bug label Jan 10, 2025
@lachlanharrisdev

lachlanharrisdev commented Jan 11, 2025

Same issue on Snapdragon X Elite for me

@timothycarambat
Member

> Same issue on snapdragon x elite for me

Is this after downloading a model? Also have you tried a reboot post-download of the model?

Second, @barealek - I just got confirmation that we can run Elite compiled models on Plus chipsets, so we will patch that and re-release 1.7.2

@lachlanharrisdev

> Is this after downloading a model? Also have you tried a reboot post-download of the model?

I downloaded and tried to run a model, choosing the Qualcomm LLM provider and NPU embedder, but it came up with the error. It still failed after fully rebooting the app and restarting my computer; I then tried all of the same steps after uninstalling and reinstalling the app, which still didn't work.

I haven't done any additional setup of the NPU or anything outside of AnythingLLM, so I'm wondering if there is some driver(s) I'm missing? I'll let the experts figure it out.

@timothycarambat
Member

timothycarambat commented Jan 11, 2025

@lachlanharrisdev - we just pushed a new build for arm64 1.7.2-r2-arm64 (version is located in top right of app window). If you don't have that version installed, download the new build and you should be okay now.

Also, what device + chipset are you on? Plus, Elite, etc.?

timothycarambat self-assigned this Jan 11, 2025
timothycarambat added the core-team-only, Desktop, and investigating labels Jan 11, 2025
@lachlanharrisdev

@timothycarambat I've just installed the new build and it's still failing, but it's behaving differently. After I upgraded to the new version and sent a chat, it came up with the error QNN Engine is booting. Please wait for it to finish and try again. I gave it a couple of seconds and sent a chat again, and after ~14 seconds of loading it came up with the error from before, QNN Engine is offline. Please reboot QNN Engine or AnythingLLM app.

What I noticed is that when booting up AnythingLLM, right before the loading screen switches to the home UI, a task called "AnythingLLMQnnEngine" pops up for a split second in Task Manager, but it seems to end itself very quickly. The same task also pops up after I send a chat, once the QNN Engine "boots", but again it quickly closes itself.

I'm currently on a Surface Laptop 7 15", running the X Elite X1E-80-100.

@timothycarambat
Member

timothycarambat commented Jan 11, 2025

@lachlanharrisdev I wrote this up to debug the engine directly (app should be closed)
https://docs.google.com/document/d/1Uk9WKCXz0a6tuKeWbaoSD1gDUGglBVycNgJBsDZJB2k/edit?usp=sharing

I have the same chipset on a Dell Latitude.

@lachlanharrisdev

@timothycarambat yep, that found the issue

[WARN]  "Unable to initialize logging in the backend."
[ERROR] "Could not initialize backend due to error = 4000"
[ERROR] "Qnn initializeBackend FAILED!"
Failure to initialize model
Error: Failed to create the Genie Dialog.

If it's relevant, this was using llama 3.1 8b, not 3.2 3b.

@timothycarambat
Member

@lachlanharrisdev Now this is a very different issue from the others, then. If you run the command as administrator, does it still fail to initialize? I am wondering how/why you would require admin to execute the LLM engine; someone else had success with that, and I have to determine why that would ever be the case for anyone, since admin rights should not be required to start the QNN LLM API.

@timothycarambat
Member

The recent patch seemed to solve most issues people had (most Plus support was not enabled), but this is certainly something different.

@lachlanharrisdev

lachlanharrisdev commented Jan 11, 2025

> If you run the command as administrator does it still fail to initialize?

@timothycarambat nope, running it as admin now works and I do see QNN running on localhost.

[INFO]  "Using create From Binary"
[INFO]  "Allocated total size = 609845760 across 10 buffers"
AnythingLLMQnnEngine API Server: Starting chat API on host 127.0.0.1:8000
Build: 1.0.1 80b0117 Fri Dec 27 13:07:34 2024

I tried running AnythingLLM as administrator and, after the QNN engine boots, I can successfully chat. This works for me, but I'm more than happy to keep testing things out for you; I'd love to contribute in any way I can. Should we create a new issue and continue there?

@AlphaEcho11

> @lachlanharrisdev - we just pushed a new build for arm64 1.7.2-r2-arm64 (version is located in top right of app window). If you don't have that version installed, download the new build and you should be okay now.

Build 1.7.2-r2-arm64 seems to be working well while running in Administrator mode.
The QNN engine still appears to fail, especially in build 1.7.2-arm64, with an additional boot failure on port 8080 (on some devices; cannot confirm all SoCs in play).

Happy hunting, everyone!

@lachlanharrisdev

lachlanharrisdev commented Jan 11, 2025

@timothycarambat I've just restarted my PC and now it seems to no longer work, even with administrator mode... I'm guessing the same QNN Engine instance stayed online from the instructions in the Google doc, and AnythingLLM used that instance instead of booting another one (if that's even possible, I know barely anything about AI and Qualcomm). Hopefully that clears up any confusion.

@AlphaEcho11 interesting, what device are you using? Just wondering if this is only a surface laptop thing

@AlphaEcho11

@lachlanharrisdev - Surface Pro 11 here, on the X Elite. After several device reboots and AnythingLLM refreshes, it's been working without issue.
What's the output of the backend logs when you have the device rebooted and attempting to get QNN engine up? Curious if it's failing or something further. Thanks in advance!

@barealek
Author

> From the recent patch that seemed to solve most issues people had (most Plus support was not enabled) but this is certainly something different

I am still having issues, even when launching as an administrative account. It seems like it's starting up now, and I get a message that roughly says "QNN is still booting, please wait", but then it just crashes and the QNN engine goes offline. Here are my logs:
backend-2025-01-11.log

@AlphaEcho11

> From the recent patch that seemed to solve most issues people had (most Plus support was not enabled) but this is certainly something different
>
> I am still having issues, even when launching as an administrative account. It seems like it's starting up now, I get a message that roughly says "QNN is still booting, please wait", but then it just crashes and the QNN engine goes offline. Here's my logs:
> backend-2025-01-11.log

Thank you for the logs! Yes, seeing the QNN engine fail to get online here; going to check one more area and see if another variable is at play.

@AlphaEcho11

> From the recent patch that seemed to solve most issues people had (most Plus support was not enabled) but this is certainly something different
>
> I am still having issues, even when launching as an administrative account. It seems like it's starting up now, I get a message that roughly says "QNN is still booting, please wait", but then it just crashes and the QNN engine goes offline. Here's my logs: backend-2025-01-11.log

Can you reattempt this with the 8B model as well? Following @timothycarambat's previous recommendations and tweaking:

  1. Download & unpack the model
  2. Restart AnythingLLM (in administrator mode)
  3. Load up workspace running the Qualcomm QNN engine with the model requested
  4. Watch for NPU performance, check backend log for QNN engine data

Let us know the results!
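For step 4, the QNN-related entries can be pulled out of the backend log with a small filter. A sketch (the helper name is mine; the log format matches the JSON-per-line entries shown earlier in this thread, and the filename is the one attached above):

```javascript
// Small helper for step 4: filter a backend log down to QNN Engine lines.
// Works on the JSON-per-line log format shown earlier in this thread.
function qnnLines(logText) {
  return logText.split("\n").filter((line) => line.includes("[QNN Engine]"));
}

// Usage, pointing at your own backend log file:
// const fs = require("fs");
// console.log(qnnLines(fs.readFileSync("backend-2025-01-11.log", "utf8")).join("\n"));
```

This surfaces lines like "Boot failure for port 8080" without wading through the embedder and telemetry entries.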
