-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tree sitter memory leak when using the YAML parser in an extension #24742
Comments
Also jfyi: the memory leak has been confirmed to be occurring on both macOS and Linux. |
Thanks for reporting! We'll look into this, as there should be no way a rogue extension can bring Zed down |
Ok, I can reproduce this. It looks like Tree-sitter is going into an infinite loop due to a buggy external scanner in the @kartikvashistha I think that your external scanner is producing zero-length I've added protection against this bug in tree-sitter/tree-sitter#4213; Tree-sitter will now ignore empty external tokens that are considered extra. |
Thanks a lot for having a look at and working on this @maxbrunsfeld! Just to be clear, the external scanner currently used by the Ansible extension should be the same as the one being used by zed's tree sitter yaml. What I never understood is why this wasn't an issue with normal YAML files, which use the same exact scanner 🤔 Additionally, when you get the time, would you mind briefly telling how you proceeded to debug this? I have spent a lot of time trying to debug this and couldn't really understand how to even proceed lol 😅 |
Thanks for pointing that out. I didn't realize they were exactly the same code. I think there is some undefined behavior in the YAML parser such that it behaves differently when loaded via WASM. Investigating... |
Ok, the YAML parser is a bit weird, but I was able to track down the difference in behavior between the WASM and non-WASM code paths, and will fix that in Tree-sitter too. That way, your wasm parser should behave identically to the native YAML one (and won't have the infinite loop described above). |
This fix will go out in Zed 0.175. @kartikvashistha Thanks so much for the reproduction info, and sorry you had to deal with this for so long, while trying to maintain the Ansible extension. |
No worries at all @maxbrunsfeld! Thanks a bunch for the speedy resolution!🥳 |
Summary
Hi team, I'm the author of the Ansible Extension.
For the past few months, we have been facing an issue caused by tree-sitter highlighting wherein it breaks highlighting for a given Ansible file. The memory issue specifically occurs when a text insert is initiated either just before a comment
#
or---
(eg: shift+o when using vim mode on the mentioned symbols). This results in highlighting getting broken, with a memory leak caused specifically byts_malloc_default
from tree_sitter, which is reset only after either the editor is killed by an OOM service or manually closed.One rather odd behaviour that we noticed was that when we tried to re-use a yaml parser for Ansible files, the memory leak started affecting non Ansible YAML files as well. Flummoxed, I created an identical YAML parser for Ansible, with the only difference being the name, and lo-behold this solved the memory leak issue for all non Ansible YAMl files when the extension was installed (and limited the memory leak to only files detected as Ansible files, when inserting text above
#
or---
).I ran heaptrack to better find out where this issue was occurring, and my best guess is that the syntax map gets broken when taking a snapshot is either because the language name isn't sent correctly to
ts_parser_parse
or due to some speed issue when retrieving the name of the grammar which might be mitigated by sayts_parser_set_timeout_micros
orset_sync_parse_timeout
(under Buffer implementation).(Please note: this is only my current best guess, I have no idea how right this is just yet.)
data:image/s3,"s3://crabby-images/85e07/85e07331dfdfea8faa85c1d9aae484fe8155f5b8" alt="Image"
The issue goes away completely if we remove the grammar table from
extension.toml
and re-use the internal YAML grammar, but unfortunately as you might be aware, the extension cli doesn't allow for this. This works fine only when installing the extension locally.Any and all help here will be greatly appreciated. Thanks!
Steps to trigger the problem:
---
, and observe the memory leak on a system monitor tool of your choice.Actual Behavior:
Highlighting breaks & Zed gobbles up all of the memory.
Expected Behavior:
No memory leak occurs.
Link to the heaptrack zst file: https://www.dropbox.com/scl/fi/7m8q4qbvshrzdka60tbxg/heaptrack.zed.109104.zst?rlkey=oj0e92m7kbt3vof54em41kfwm&st=zangv4eq&dl=0
Zed Version and System Specs
Zed: v0.172.11 (Zed)
OS: Linux Wayland Fedora 41
Memory: 31.2 GiB
Architecture: x86_64
GPU: NVIDIA GeForce RTX 2080 SUPER || NVIDIA || 565.77
The text was updated successfully, but these errors were encountered: