GraalJS memory leak in Engine #121
Hi @mdsina, thanks for your question. I have not looked at your code yet, but the generic best practice when you want code to be cached is described here:
http://www.graalvm.org/docs/graalvm-as-a-platform/embed/#enable-source-caching As you don't see any memory leak in your third option, I don't think this is a memory leak in our core JS engine code. Before we investigate further, please make sure you properly share the Engine, close the Contexts, and only set the caching flag on Sources that you actually want cached. Thanks,
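The pattern described here (shared Engine, per-request Context, cached Source) might look roughly like the sketch below. The class name, file name, and the evaluated snippet are illustrative, not from the original report:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.Source;
import org.graalvm.polyglot.Value;

public class SharedEngineExample {
    // One Engine shared by all Contexts enables cross-context code caching.
    private static final Engine ENGINE = Engine.create();

    // Build the Source once and mark it cached, so the parsed representation
    // can be reused by every Context created on the shared Engine.
    private static final Source SOURCE;
    static {
        try {
            SOURCE = Source.newBuilder("js", "6 * 7", "snippet.js")
                    .cached(true)
                    .build();
        } catch (java.io.IOException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static int run() {
        // A fresh Context per request; try-with-resources guarantees close().
        try (Context context = Context.newBuilder("js").engine(ENGINE).build()) {
            Value result = context.eval(SOURCE);
            return result.asInt();
        }
    }
}
```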
Thanks for your answer, @wirthi
One thing I noticed: when I build a Context and put members into the JS bindings, such as require or another member backed by a singleton Java object (my example does that), the leak appears. The third option obviously has no memory-leak problem, because the Engine apparently keeps references to some Context objects internally. If the Engine is destroyed after every invocation, there is no problem.
Hi @mdsina, we have fixed a memory-leak problem around objects with lots of (different) properties. This fix has landed in RC14 and might fix the problem you reported. Best,
Hi @mdsina, can you confirm this problem is solved for you by a newer release (RC14 or later, or GraalVM 19)? Thanks,
I hit what I think was the same problem on rc12 and tried upgrading to rc16, and still saw the memory leak. I haven't tried on v19 yet.
@wirthi
Checked this out on Graal 19.3.0. Same problem.
I believe we are both hitting #268. I am wondering if there is a release plan for GraalVM and whether this issue is going to be looked at. I can provide some help, eventually.
Hi, what would be super-helpful for us would be an example that easily reproduces the problem, ideally just one Java file without any other dependency. From what I read in your descriptions (especially in #268), the problem stems from DynamicObjects/Shapes, i.e. from objects being created on the JavaScript side. So, in theory, the problem should be reproducible by something as simple as (pseudocode):
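The pseudocode itself was not preserved in this thread. A plausible reconstruction, based on the description above (repeatedly creating JavaScript objects in fresh Contexts on a shared Engine, and watching whether heap usage grows), might be:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

public class LeakReproducer {
    public static void main(String[] args) {
        Engine engine = Engine.create();
        // If Shapes/DynamicObjects leak, heap usage should grow
        // without bound over these iterations.
        for (int i = 0; i < 1_000_000; i++) {
            try (Context ctx = Context.newBuilder("js").engine(engine).build()) {
                // Create objects with many properties on the JS side.
                ctx.eval("js",
                    "var o = {}; for (var j = 0; j < 100; j++) { o['p' + j] = j; }");
            }
        }
        engine.close();
    }
}
```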
Best,
Hi Christian, thanks for the help,
Hi again, I did my best to try to reproduce the leak I see in my application (#268), but without success. What you see in the attached test application is basically exactly what I do in my software, but with this test application there seems to be no leak. The only difference is that in my application the returned object is kept in memory for some time, while in the test application the returned object is not really used. But I can't believe that this is the problem, since typically these objects are primitive objects. Also, in my original application, if I do not cache the Engine object and I put it in a try-with-resources, I see no leak (but performance is bad), while if I cache it, I start seeing the leak... which I do not see in this test application! In an attempt to work around the leak, I even cached the Engine object behind a SoftReference, so that in case of high memory consumption the GC could still reclaim it. If I do that, I still see the leak, even though I see via VisualVM that the memory (and the number of related Shape objects) actually goes down a few times when the GC kicks in. I suspect that, since nobody actually closes the Engine, something remains cached somewhere in the GraalVM JS implementation... So I am actually clueless... maybe the problem is somewhere else, but I find that hard to believe if changing literally two lines of code in a single class (caching the Engine and the Source, or not doing it) makes the problem appear or disappear. It is sad, because GraalVM JS is the script engine that provides by far the best performance I could find out there...
Hi, I also seem to have a GraalVM/JS-related memory leak, but with different objects. Here are the first entries of my jmap histogram:
I made sure to call close() on my Contexts (I even added a finalizer which checks that it was called), but the JVM still goes OOM after some time. I will try to gather more information. My idea is to check for ThreadLocals using reflection; my assumption is that something is still held in ThreadLocals. Does anyone perhaps have more or better ideas on how to approach this? Regards,
Hi Christian, the only thing I can say with reasonable confidence is that, if you close the Engine object, the memory is properly deallocated. Using a SoftReference to an ad-hoc class holding the reference to the Engine, and adding a finalizer that closes the Engine when the holder class is garbage collected, helps keep the memory under control. As I wrote, every attempt to reproduce the behaviour in a unit test has failed on my side... :( Best regards,
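The workaround described above, as I understand it, might be sketched like this. All names are illustrative, and a `java.lang.ref.Cleaner` would be the modern replacement for the deprecated finalizer:

```java
import java.lang.ref.SoftReference;
import org.graalvm.polyglot.Engine;

public class EngineCache {
    /** Holder that closes its Engine when it is garbage collected. */
    static final class EngineHolder {
        final Engine engine = Engine.create();

        @Override
        @SuppressWarnings({"deprecation", "removal"})
        protected void finalize() {
            // Closing the Engine here is what actually releases the
            // Shapes and other engine-internal caches.
            engine.close();
        }
    }

    private static SoftReference<EngineHolder> cache = new SoftReference<>(null);

    /** Returns a cached Engine, recreating it if the GC reclaimed the holder. */
    static synchronized Engine engine() {
        EngineHolder holder = cache.get();
        if (holder == null) {
            holder = new EngineHolder();
            cache = new SoftReference<>(holder);
        }
        return holder.engine;
    }
}
```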
Hi @dariol83: perhaps it's some kind of code cache in the engine that is growing? I tried to have changing code in my test, but also failed to reproduce the behaviour. Might I ask what the objects on your heap are (e.g. as shown in jmap -histo)? In my application, even though I do
, I still get a growing heap and a memory histogram like the one posted above. Regards,
Hi Christian, you can have a look here: #268 (comment) Best regards,
One thing I figured out: for every transaction in my app (for which I create a Context/Engine and close it again) I get exactly one entry of com.oracle.truffle.polyglot.HostClassCache and org.graalvm.polyglot.HostAccess, which are never freed. Edit: it turns out this is what happens when you create a new HostAccess instance for every Engine. Using a static final HostAccess fixes the leak. Strangely, this was not an issue with my test program; I was doing the same thing there, but had no leak.
Hi Christian, could you please post a snippet of code documenting exactly how you use a cached HostAccess, so that I can test it here as well and confirm the finding? Thanks a lot in advance!
Hi @dariol83 , I simply hold the host-access as a static final variable:
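The snippet itself was lost from this thread. Based on the description, it presumably looked something like the following; the class name and the specific builder options are illustrative assumptions, not the original code:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.HostAccess;

public class Scripting {
    // One HostAccess instance reused by every Engine/Context, so
    // HostClassCache entries are not accumulated per engine.
    private static final HostAccess HOST_ACCESS =
            HostAccess.newBuilder(HostAccess.EXPLICIT)
                    .allowArrayAccess(true)
                    .allowListAccess(true)
                    .build();

    static Context newContext() {
        return Context.newBuilder("js")
                .allowHostAccess(HOST_ACCESS)
                .build();
    }
}
```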
Before I had the following, which was leading to a memory leak:
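This earlier, leaking variant was also lost; reconstructed from the description (again with illustrative names and options), it would have built a fresh HostAccess per Context/Engine:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.HostAccess;

public class LeakyScripting {
    static Context newContext() {
        // A new HostAccess per engine was reported to leave one
        // HostClassCache/HostAccess entry behind per transaction.
        HostAccess perEngineAccess =
                HostAccess.newBuilder(HostAccess.EXPLICIT)
                        .allowArrayAccess(true)
                        .allowListAccess(true)
                        .build();
        return Context.newBuilder("js")
                .allowHostAccess(perEngineAccess)
                .build();
    }
}
```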
regards,
Hi Christian, I see, thank you. In my code I am using a straightforward
and I was getting the leak. I will try to use a custom cached HostAccess and see if something changes. I will report here. Best regards,
I applied the change, but the leak on the ShapeBasic objects and the related other objects is still there.
@dariol83: I've been keeping an eye on this for a while now. In production I seem to have the same issue (also lots of ShapeBasic objects), but only there. I was not able to reproduce it locally (even though I tried with the same data & requests). Perhaps it only happens when multiple threads are involved (which I have not tested so far)? Have you tested such scenarios, or perhaps others?
Hi @dariol83 thanks for your reproducer application (that does not reproduce the problem, unfortunately). I had a quick look at it and can confirm I don't see any immediate problem there. But let me try a few remarks:
My best guess is around JavaScript objects and Shapes being created. Thus, to get your example into a form where it exposes the problem, can you maybe try to add returning the object as you state, and see if that changes anything? In the current example code, you are not creating any objects dynamically. Best, |
Hi @wirthi, my apologies for not progressing on this investigation; I had personal issues recently, but I will try a new analysis starting from your suggestions:
I will keep digging here. Thanks for all the valid points! Best regards, |
Hi, I observed similar symptoms to the ones described by @horschi, and his solution worked in my case as well (btw, thanks Christian!). I prepared a reproducer project for this memory leak. Regards,
@xardasos Glad I could help :-) Since I made the HostAccess static, things have been smooth. I see you are using Graal version 19... in your test. Have you tried newer versions as well? Edit: whatever issue I am having in 21.0.0.2, I cannot reproduce it in a simple project.
I also tried to run my test with 20.3.2 and 21.1.0 (Java 11); the leak is still there. I also checked the 20.3.1.2 version (Java 11), but the result is the same.
Hi @xardasos thanks for your example. Regarding the You create an Finally, is there a specific reason why you use
6.6 MB heap usage after ~300.000 iterations of your loop. Edit: moving the Best, |
I would like to chime in with my observations. I'm using the latest Graal. For my tests I used to not use the JIT compiler but rather the "default" settings (the interpreter), and all was fine. The latest version (from a few days ago) started outputting some warnings, so I enabled JIT compilation for the tests ( I use a single Engine, create a new Context with it for every execution (using a try-with-resources block to close it afterwards), and use shared Source objects. At the end of the tests the Engine is closed. Digging in, I noticed the following: even though the Engine is closed, there are a few threads named Perhaps that's one source of memory "leakage" that people observe. It is indeed cleared after a GC run, but why don't the compiler threads stop immediately after I close all Contexts and the Engine? Otherwise that's kind of useless: I stop the Engine, my other tests continue, and I get an OOM because Graal is still compiling stuff I won't need and my memory is exhausted. 😄
I have the same observation as @boris-petrov. I have a REST API that calls the GraalJS script engine. I also use a single Engine, create a new Context from it for every execution, and share the Source object. I ran a test with 100 concurrent connections to the API. Strangely, the heap size shot up to 2.5 GB only after all the calls were complete, and after garbage collection a few minutes later the heap shot up again to 4 GB. As it is a REST API, absolutely no other threads should be consuming the heap space. I tried the latest GraalVM 21.2.0 version and it still has this issue. Running on Java 11 with the GraalVM JavaScript engine library (https://mvnrepository.com/artifact/org.graalvm.js/js) does not have this issue, so I am using that solution for now, but I hope this issue can be resolved so that I can get improved performance in my application.
Is there any update on this issue, please? I tried stock Java with the upgrade-module-path method to enable the Graal compiler, and still got the same memory leak. Not providing the upgrade module path avoids the memory leak, but then performance is worse than Nashorn.
Hi @limanhei
We will most likely only be able to help you if you provide an executable example that exhibits the behavior, ideally a minified version that has the code around the The only thing I can see in your heap snapshot is that the GraalVM compiler is still active (lots of In general, we are not aware of any open memory leaks, so we don't have any clue what to investigate. As I have written above, we typically find that the Context or Engine API is used wrongly, so that it is harder or impossible for the garbage collector to clean up. So seeing your actual usage of the API might give us a hint about what is wrong.
The best possible solution is to use GraalVM directly; then you should not have any trouble with setting the correct paths. If that is not possible, https://www.graalvm.org/reference-manual/js/RunOnJDK/ and https://github.com/graalvm/graal-js-jdk11-maven-demo should show you how to properly set up Graal.js and the GraalVM compiler on a stock JDK installation. Best,
Hi Christian, the JavaScript codebase itself is 2.4 MB, consisting of 373 functions in 14,364 lines. Is that considered large? I tried to replicate the issue in my own code, but due to security policy I couldn't create a minified version with this JavaScript codebase. I hope you can provide some optimization for large codebases so that I can move my application to GraalVM. Thanks a lot for your help! The way I eval the source (the snippet was truncated in the original post):

```kotlin
private val engine = Engine.create()

init {
    // …
}

fun execute(executionObj) {
    // …
}
```
Hi @limanhei, that piece of code looks fine. You are using a shared Engine, you create a new Context each time, and you use a cached Source object. That should trigger source-code caching and avoid the recompilation problem. Maybe you can call your
Each line means that a certain part of the guest-language program was compiled. This will happen a lot initially, but should happen less and less frequently. While this compilation is going on, memory usage will be higher. Once there is no more compilation, memory usage should be lower (and, except for some overhead, represent whatever your application needs). With 2.4 MB of source code, that should take a few seconds, maybe up to a minute or two, but it highly depends on the actual patterns used in your code, how much of the code is executed repeatedly, how much is only initialization code, etc. Maybe let it run until there are no more Two additional things to learn from that output:
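The compilation log being discussed here can typically be enabled with an engine option; the exact flag spelling varies across GraalVM versions, and `my-app.jar` is a placeholder, so treat this as a sketch:

```shell
# Print one line per guest-language compilation ("opt done" / "opt failed").
# Flag name may differ between GraalVM releases; check your version's docs.
java -Dpolyglot.engine.TraceCompilation=true -jar my-app.jar
```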
(Minor) I don't think you strictly need the Best, |
Hi Christian, I saw one opt failed: GraphTooBigBailoutException: Graph too big to safely compile. Node count: 400002. Limit: 400000. The compilation only started when I stopped the load test. Except for the failed compilation above, the others are opt done. After a while, the compilation still continues, and then the container gets killed.
I think the problem I encountered is that JVMCI actually uses memory outside the JVM heap, so increasing the memory of the container without lowering MaxRAMPercentage has minimal effect. After I increased the container memory from 2 GB to 4 GB and lowered MaxRAMPercentage to 50 percent, I think it is stable. However, I am not sure how much memory JVMCI requires, which leaves a risk of OOM-kill. Is there any way to limit its memory usage?
Hi,
That is not critical: it means that our compiler was not able to compile that method because it became too big. That could have a performance impact if it is a frequently executed method, but it does not cause any correctness or memory issues. Can you state which method is reported to be affected? If it is from a public library (or you could share the relevant code), we might be able to look into the problem.
I am not sure what exactly you mean by that, but in general, compilation of JavaScript methods should begin as soon as you execute them repeatedly. Maybe you just execute initialization code first, which does not contain any repeated function calls or loops in JavaScript? It could also be that, since compilation happens on separate threads, saturating the available cores with other high-priority threads starves the compilation threads so they never get a chance to do their job.
Because it ran out of memory?
I believe you have two crucial questions: how much memory do you require at most while JIT compilation is happening, and how much do you require for executing the application after the JIT compiler has done its job. The first number is crucial, as you need to provide at least that much; the second number should be much lower, as the compiler and all the data and memory it requires should eventually disappear. Maybe you can tweak the number of compiler threads and thus trade compilation time against memory requirements, if memory is crucial for you? Again, all this is theoretical discussion. To really help you, we'd need to see the source of your application or at least have some insight into your architecture, heap dumps, compilation logs, etc. Christian
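One way to make that trade-off concrete is via JVM flags. Flag names vary between GraalVM versions and `my-app.jar` is a placeholder, so verify against your release before relying on this:

```shell
# Limit background JIT compilation to a single thread to cap the
# compiler's transient memory use, at the cost of slower warmup,
# and leave heap headroom for off-heap JVMCI/compiler allocations.
java -Dpolyglot.engine.CompilerThreads=1 \
     -XX:MaxRAMPercentage=50 \
     -jar my-app.jar
```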
Thanks Christian! All the JS code is developed by us; most of it is just if/then/else, nothing complicated. We are running the application in a pod, so we need to know how much memory to allocate, otherwise we may easily run into an OOM-kill. From my observation, the JIT compiler seems to use memory outside the JVM heap (correct me if I am wrong), so when I allocate memory to the pod I need to take this into account. Another observation from my load test: when I test with 1 thread, performance on GraalVM is better than HotSpot + the Graal.js interpreter. However, with 75 client threads, HotSpot + interpreter had better performance; I am not sure if that is because the script engine is shared.
* [jsscripting] Fix memory-leak caused by com.oracle.truffle.host.HostObject
  Fixes this memory leak by making the HostAccess for the GraalJSScriptEngine available in a static final variable instead of creating it for each new engine. Solution proposed in oracle/graaljs#121 (comment). Sharing a single engine across all Contexts (as proposed in oracle/graaljs#121 (comment)) is not possible, because core expects a ScriptEngine.
  Signed-off-by: Florian Hotze <[email protected]>
* [jsscripting] Update JavaDoc
  Signed-off-by: Florian Hotze <[email protected]>
* [jsscripting] Close `GraalJSScriptEngine` when `OpenhabGraalJSScriptEngine` is closed
  My breakpoint inside the close method of GraalJSScriptEngine did not trigger until this change was made.
  Signed-off-by: Florian Hotze <[email protected]>
* [jsscripting] Share org.graalvm.polyglot.Engine across all OpenhabGraalJSScriptEngine instances
  See oracle/graaljs#121 (comment); it is not required to have one engine per GraalJSScriptEngine. This might improve performance a bit on less powerful systems (Raspberry Pi) and decreases heap usage: with 5 GraalJS UI scripts, heap usage is now below 100 MB. Before this change, it was over 100 MB.
* [jsscripting] Extend debug logging
* [jsscripting] Cache `@jsscripting-globals.js` across all engines
  Signed-off-by: Florian Hotze <[email protected]>
Hello.
I'm building a FaaS-like service for JS scripts with a limited API, and while prototyping different approaches with Graal Polyglot I found memory-leak issues.
Approaches that I've tried to implement:
Any other invocation just executes the existing member and nothing more
After that, I get the function member from the Context and execute it (as in the previous approach)
Only the third approach has no memory leaks, because the GC will reclaim the objects from the heap.
The first and second approaches leak memory. It looks like the Engine has issues when executing some code and generates a lot of data for each invocation that is never removed from the Engine.
I have a prototype that reproduces the first approach:
https://github.com/mdsina/graaljs-executor-web-service
I also wrote load tests that simulate real-world invocations of JS code:
https://github.com/mdsina/graaljs-executor-web-service-load-tests/tree/master
The service was executed with these VM options:
It looks like references to ContextJS are never removed.