Description
When extracting data from a large array that contains maps, heap memory in Java keeps growing. This might be caused by a memory leak.
When running the same script with the same data, garbage collection appears to reclaim the used memory. However, each time the code or the data changes, the heap memory usage increases and does not return to the previous level.
The following graph was generated using real data (https://zenodo.org/records/10482057) and comes from `jconsole`; using generated data (see below), the graph is similar.
Expected behaviour
The heap memory usage should return to a lower level after garbage collection. It should not increase permanently after a change in code or data.
To reproduce
The following script is a much simplified version of a script that gets data out of a (large) array containing maps.
In the original script, the data comes from a JSON file, but I get the same results when generating the data in the script:
```xquery
let $doc as item() := array
  { for $i in 1 to 500000
    return map
    { 'id' : 'id'||$i
    , 'status' : if ($i mod 100 = 0) then 'inactive'
                 else if ($i mod 80 = 1) then 'withdrawn'
                 else 'active'
    , 'relationships' : array { map
      { 'type' : 'Related'
      , 'id' : 'id'||($i+1)
      } }
    }
  }
let $doc-size as xs:integer := array:size($doc)
let $ids :=
  for $doc-index in 1 to $doc-size
  let $item as map(*) := $doc($doc-index)
  (: let $status := $item?status :)
  (: let $relationships as array(*)? := $item?relationships :)
  where $item?status = ('withdrawn','inactive') and exists($item?relationships)
  return $item?id
return count($ids)
```
At first, I thought that the memory leak (if that is what this is) was in the loop variables `$status` and `$relationships`, but that seems not to be the case: the problem remains after commenting them out, as shown above.
The second graph above was generated by running this script a few times, then changing 500000 in `for $i in 1 to 500000` to 500001, running a few times, changing to 500002, running a few times, etcetera.
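If it would help to separate the effect of changing the query text from the effect of changing the data, a variant like the following could be tried; this is a minimal sketch (not part of the original script), assuming the client used to run the query can bind external variables:

```xquery
(: Variant with the upper bound passed as an external variable: the query
   text stays identical across runs, so only the data changes between
   executions, not the compiled form of the query. :)
declare variable $count as xs:integer external;

let $doc as item() := array
  { for $i in 1 to $count
    return map { 'id' : 'id'||$i, 'status' : 'active' }
  }
return array:size($doc)
```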
Context
eXist-db: eXist-6.2.0
JVM: OpenJDK 64-Bit Server VM version 11.0.14.1+1
OS: Windows 10
eXist is run with the launcher (not as a service, although that appears to have the same problem), with `memory.max=8192`.
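Assuming the 6.x launcher reads its memory settings from `etc/launcher.properties` (an assumption about the installation layout, not verified here), that setting corresponds to:

```properties
# etc/launcher.properties (assumed location of the launcher settings)
memory.min=512
memory.max=8192
```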
More details
I used VisualVM to analyze a heap dump, to get an idea of what takes up all the space in the heap. This suggests that there is a lot in the cache. However, `cache:clear()` does not change the used heap space.
I am not sure if this gives an indication of what is going on.
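For reference, `cache:clear()` belongs to the Cache XQuery Extension Module, which operates on named, user-created caches; a minimal sketch of its use, assuming the eXist 5.x+ signatures of the `cache:*` functions:

```xquery
(: create a named cache, store and read back a value, then clear it :)
let $created := cache:create('example', map { 'maximumSize': 1000 })
let $stored  := cache:put('example', 'key1', 'value1')
return (cache:get('example', 'key1'), cache:clear('example'))
```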
@nverwer The cache that your traces are showing is that of compiled XQuery Modules (and not the Cache XQuery Extension Module that is available via the `cache:*` functions). Because compilation is time-intensive, when eXist-db compiles and executes a Module, it resets (clears) the state of the Module after execution and stores it in a Caffeine Cache. The next time the same query is executed, it is borrowed from the cache instead of being recompiled.
It looks like the reset of the module is perhaps not resetting some expressions that have accumulated state. We have seen this several times in the past for complex expressions. I previously fixed a number of issues with Maps and Arrays in this area. Could you check whether I have already fixed this in main by building a 7.0.0-SNAPSHOT? If not, it is possibly another bug in this area that needs to be addressed.
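One possible way to exercise that cache repeatedly without editing a script by hand, under the assumption (not verified here) that queries compiled via `util:eval` enter the same compiled-module cache, would be:

```xquery
(: evaluates 100 textually distinct queries; each should be compiled
   separately, so if compiled modules are cached but not fully reset,
   heap usage would be expected to grow with the number of variants :)
for $i in 1 to 100
return util:eval('count((1 to ' || string(500000 + $i) || '))')
```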
@adamretter Thank you for your response. I compiled the latest 7.0.0-SNAPSHOT and ran the script as shown above.
Unfortunately, heap space usage keeps increasing as I change 500000 into 500001, 500002, and so on.
It looks like this problem is still there. Although I am beginning to understand some of the Java code for eXist, I am afraid I cannot be of much help here.