
HTMLTriplifier OOM happening #530

Open
@enridaga


This happened a few times while scraping a website; I'm not sure how to reproduce it:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Arrays.java:3537)
	at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:228)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:802)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:246)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:91)
	at org.jsoup.nodes.Entities.escape(Entities.java:245)
	at org.jsoup.nodes.Attribute.htmlNoValidate(Attribute.java:171)
	at org.jsoup.nodes.Attributes.html(Attributes.java:474)
	at org.jsoup.nodes.Element.outerHtmlHead(Element.java:1778)
	at org.jsoup.nodes.Node$OuterHtmlVisitor.head(Node.java:950)
	at org.jsoup.select.NodeTraversor.traverse(NodeTraversor.java:34)
	at org.jsoup.nodes.Node.outerHtml(Node.java:769)
	at org.jsoup.nodes.Node.outerHtml(Node.java:764)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:310)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.populate(HTMLTriplifier.java:314)
	at io.github.sparqlanything.html.HTMLTriplifier.triplify(HTMLTriplifier.java:219)
	at io.github.sparqlanything.engine.DatasetGraphCreator.triplify(DatasetGraphCreator.java:140)
	at io.github.sparqlanything.engine.DatasetGraphCreator.getDatasetGraph(DatasetGraphCreator.java:65)
	at io.github.sparqlanything.engine.FXWorker.execute(FXWorker.java:77)
	at io.github.sparqlanything.engine.FacadeXOpExecutor.execute(FacadeXOpExecutor.java:114)
	at org.apache.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:187)
	at org.apache.jena.sparql.algebra.op.OpService.visit(OpService.java:54)
	at org.apache.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:43)

The heap was set to 4 GB (-Xmx4g).
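
Since there is no known reproduction yet, here is a minimal sketch of a fixture generator that might help trigger the failure. It rests on an assumption the trace suggests but does not confirm: that `populate` calls `outerHtml()` on each element while recursing, so a very large attribute value (for example an inline `data:` URI) is re-serialised and re-escaped once per enclosing element. The class name `NestedHtmlFixture`, its methods, and the default sizes are hypothetical and not part of SPARQL Anything or jsoup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: generates an HTML page with nested <div>s around one
// element that carries a very large attribute value.
public class NestedHtmlFixture {

    // Builds `depth` nested <div> elements; the innermost contains an <img>
    // whose src attribute is `attributeLength` characters of payload.
    public static String build(int depth, int attributeLength) {
        StringBuilder sb = new StringBuilder("<html><body>");
        for (int i = 0; i < depth; i++) {
            sb.append("<div>");
        }
        sb.append("<img alt=\"x\" src=\"data:,")
          .append("A".repeat(attributeLength))
          .append("\">");
        for (int i = 0; i < depth; i++) {
            sb.append("</div>");
        }
        sb.append("</body></html>");
        return sb.toString();
    }

    // Lower bound on the characters serialised if every enclosing <div>
    // serialises its whole subtree (assumption, see the lead-in).
    public static long estimatedSerializedChars(int depth, int attributeLength) {
        return (long) depth * attributeLength;
    }

    public static void main(String[] args) throws IOException {
        // Defaults are guesses: 11 matches the number of populate frames in
        // the trace; the attribute size is arbitrary.
        int depth = args.length > 0 ? Integer.parseInt(args[0]) : 11;
        int attributeLength = args.length > 1 ? Integer.parseInt(args[1]) : 10_000_000;

        Path out = Files.createTempFile("nested-fixture", ".html");
        Files.writeString(out, build(depth, attributeLength));

        System.out.println("Wrote " + out);
        System.out.println("Estimated serialised characters (lower bound): "
                + estimatedSerializedChars(depth, attributeLength));
    }
}
```

The generated file can then be used as the location of a query run with the same heap setting. If the OOM does not appear even with much larger sizes, the assumption above is probably wrong and the cause lies elsewhere.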


Labels: Bug (Something isn't working)
