[bug] Memory is leaked when raising an exception from a custom XPath function, or from some SAX handler callbacks #2097

@flavorjones

Description

Please describe the bug

If I'm using a custom XPath function and I raise an exception within my handler, memory is leaked.

Help us reproduce what you're seeing

Run this script and watch the process's memory utilization grow:

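#! /usr/bin/env ruby

require 'nokogiri'

loop do
  doc = Nokogiri::XML.parse("<foo></foo>")
  begin
    # The handler's custom XPath function raises as soon as libxml2 calls it.
    doc.xpath('//foo[exceptional()]', Class.new {
                def exceptional
                  raise "ONOES"
                end
              }.new)
    exit 1 # should never be reached
  rescue => e
    # Each rescued exception corresponds to one leaked allocation.
    puts e
  end
end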
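
To put numbers on the growth rather than eyeballing it, something like the following could be used. This is only a sketch: `rss_kb` and `report_rss` are hypothetical helper names (not part of Nokogiri), and the approach assumes either a Linux-style `/proc/self/status` or a `ps` binary that accepts `-o rss=`.

```ruby
# Hypothetical helper (not part of Nokogiri): returns the current process's
# resident set size in kilobytes, or nil if it cannot be determined.
def rss_kb
  status = "/proc/self/status"
  if File.readable?(status)
    line = File.foreach(status).find { |l| l.start_with?("VmRSS:") }
    return line[/\d+/].to_i if line
  end
  # Fallback for systems without /proc; assumes `ps` supports `-o rss=`.
  out = `ps -o rss= -p #{Process.pid}`
  out.strip.empty? ? nil : out.to_i
rescue SystemCallError
  # e.g. `ps` is not installed
  nil
end

# Hypothetical helper: runs the block `iterations` times and samples RSS
# after every `every` iterations. Returns the collected samples.
def report_rss(iterations:, every:)
  samples = []
  iterations.times do |i|
    yield
    next unless ((i + 1) % every).zero?

    samples << rss_kb
    puts "iteration #{i + 1}: rss=#{samples.last.inspect} kB"
  end
  samples
end
```

Replacing the `loop do` in the script with, say, `report_rss(iterations: 100_000, every: 10_000) do` should show whether the reported RSS keeps climbing.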

Expected behavior

Memory shouldn't be leaking.

Environment

# Nokogiri (1.10.10)
    ---
    warnings: []
    nokogiri: 1.10.10
    ruby:
      version: 2.7.0
      platform: x86_64-linux
      description: ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-linux]
      engine: ruby
    libxml:
      binding: extension
      source: packaged
      libxml2_path: "/home/flavorjones/.rvm/gems/ruby-2.7.0/gems/nokogiri-1.10.10/ports/x86_64-pc-linux-gnu/libxml2/2.9.10"
      libxslt_path: "/home/flavorjones/.rvm/gems/ruby-2.7.0/gems/nokogiri-1.10.10/ports/x86_64-pc-linux-gnu/libxslt/1.1.34"
      libxml2_patches:
      - 0001-Revert-Do-not-URI-escape-in-server-side-includes.patch
      - 0002-Remove-script-macro-support.patch
      - 0003-Update-entities-to-remove-handling-of-ssi.patch
      - 0004-libxml2.la-is-in-top_builddir.patch
      - 0005-Fix-infinite-loop-in-xmlStringLenDecodeEntities.patch
      libxslt_patches: []
      compiled: 2.9.10
      loaded: 2.9.10

Additional context

I discovered this while working on #1610. I haven't yet looked into whether libxml2 is leaking this memory or whether it's something in Ruby-space, but I suspect libxml2 can't clean up the XPath context because we've longjmp'ed over it.

If that's the case, we may need a more invasive change that wraps the calls back into Ruby space in a rescue block.
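
To illustrate the idea in Ruby-space only (the real fix would presumably live in the C extension), here is a rough sketch of a wrapper that rescues inside each callback so the exception never unwinds through libxml2. `RescuingHandler` is a hypothetical name, not a Nokogiri class. The sketch assumes Nokogiri finds handler functions via `respond_to?`, and that returning `false` is an acceptable stand-in result for the function that failed; I haven't verified that this actually avoids the leak.

```ruby
# Hypothetical wrapper (not part of Nokogiri). It exposes the same public
# methods as the wrapped handler, but rescues any StandardError raised
# inside them so the exception can be re-raised after xpath() has returned.
class RescuingHandler
  # First exception raised by any wrapped method, or nil.
  attr_reader :captured

  def initialize(handler)
    @handler = handler
    @captured = nil

    # Only wrap the methods the handler itself adds on top of Object.
    (handler.public_methods - Object.instance_methods).each do |name|
      define_singleton_method(name) do |*args|
        begin
          @handler.public_send(name, *args)
        rescue StandardError => e
          @captured ||= e
          false # assumption: a falsy result is an acceptable placeholder
        end
      end
    end
  end
end
```

A caller would then pass `RescuingHandler.new(handler)` as the second argument to `doc.xpath` and, once the call has returned, re-raise `captured` if it is set. The results of a query in which a function failed should be discarded, since the placeholder return value may have changed what matched.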

I'd also be curious to see whether the following functionality is susceptible to the same class of issue (raising an exception from a Ruby callback):

  • SAX parsing callbacks
  • XSLT registered transformation functions
