[bug] Document Fragment #clone does not behave like #dup with respect to the owning document #2908

@flavorjones

Please describe the bug

While working on upgrading Action Text to HTML5, some tests were failing that I traced back to this repro:

require "minitest/autorun"
require "nokogiri"

class Minitest::Spec
  describe "HTML4" do
    let(:document) { Nokogiri::HTML4::Document.parse("<div>doc</div>") }
    let(:fragment) { document.fragment("<div>frag</div>") }

    it "#dup makes a copy of the fragment with a new document" do
      dup = fragment.dup
      refute_nil(dup.document)
      refute_same(dup.document, fragment.document)
    end

    it "#clone makes a copy of the fragment with a new document" do
      clone = fragment.clone
      refute_nil(clone.document)
      refute_same(clone.document, fragment.document) ######## this fails ########
    end
  end

  describe "HTML5" do
    let(:document) { Nokogiri::HTML5::Document.parse("<div>doc</div>") }
    let(:fragment) { document.fragment("<div>frag</div>") }

    it "#dup makes a copy of the fragment with a new document" do
      dup = fragment.dup
      refute_nil(dup.document)
      refute_same(dup.document, fragment.document)
    end

    it "#clone makes a copy of the fragment with a new document" do
      clone = fragment.clone
      refute_nil(clone.document) ######## this fails ########
      refute_same(clone.document, fragment.document)
    end
  end
end

This fails with:

  1) Failure:
HTML4#test_0002_#clone makes a copy of the fragment with a new document [../rails/nokogiri-clone-bug.rb:54]:
Expected #<Nokogiri::HTML4::Document:0x67c name="document" children=[#<Nokogiri::XML::DTD:0x618 name="html">, #<Nokogiri::XML::Element:0x668 name="html" children=[#<Nokogiri::XML::Element:0x654 name="body" children=[#<Nokogiri::XML::Element:0x640 name="div" children=[#<Nokogiri::XML::Text:0x62c "doc">]>]>]>]> (oid=1660) to not be the same as #<Nokogiri::HTML4::Document:0x67c name="document" children=[#<Nokogiri::XML::DTD:0x618 name="html">, #<Nokogiri::XML::Element:0x668 name="html" children=[#<Nokogiri::XML::Element:0x654 name="body" children=[#<Nokogiri::XML::Element:0x640 name="div" children=[#<Nokogiri::XML::Text:0x62c "doc">]>]>]>]> (oid=1660).

  2) Failure:
HTML5#test_0002_#clone makes a copy of the fragment with a new document [../rails/nokogiri-clone-bug.rb:70]:
Expected nil to not be nil.

Specifically:

  • HTML4::DocumentFragment#clone re-uses the original's Document
  • HTML5::DocumentFragment#clone does not set a document at all!

Expected behavior

The desired behavior is for #clone to match #dup for nodes and node-like things: make a copy of the parent document, to avoid accumulating nodes that cannot be freed until the parent document is freed (see #1063 and #1834).
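
To make the expected ownership semantics concrete, here is a minimal plain-Ruby sketch. ToyDocument and ToyFragment are hypothetical stand-ins invented for illustration; this is not Nokogiri's implementation, and whether the real fix would use this hook is an assumption. It relies only on the fact that Ruby calls initialize_copy for both #dup and #clone, so a copy hook defined there gives both methods the same behavior with respect to the owning document.

```ruby
# Toy model only: ToyDocument and ToyFragment are hypothetical stand-ins,
# not Nokogiri classes.
class ToyDocument
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class ToyFragment
  attr_reader :document

  def initialize(document)
    @document = document
  end

  # Ruby invokes initialize_copy on the new object for both #dup and #clone,
  # so copying the owning document here covers both methods.
  def initialize_copy(original)
    super
    @document = original.document.dup
  end
end

fragment = ToyFragment.new(ToyDocument.new("doc"))
dup_copy = fragment.dup
clone_copy = fragment.clone

puts dup_copy.document.equal?(fragment.document)   # false: #dup got a new document
puts clone_copy.document.equal?(fragment.document) # false: #clone got a new document
puts clone_copy.document.nil?                      # false: the copy has a document
```

In this toy model both copies end up with their own document, which is the property the two failing tests above assert for #clone.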

Please also note, while we're in here, that there are expected differences between #dup and #clone that we're also not honoring; see #316 for a description of those problems (essentially, the singleton class). If we tackle this issue, we should tackle that one as well.
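
For reference, the standard Ruby differences between #dup and #clone can be seen on a plain object. This sketch uses no Nokogiri at all; `node` is just an Object standing in for a node, and `label` is a hypothetical singleton method added for illustration.

```ruby
# Plain-Ruby illustration of how Object#dup and Object#clone differ.
node = Object.new

# A singleton method lives in the object's singleton class.
def node.label
  "custom"
end
node.freeze

node_dup = node.dup
node_clone = node.clone

puts node_dup.respond_to?(:label)    # false: #dup does not copy the singleton class
puts node_clone.respond_to?(:label)  # true:  #clone copies the singleton class
puts node_dup.frozen?                # false: #dup returns an unfrozen copy
puts node_clone.frozen?              # true:  #clone preserves the frozen state
```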
