[bug] Top-level text nodes returned by replace don't correspond to those inserted in the document. #3567

@kivikakk

This behaviour is a little unexpected to me, and I'm not sure if it constitutes a bug or intended behaviour; I had a look for existing issues to no avail.

Please describe the bug

When using Nokogiri::XML::Node#replace, if text nodes are inserted as replacements at the "top level" (i.e. the replacement nodeset contains one or more text nodes directly, not within elements), the nodes returned by #replace don't correspond to the nodes actually inserted into the document. This is unlike elements, or children of those elements (of whichever kind), which are returned as the same nodes that end up in the document.

Help us reproduce what you're seeing

#!/usr/bin/env ruby
# frozen_string_literal: true

require 'nokogiri'
require 'minitest/autorun'

class Test < Minitest::Spec
  before do
    @doc = Nokogiri::HTML.fragment(<<~HTML)
      Document with text <strong>and element</strong>.
    HTML

    @strong = @doc.css('strong').first
  end

  it 'replaced nodes are present in the new document' do
    @strong.replace('and <em>element</em>') => [new_text, new_html]

    # You can replace the above line with these two; the outcome is the same:
    # replacement = @doc.fragment('and <em>element</em>')
    # @strong.replace(replacement) => [new_text, new_html]

    assert_equal 'and ', new_text.to_html
    assert_equal '<em>element</em>', new_html.to_html

    assert @doc.children.include?(new_html)

    # Fails: they are different nodes.
    assert @doc.children.include?(new_text)
  end

  it 'replaced nodes can be used to manipulate the document' do
    @strong.replace('and <em>element</em>') => [new_text, new_html]
    new_text.remove
    new_html.remove

    # Fails: @doc.to_html is "Document with text and .\n"
    assert_equal "Document with text .\n", @doc.to_html
  end
end

Expected behavior

I'd expect the text nodes returned by #replace to be the same nodes that were inserted into the document, as is already the case for elements.

Environment

# Nokogiri (1.18.10)
    ---
    warnings: []
    nokogiri:
      version: 1.18.10
      cppflags:
      - "-I/Users/kivikakk/.local/share/mise/installs/ruby/3.3.9/lib/ruby/gems/3.3.0/gems/nokogiri-1.18.10-arm64-darwin/ext/nokogiri"
      - "-I/Users/kivikakk/.local/share/mise/installs/ruby/3.3.9/lib/ruby/gems/3.3.0/gems/nokogiri-1.18.10-arm64-darwin/ext/nokogiri/include"
      - "-I/Users/kivikakk/.local/share/mise/installs/ruby/3.3.9/lib/ruby/gems/3.3.0/gems/nokogiri-1.18.10-arm64-darwin/ext/nokogiri/include/libxml2"
      ldflags: []
    ruby:
      version: 3.3.9
      platform: arm64-darwin24
      gem_platform: arm64-darwin-24
      description: ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
      engine: ruby
    libxml:
      source: packaged
      precompiled: true
      patches:
      - 0001-Remove-script-macro-support.patch
      - 0002-Update-entities-to-remove-handling-of-ssi.patch
      - '0009-allow-wildcard-namespaces.patch'
      - 0010-update-config.guess-and-config.sub-for-libxml2.patch
      - 0011-rip-out-libxml2-s-libc_single_threaded-support.patch
      - '0019-xpath-Use-separate-static-hash-table-for-standard-fu.patch'
      memory_management: ruby
      iconv_enabled: true
      compiled: 2.13.9
      loaded: 2.13.9
    libxslt:
      source: packaged
      precompiled: true
      patches:
      - 0001-update-config.guess-and-config.sub-for-libxslt.patch
      datetime_enabled: true
      compiled: 1.1.43
      loaded: 1.1.43
    other_libraries:
      zlib: 1.3.1
      libiconv: '1.18'
      libgumbo: 1.0.0-nokogiri

Also reproduced on 3b95a51 (current HEAD):

# Nokogiri (1.19.0.dev)
    ---
    warnings: []
    nokogiri:
      version: 1.19.0.dev
      cppflags:
      - "-I/Volumes/g/sscce/nokogiri/ext/nokogiri"
      - "-I/Volumes/g/sscce/nokogiri/ext/nokogiri/include"
      - "-I/Volumes/g/sscce/nokogiri/ext/nokogiri/include/libxml2"
      ldflags: []
    ruby:
      version: 3.3.9
      platform: arm64-darwin24
      gem_platform: arm64-darwin-24
      description: ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
      engine: ruby
    libxml:
      source: packaged
      precompiled: false
      patches:
      - 0001-Remove-script-macro-support.patch
      - 0002-Update-entities-to-remove-handling-of-ssi.patch
      - '0009-allow-wildcard-namespaces.patch'
      - 0010-update-config.guess-and-config.sub-for-libxml2.patch
      memory_management: ruby
      iconv_enabled: true
      compiled: 2.14.6
      loaded: 2.14.6
    libxslt:
      source: packaged
      precompiled: false
      patches:
      - 0001-update-config.guess-and-config.sub-for-libxslt.patch
      datetime_enabled: true
      compiled: 1.1.43
      loaded: 1.1.43
    other_libraries:
      libgumbo: 1.0.0-nokogiri

Additional context

I'll try playing around with #replace and see if I can get a fix going.

Additionally, while the inclusion of '<em>element</em>' / new_html was intended to help contrast the behaviours, it's notable that dropping the element and replacing with only text doesn't change things.
