Description
This behaviour is a little unexpected to me, and I'm not sure if it constitutes a bug or intended behaviour; I had a look for existing issues to no avail.
Please describe the bug
When using Nokogiri::XML::Node#replace, if text nodes are inserted as replacements at the "top-level" (i.e. the replacement nodeset contains one or more text nodes directly, not within elements), those nodes as returned by #replace don't correspond to the nodes actually inserted into the document. This is unlike elements, or children of those elements (of whichever kind).
Help us reproduce what you're seeing
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'nokogiri'
require 'minitest/autorun'

class Test < Minitest::Spec
  before do
    @doc = Nokogiri::HTML.fragment(<<~HTML)
      Document with text <strong>and element</strong>.
    HTML
    @strong = @doc.css('strong').first
  end

  it 'replaced nodes are present in the new document' do
    @strong.replace('and <em>element</em>') => [new_text, new_html]
    # You can replace the above line with these two; the outcome is the same:
    # replacement = @doc.fragment('and <em>element</em>')
    # @strong.replace(replacement) => [new_text, new_html]
    assert_equal 'and ', new_text.to_html
    assert_equal '<em>element</em>', new_html.to_html
    assert @doc.children.include?(new_html)
    # Fails: they are different nodes.
    assert @doc.children.include?(new_text)
  end

  it 'replaced nodes can be used to manipulate the document' do
    @strong.replace('and <em>element</em>') => [new_text, new_html]
    new_text.remove
    new_html.remove
    # Fails: @doc.to_html is "Document with text and .\n"
    assert_equal "Document with text .\n", @doc.to_html
  end
end
Expected behavior
I'd expect the nodes returned by #replace to be the same nodes that were inserted into the document.
Environment
# Nokogiri (1.18.10)
---
warnings: []
nokogiri:
  version: 1.18.10
  cppflags:
  - "-I/Users/kivikakk/.local/share/mise/installs/ruby/3.3.9/lib/ruby/gems/3.3.0/gems/nokogiri-1.18.10-arm64-darwin/ext/nokogiri"
  - "-I/Users/kivikakk/.local/share/mise/installs/ruby/3.3.9/lib/ruby/gems/3.3.0/gems/nokogiri-1.18.10-arm64-darwin/ext/nokogiri/include"
  - "-I/Users/kivikakk/.local/share/mise/installs/ruby/3.3.9/lib/ruby/gems/3.3.0/gems/nokogiri-1.18.10-arm64-darwin/ext/nokogiri/include/libxml2"
  ldflags: []
ruby:
  version: 3.3.9
  platform: arm64-darwin24
  gem_platform: arm64-darwin-24
  description: ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
  engine: ruby
libxml:
  source: packaged
  precompiled: true
  patches:
  - 0001-Remove-script-macro-support.patch
  - 0002-Update-entities-to-remove-handling-of-ssi.patch
  - '0009-allow-wildcard-namespaces.patch'
  - 0010-update-config.guess-and-config.sub-for-libxml2.patch
  - 0011-rip-out-libxml2-s-libc_single_threaded-support.patch
  - '0019-xpath-Use-separate-static-hash-table-for-standard-fu.patch'
  memory_management: ruby
  iconv_enabled: true
  compiled: 2.13.9
  loaded: 2.13.9
libxslt:
  source: packaged
  precompiled: true
  patches:
  - 0001-update-config.guess-and-config.sub-for-libxslt.patch
  datetime_enabled: true
  compiled: 1.1.43
  loaded: 1.1.43
other_libraries:
  zlib: 1.3.1
  libiconv: '1.18'
  libgumbo: 1.0.0-nokogiri
Also reproduced on 3b95a51 (current HEAD):
# Nokogiri (1.19.0.dev)
---
warnings: []
nokogiri:
  version: 1.19.0.dev
  cppflags:
  - "-I/Volumes/g/sscce/nokogiri/ext/nokogiri"
  - "-I/Volumes/g/sscce/nokogiri/ext/nokogiri/include"
  - "-I/Volumes/g/sscce/nokogiri/ext/nokogiri/include/libxml2"
  ldflags: []
ruby:
  version: 3.3.9
  platform: arm64-darwin24
  gem_platform: arm64-darwin-24
  description: ruby 3.3.9 (2025-07-24 revision f5c772fc7c) [arm64-darwin24]
  engine: ruby
libxml:
  source: packaged
  precompiled: false
  patches:
  - 0001-Remove-script-macro-support.patch
  - 0002-Update-entities-to-remove-handling-of-ssi.patch
  - '0009-allow-wildcard-namespaces.patch'
  - 0010-update-config.guess-and-config.sub-for-libxml2.patch
  memory_management: ruby
  iconv_enabled: true
  compiled: 2.14.6
  loaded: 2.14.6
libxslt:
  source: packaged
  precompiled: false
  patches:
  - 0001-update-config.guess-and-config.sub-for-libxslt.patch
  datetime_enabled: true
  compiled: 1.1.43
  loaded: 1.1.43
other_libraries:
  libgumbo: 1.0.0-nokogiri
Additional context
I'll try playing around with #replace and see if I can get a fix going.
Additionally, while the inclusion of `<em>element</em>` / `new_html` was intended to help contrast the behaviours, it's notable that removing the element and replacing with only text doesn't change things: the returned text node still isn't the one in the document.
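To spell out the symptom without involving Nokogiri at all, here's a toy model in plain Ruby. ToyText, ToyElement and ToyFragment are hypothetical classes made up for this sketch, and the "text nodes get copied on insertion" rule is purely an assumption for illustration, not a claim about how Nokogiri or libxml2 implement #replace. It only shows why a returned handle that looks right can still fail identity checks against the tree.

```ruby
# Toy model only: these classes are hypothetical and are NOT Nokogiri's API
# or implementation.
ToyText = Struct.new(:content)
ToyElement = Struct.new(:name)

class ToyFragment
  attr_reader :children

  def initialize(children)
    @children = children
  end

  # Replaces `old` with `replacements` and returns the caller's own objects.
  # Assumption for illustration: text nodes are copied on insertion, while
  # elements are inserted as-is.
  def replace(old, replacements)
    index = @children.index { |child| child.equal?(old) }
    inserted = replacements.map { |n| n.is_a?(ToyText) ? n.dup : n }
    @children[index, 1] = inserted
    replacements
  end

  # Identity-based membership: "is this exact object in the tree?"
  def holds?(node)
    @children.any? { |child| child.equal?(node) }
  end
end

strong = ToyElement.new('strong')
doc = ToyFragment.new([ToyText.new('Document with text '), strong, ToyText.new('.')])

new_text, new_html = doc.replace(strong, [ToyText.new('and '), ToyElement.new('em')])

puts doc.holds?(new_html) # true: the element handle is the object in the tree
puts doc.holds?(new_text) # false: the text handle is only a look-alike
```

In this model, anything done through new_text (such as removing it) can't affect the tree, which matches what the second test above observes.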