Bump zenoh to 1.8.0 - 2nd attempt (backport #964)#965
Merged
* chore(zenoh_cpp_vendor): bump to latest zenoh-c and zenoh-cpp
  - zenoh-c main: 102df1a3 (2026-04-10)
  - zenoh-c ROS/rust-1.75: 0193595c (2026-04-07)
  - zenoh-cpp main: af381b42 (2026-04-10)
* fix: close session explicitly in shutdown() to prevent hang on Windows

  zenoh commit e5db0ce changed session.close() to call wait_callbacks(), which blocks until all in-flight callbacks finish. With the older teardown order, session_.reset() was called while node-level entities (publishers, subscriptions, etc.) still held shared_ptr<Session> refs, so the session wasn't actually destroyed until ~Data() called nodes_.clear(). At that point wait_callbacks() would deadlock against callbacks being concurrently destroyed on Windows.

  Fix: call session_->close() explicitly in shutdown() before session_.reset(). At shutdown time the spin loop has already exited, so no callbacks are in-flight and wait_callbacks() returns immediately. The session is then marked closed; when the shared_ptr refcount eventually drops to zero during normal rcl teardown, the session destructor finds is_closed()==true and skips the blocking close().
* chore(zenoh_cpp_vendor): restore get_cargo_version.cmake from #945

  Extract cargo version detection into a reusable CMake function instead of inlining execute_process, matching the approach from PR #945.
* fix: disable ANSI color codes in Zenoh log output (#951)

  Set RUST_LOG_STYLE=never before initializing the Zenoh logger so that color escape sequences do not leak into captured command output. This fixes YAML parsing failures in ros2param tests where the ESC character was treated as an unacceptable character. The env var is set with overwrite=0 so callers can still override it.
* Use zenoh-c commits for Zenoh 1.8.0 + #2493
* Fix synchronization due to changes in undeclare in zenoh 1.8.0

  This commit re-applies the changes made in #935, while keeping the explicit call to session_->close() added in rmw_context_impl_s::shutdown()
* Use zenoh 2687c5135

  eclipse-zenoh/zenoh@2687c51 from branch https://github.com/eclipse-zenoh/zenoh/tree/suppress-admin-err-message-on-session-close, which is based on 1.8.0 plus a few fixes, including removal of an error log at closure that caused a ros2cli test to fail
* revert disable ANSI color codes in Zenoh log output

---------

Co-authored-by: Julien Enoch <julien.e@zettascale.tech>
(cherry picked from commit ba1ab30)
Contributor
|
Pulls: #965 |
JEnoch
approved these changes
Apr 13, 2026
Contributor
|
CI failures on Linux-rhel are related to |
Summary
* Close the session explicitly in shutdown() before releasing the shared_ptr reference
* Bump zenoh_cpp_vendor to latest zenoh-c and zenoh-cpp, restoring get_cargo_version.cmake from "Build against rust >= 1.75 for ROS Lyrical" #945

Key Changes
* rmw_context_impl_s.cpp: call session_->close() before session_.reset() in Data::shutdown()
* zenoh_cpp_vendor/CMakeLists.txt: update to zenoh-c commit from ROS/zenoh-2687c5135 branch; add fallback to rust-1.75-zenoh-2687c5135 branch for Rust < 1.88
* zenoh_cpp_vendor/get_cargo_version.cmake: restored from "Build against rust >= 1.75 for ROS Lyrical" #945

Root Cause (hang)
eclipse-zenoh/zenoh@e5db0ce changed Session::close() to call wait_callbacks() internally, blocking until all in-flight callbacks finish. The old teardown order let session_.reset() run while rmw entities (nodes, subscriptions) still held shared_ptr references. The session was only destroyed later inside ~Data() during nodes_.clear(), at which point callback handlers were being torn down simultaneously, causing a deadlock or STATUS_STACK_BUFFER_OVERRUN on Windows.

The fix calls session_->close() explicitly in shutdown(), at which point rclcpp::shutdown() has already exited the spin loop so no callbacks are in-flight. wait_callbacks() returns immediately, and the subsequent destructor path finds is_closed() == true and skips the blocking call.

Root Cause (ANSI codes, #951)
Zenoh 1.8.0 emits a new error log at Session shutdown, when a TCP link is closed at the same time and it fails to send an event to an already removed callback.
The Rust logger (env_logger) emits ANSI color escape sequences by default. These bled into captured output from ros2 param commands, causing yaml.reader.ReaderError when the output was parsed as YAML.
ros2topic.ros2topic.test.test_cli.test_cli also parses the test output and fails on the new error log.
The fix is in Zenoh (commit eclipse-zenoh/zenoh@2687c51), which removes those logs.
This PR makes rmw_zenoh use this commit.
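
Returning to the hang fix described under Root Cause (hang), the teardown order can be sketched as below. This is a minimal illustration, not the rmw_zenoh code: MockSession, MockEntity, MockContext and run_teardown are hypothetical stand-ins, and MockSession is not the zenoh-cpp Session API. The sketch only assumes what the description states: close() may block, the destructor calls close() unless the session is already closed, and entities keep their own shared_ptr to the session.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a Zenoh session (NOT the zenoh-cpp API).
// It records what happens to it in a log so the order can be inspected.
class MockSession {
public:
  explicit MockSession(std::vector<std::string> & log) : log_(log) {}

  ~MockSession() {
    // Mirrors the behavior described above: only close if not closed yet.
    if (!closed_) {
      close();
    }
    log_.push_back("destroyed");
  }

  // In the real session this is where wait_callbacks() could block.
  void close() {
    log_.push_back("close");
    closed_ = true;
  }

  bool is_closed() const { return closed_; }

private:
  std::vector<std::string> & log_;
  bool closed_ = false;
};

// Hypothetical entity (publisher, subscription, ...) sharing the session.
struct MockEntity {
  std::shared_ptr<MockSession> session;
};

// Hypothetical context that owns the primary reference to the session.
class MockContext {
public:
  explicit MockContext(std::vector<std::string> & log)
  : session_(std::make_shared<MockSession>(log)) {}

  std::shared_ptr<MockSession> session() const { return session_; }

  void shutdown() {
    if (session_) {
      session_->close();  // close while no callbacks are in flight
      session_.reset();   // drop our reference; entities may still hold theirs
    }
  }

private:
  std::shared_ptr<MockSession> session_;
};

// Runs one teardown and returns the recorded sequence of events.
std::vector<std::string> run_teardown() {
  std::vector<std::string> log;
  {
    MockContext context(log);
    MockEntity publisher{context.session()};
    context.shutdown();
    log.push_back("shutdown returned");
    // publisher goes out of scope here and releases the last reference.
  }
  return log;
}
```

In this sketch the session is closed inside shutdown() even though the entity still holds a reference, so when the last reference is released later the destructor finds the session already closed and does not call close() again.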
Related
get_cargo_version.cmake)

Breaking Changes
None
Did you use Generative AI?
Yes. Claude (claude-sonnet-4-6) via Claude Code was used to assist with root cause analysis, reproducing the bug on Windows, and creating an initial prototype of the changes in this PR.
This is an automatic backport of pull request #964 done by Mergify.