fix(yaml): Fix stack overflow in deepHashCode for self-referential lists#2362
fix(yaml): Fix stack overflow in deepHashCode for self-referential lists#2362kevmoo wants to merge 4 commits into
Conversation
kevmoo
commented
Apr 5, 2026
- Use deepHashCodeInner for iterables to preserve cycle detection state.
- Optimize cycle detection in deepHashCode by using a manual loop instead of .any() to avoid closure allocations.
- Add regression test for self-referential structures.
- Add a new benchmark for recursive structures.
- Use deepHashCodeInner for iterables to preserve cycle detection state. - Optimize cycle detection in deepHashCode by using a manual loop instead of .any() to avoid closure allocations. - Add regression test for self-referential structures. - Add a new benchmark for recursive structures.
Package publishing
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation. |
PR HealthChangelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. This check can be disabled by tagging the PR with
Coverage
|
| File | Coverage |
|---|---|
| pkgs/yaml/benchmark/recursive_benchmark.dart | 💔 Not covered |
| pkgs/yaml/lib/src/equality.dart | 💔 Not covered |
This check for test coverage is informational (issues shown here will not fail the PR).
This check can be disabled by tagging the PR with skip-coverage-check.
Unused Dependencies ✔️
| Package | Status |
|---|---|
| yaml | ✔️ All dependencies utilized correctly. |
For details on how to fix these, see dependency_validator.
This check can be disabled by tagging the PR with skip-unused-dependencies-check.
Breaking changes ✔️
| Package | Change | Current Version | New Version | Needed Version | Looking good? |
|---|---|---|---|---|---|
| yaml | None | 3.1.3 | 3.1.4-wip | 3.1.4-wip | ✔️ |
This check can be disabled by tagging the PR with skip-breaking-check.
API leaks ⚠️
The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
| Package | Leaked API symbol | Leaking sources |
|---|---|---|
| yaml | ErrorListener | yaml/yaml.dart::loadYaml::errorListener yaml/yaml.dart::loadYamlNode::errorListener yaml/yaml.dart::loadYamlDocument::errorListener |
| yaml | ScalarEvent | yaml_node.dart::YamlScalar::internal::scalar |
| yaml | EventType | event.dart::Event::type event.dart::EventType::streamStart event.dart::EventType::streamEnd event.dart::EventType::documentStart event.dart::EventType::documentEnd event.dart::EventType::alias event.dart::EventType::scalar event.dart::EventType::sequenceStart event.dart::EventType::sequenceEnd event.dart::EventType::mappingStart event.dart::EventType::mappingEnd event.dart::EventType::values event.dart::Event::new::type event.dart::ScalarEvent::type |
This check can be disabled by tagging the PR with skip-leaking-check.
License Headers ✔️
// Copyright (c) 2026, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
| Files |
|---|
| no missing headers |
All source files should start with a license header.
Unrelated files missing license headers
| Files |
|---|
| pkgs/bazel_worker/benchmark/benchmark.dart |
| pkgs/coverage/lib/src/coverage_options.dart |
| pkgs/html/example/main.dart |
| pkgs/html/lib/dom.dart |
| pkgs/html/lib/dom_parsing.dart |
| pkgs/html/lib/html_escape.dart |
| pkgs/html/lib/parser.dart |
| pkgs/html/lib/src/constants.dart |
| pkgs/html/lib/src/encoding_parser.dart |
| pkgs/html/lib/src/html_input_stream.dart |
| pkgs/html/lib/src/list_proxy.dart |
| pkgs/html/lib/src/query_selector.dart |
| pkgs/html/lib/src/token.dart |
| pkgs/html/lib/src/tokenizer.dart |
| pkgs/html/lib/src/treebuilder.dart |
| pkgs/html/lib/src/utils.dart |
| pkgs/html/test/dom_test.dart |
| pkgs/html/test/parser_feature_test.dart |
| pkgs/html/test/parser_test.dart |
| pkgs/html/test/query_selector_test.dart |
| pkgs/html/test/selectors/level1_baseline_test.dart |
| pkgs/html/test/selectors/level1_lib.dart |
| pkgs/html/test/selectors/selectors.dart |
| pkgs/html/test/support.dart |
| pkgs/html/test/tokenizer_test.dart |
| pkgs/html/test/trie_test.dart |
| pkgs/html/tool/generate_trie.dart |
| pkgs/pubspec_parse/test/git_uri_test.dart |
| pkgs/watcher/test/custom_watcher_factory_test.dart |
This check can be disabled by tagging the PR with skip-license-check.
There was a problem hiding this comment.
Code Review
This pull request fixes a stack overflow in deepHashCode caused by infinite recursion on self-referential lists and optimizes cycle detection by replacing closure-based checks with a manual loop. It also adds a new benchmark and a regression test. The review feedback suggests using a more idiomatic for-in loop for readability and improving the benchmark's precision by using Stopwatch and moving object allocations outside the measurement loop.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request fixes a stack overflow in deepHashCode by ensuring that recursive calls for Iterable types use the internal function that handles cycle detection. It also optimizes performance by replacing a closure-based check with a manual loop and adds a benchmark and regression tests for self-referential structures. A review comment suggests further optimizing performance by restricting cycle detection to collection types, avoiding unnecessary identity checks for scalar values.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request fixes a stack overflow in deepHashCode caused by self-referential lists and optimizes cycle detection by replacing closure-based checks with manual loops. It also introduces a new benchmark for recursive structures and improves test coverage for YAML directives. Feedback focuses on further reducing closure allocations in map hashing, ensuring the new list hashing algorithm is robust and consistent, and addressing a restrictive assertion that could cause crashes with non-list iterables.
| return equality.hash(value.keys.map(deepHashCodeInner)) ^ | ||
| equality.hash(value.values.map(deepHashCodeInner)); |
There was a problem hiding this comment.
The PR description mentions optimizing cycle detection by avoiding closure allocations. However, the Map implementation still uses .map(deepHashCodeInner), which creates two MappedIterable instances and two closures per map level. To fully achieve the optimization goal, consider using a manual loop for maps as well, although this may require a manual implementation of unordered hash combining.
| } else if (value is List) { | ||
| var hash = 0; | ||
| for (var i = 0; i < value.length; i++) { | ||
| hash = 31 * hash + deepHashCodeInner(value[i]); | ||
| } | ||
| return hash; |
There was a problem hiding this comment.
The hash algorithm for List has changed from the one used by IterableEquality (typically Jenkins) to a simple polynomial hash (31 * hash + next). While this is correct for structural equality, it is inconsistent with the Map implementation which still uses package:collection's hash logic. Additionally, for very large lists, the lack of masking (e.g., & 0x3FFFFFFF) might lead to performance degradation as Dart integers transition to arbitrary precision. Consider using a more robust combining function or masking the result to stay within SMI range.
| assert( | ||
| this is! Iterable || this is List, 'Iterables must be Lists in YAML'); |
There was a problem hiding this comment.
This assert is quite restrictive for a utility that is used by deepEqualsMap, which is effectively public (via src/ usage in benchmarks and potentially other tools). If a user provides a Set or a custom Iterable to deepHashCode or deepEquals, it will crash in debug mode. While YAML only supports sequences (Lists), these equality utilities are often used more broadly. Consider relaxing this assert or handling other Iterable types gracefully by falling back to their default hashCode or recursing if appropriate.
natebosch
left a comment
There was a problem hiding this comment.
What is this isCollection stuff accomplishing?
| for (var parent in parents) { | ||
| if (identical(parent, value)) return -1; | ||
| } |
There was a problem hiding this comment.
This is not a new problem, but the new structure makes it more obvious and also makes it easy to fix. I'm pretty sure this will have hash collisions between collections which have recursive collections in identical positions, regardless of whether the specific recursive collections are the same ones.
I think we can reduce hash collisions if we use the index of the collection in parents.
| for (var parent in parents) { | |
| if (identical(parent, value)) return -1; | |
| } | |
| for (var i = 0; i < parents.length; i++) { | |
| if (identical(parents[i], value)) return i; | |
| } |
Replaces #2362 Replace a nested call to `deepHashCode` with `deepHashCodeInner` to fix a stack overflow with self referential iterables.
Pulled from #2362 and refactored to test for exactly 1 warning. Add a test utility to check for warnings emitted while parsing and use it in two test cases that emit warnings.
|
Opened #2365 with the important bit from this PR. I don't know if the benchmark which doesn't use any benchmark framework is worth landing or not. The |
More of an optimization - I'll just close this out. |
Pulled from #2362 and refactored to test for exactly 1 warning. Add a test utility to check for warnings emitted while parsing and use it in two test cases that emit warnings.