Skip to content

fix(yaml): Fix stack overflow in deepHashCode for self-referential lists#2362

Closed
kevmoo wants to merge 4 commits into
mainfrom
yaml_cleanup
Closed

fix(yaml): Fix stack overflow in deepHashCode for self-referential lists#2362
kevmoo wants to merge 4 commits into
mainfrom
yaml_cleanup

Conversation

@kevmoo
Copy link
Copy Markdown
Member

@kevmoo kevmoo commented Apr 5, 2026

  • Use deepHashCodeInner for iterables to preserve cycle detection state.
  • Optimize cycle detection in deepHashCode by using a manual loop instead of .any() to avoid closure allocations.
  • Add regression test for self-referential structures.
  • Add a new benchmark for recursive structures.

- Use deepHashCodeInner for iterables to preserve cycle detection state.
- Optimize cycle detection in deepHashCode by using a manual loop instead of .any() to avoid closure allocations.
- Add regression test for self-referential structures.
- Add a new benchmark for recursive structures.
@kevmoo kevmoo requested a review from a team as a code owner April 5, 2026 18:59
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 5, 2026

Package publishing

Package Version Status Publish tag (post-merge)
package:bazel_worker 1.1.5 already published at pub.dev
package:benchmark_harness 2.4.0 already published at pub.dev
package:boolean_selector 2.1.2 already published at pub.dev
package:browser_launcher 1.2.0-wip WIP (no publish necessary)
package:cli_config 0.2.1-wip WIP (no publish necessary)
package:cli_util 0.5.0-wip WIP (no publish necessary)
package:clock 1.1.3-wip WIP (no publish necessary)
package:code_builder 4.11.2-wip WIP (no publish necessary)
package:coverage 1.15.0 already published at pub.dev
package:csslib 1.0.2 already published at pub.dev
package:extension_discovery 2.1.0 already published at pub.dev
package:file 7.0.2-wip WIP (no publish necessary)
package:file_testing 3.1.0-wip WIP (no publish necessary)
package:glob 2.1.3 already published at pub.dev
package:graphs 2.4.0-wip WIP (no publish necessary)
package:html 0.15.7-wip WIP (no publish necessary)
package:io 1.1.0-wip WIP (no publish necessary)
package:json_rpc_2 4.1.0 already published at pub.dev
package:markdown 7.3.1 already published at pub.dev
package:mime 2.1.0-wip WIP (no publish necessary)
package:oauth2 2.0.5 already published at pub.dev
package:package_config 2.3.0-wip WIP (no publish necessary)
package:pool 1.5.2 already published at pub.dev
package:process 5.0.5 (error) pubspec version (5.0.5) and changelog (5.0.6-wip) don't agree
package:pub_semver 2.2.0 already published at pub.dev
package:pubspec_parse 1.6.0-wip WIP (no publish necessary)
package:source_map_stack_trace 2.1.3-wip WIP (no publish necessary)
package:source_maps 0.10.14-wip WIP (no publish necessary)
package:source_span 1.10.2 already published at pub.dev
package:sse 4.2.0-wip (error) pubspec version (4.2.0-wip) and changelog (4.2.0) don't agree
package:stack_trace 1.12.2-wip (error) pubspec version (1.12.2-wip) and changelog (1.12.2-dev) don't agree
package:stream_channel 2.1.4 already published at pub.dev
package:stream_transform 2.1.2-wip WIP (no publish necessary)
package:string_scanner 1.4.2-wip WIP (no publish necessary)
package:term_glyph 1.2.3-wip WIP (no publish necessary)
package:test_reflective_loader 0.6.0 ready to publish test_reflective_loader-v0.6.0
package:timing 1.0.2 already published at pub.dev
package:unified_analytics 8.0.14 already published at pub.dev
package:watcher 1.2.2-wip WIP (no publish necessary)
package:yaml 3.1.4-wip WIP (no publish necessary)
package:yaml_edit 2.2.4 already published at pub.dev

Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 5, 2026

PR Health

Changelog Entry ✔️
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

This check can be disabled by tagging the PR with skip-changelog-check.

Coverage ⚠️
File Coverage
pkgs/yaml/benchmark/recursive_benchmark.dart 💔 Not covered
pkgs/yaml/lib/src/equality.dart 💔 Not covered

This check for test coverage is informational (issues shown here will not fail the PR).

This check can be disabled by tagging the PR with skip-coverage-check.

Unused Dependencies ✔️
Package Status
yaml ✔️ All dependencies utilized correctly.

For details on how to fix these, see dependency_validator.

This check can be disabled by tagging the PR with skip-unused-dependencies-check.

Breaking changes ✔️
Package Change Current Version New Version Needed Version Looking good?
yaml None 3.1.3 3.1.4-wip 3.1.4-wip ✔️

This check can be disabled by tagging the PR with skip-breaking-check.

API leaks ⚠️

The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.

Package Leaked API symbol Leaking sources
yaml ErrorListener yaml/yaml.dart::loadYaml::errorListener
yaml/yaml.dart::loadYamlNode::errorListener
yaml/yaml.dart::loadYamlDocument::errorListener
yaml ScalarEvent yaml_node.dart::YamlScalar::internal::scalar
yaml EventType event.dart::Event::type
event.dart::EventType::streamStart
event.dart::EventType::streamEnd
event.dart::EventType::documentStart
event.dart::EventType::documentEnd
event.dart::EventType::alias
event.dart::EventType::scalar
event.dart::EventType::sequenceStart
event.dart::EventType::sequenceEnd
event.dart::EventType::mappingStart
event.dart::EventType::mappingEnd
event.dart::EventType::values
event.dart::Event::new::type
event.dart::ScalarEvent::type

This check can be disabled by tagging the PR with skip-leaking-check.

License Headers ✔️
// Copyright (c) 2026, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

Files
no missing headers

All source files should start with a license header.

Unrelated files missing license headers
Files
pkgs/bazel_worker/benchmark/benchmark.dart
pkgs/coverage/lib/src/coverage_options.dart
pkgs/html/example/main.dart
pkgs/html/lib/dom.dart
pkgs/html/lib/dom_parsing.dart
pkgs/html/lib/html_escape.dart
pkgs/html/lib/parser.dart
pkgs/html/lib/src/constants.dart
pkgs/html/lib/src/encoding_parser.dart
pkgs/html/lib/src/html_input_stream.dart
pkgs/html/lib/src/list_proxy.dart
pkgs/html/lib/src/query_selector.dart
pkgs/html/lib/src/token.dart
pkgs/html/lib/src/tokenizer.dart
pkgs/html/lib/src/treebuilder.dart
pkgs/html/lib/src/utils.dart
pkgs/html/test/dom_test.dart
pkgs/html/test/parser_feature_test.dart
pkgs/html/test/parser_test.dart
pkgs/html/test/query_selector_test.dart
pkgs/html/test/selectors/level1_baseline_test.dart
pkgs/html/test/selectors/level1_lib.dart
pkgs/html/test/selectors/selectors.dart
pkgs/html/test/support.dart
pkgs/html/test/tokenizer_test.dart
pkgs/html/test/trie_test.dart
pkgs/html/tool/generate_trie.dart
pkgs/pubspec_parse/test/git_uri_test.dart
pkgs/watcher/test/custom_watcher_factory_test.dart

This check can be disabled by tagging the PR with skip-license-check.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a stack overflow in deepHashCode caused by infinite recursion on self-referential lists and optimizes cycle detection by replacing closure-based checks with a manual loop. It also adds a new benchmark and a regression test. The review feedback suggests using a more idiomatic for-in loop for readability and improving the benchmark's precision by using Stopwatch and moving object allocations outside the measurement loop.

Comment thread pkgs/yaml/lib/src/equality.dart Outdated
Comment thread pkgs/yaml/benchmark/recursive_benchmark.dart Outdated
@kevmoo
Copy link
Copy Markdown
Member Author

kevmoo commented Apr 5, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a stack overflow in deepHashCode by ensuring that recursive calls for Iterable types use the internal function that handles cycle detection. It also optimizes performance by replacing a closure-based check with a manual loop and adds a benchmark and regression tests for self-referential structures. A review comment suggests further optimizing performance by restricting cycle detection to collection types, avoiding unnecessary identity checks for scalar values.

Comment thread pkgs/yaml/lib/src/equality.dart Outdated
@kevmoo kevmoo changed the title Fix stack overflow in deepHashCode for self-referential lists fix(yaml): Fix stack overflow in deepHashCode for self-referential lists Apr 5, 2026
@kevmoo
Copy link
Copy Markdown
Member Author

kevmoo commented Apr 5, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a stack overflow in deepHashCode caused by self-referential lists and optimizes cycle detection by replacing closure-based checks with manual loops. It also introduces a new benchmark for recursive structures and improves test coverage for YAML directives. Feedback focuses on further reducing closure allocations in map hashing, ensuring the new list hashing algorithm is robust and consistent, and addressing a restrictive assertion that could cause crashes with non-list iterables.

Comment on lines 126 to 127
return equality.hash(value.keys.map(deepHashCodeInner)) ^
equality.hash(value.values.map(deepHashCodeInner));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The PR description mentions optimizing cycle detection by avoiding closure allocations. However, the Map implementation still uses .map(deepHashCodeInner), which creates two MappedIterable instances and two closures per map level. To fully achieve the optimization goal, consider using a manual loop for maps as well, although this may require a manual implementation of unordered hash combining.

Comment on lines +128 to +133
} else if (value is List) {
var hash = 0;
for (var i = 0; i < value.length; i++) {
hash = 31 * hash + deepHashCodeInner(value[i]);
}
return hash;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The hash algorithm for List has changed from the one used by IterableEquality (typically Jenkins) to a simple polynomial hash (31 * hash + next). While this is correct for structural equality, it is inconsistent with the Map implementation which still uses package:collection's hash logic. Additionally, for very large lists, the lack of masking (e.g., & 0x3FFFFFFF) might lead to performance degradation as Dart integers transition to arbitrary precision. Consider using a more robust combining function or masking the result to stay within SMI range.

Comment on lines +151 to +152
assert(
this is! Iterable || this is List, 'Iterables must be Lists in YAML');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This assert is quite restrictive for a utility that is used by deepEqualsMap, which is effectively public (via src/ usage in benchmarks and potentially other tools). If a user provides a Set or a custom Iterable to deepHashCode or deepEquals, it will crash in debug mode. While YAML only supports sequences (Lists), these equality utilities are often used more broadly. Consider relaxing this assert or handling other Iterable types gracefully by falling back to their default hashCode or recursing if appropriate.

Copy link
Copy Markdown
Member

@natebosch natebosch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this isCollection stuff accomplishing?

Comment on lines +117 to +119
for (var parent in parents) {
if (identical(parent, value)) return -1;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a new problem, but the new structure makes it more obvious and also makes it easy to fix. I'm pretty sure this will have hash collisions between collections which have recursive collections in identical positions, regardless of whether the specific recursive collections are the same ones.

I think we can reduce hash collisions if we use the index of the collection in parents.

Suggested change
for (var parent in parents) {
if (identical(parent, value)) return -1;
}
for (var i = 0; i < parents.length; i++) {
if (identical(parents[i], value)) return i;
}

natebosch added a commit that referenced this pull request Apr 7, 2026
Replaces #2362

Replace a nested call to `deepHashCode` with `deepHashCodeInner` to fix
a stack overflow with self referential iterables.
natebosch added a commit that referenced this pull request Apr 7, 2026
Pulled from #2362 and refactored to test for exactly 1 warning.

Add a test utility to check for warnings emitted while parsing and use
it in two test cases that emit warnings.
@natebosch
Copy link
Copy Markdown
Member

Opened #2365 with the important bit from this PR.

I don't know if the benchmark which doesn't use any benchmark framework is worth landing or not.

The isCollection stuff looks unnecessary to me.

@kevmoo
Copy link
Copy Markdown
Member Author

kevmoo commented Apr 7, 2026

Opened #2365 with the important bit from this PR.

I don't know if the benchmark which doesn't use any benchmark framework is worth landing or not.

The isCollection stuff looks unnecessary to me.

More of an optimization - I'll just close this out.

@kevmoo kevmoo closed this Apr 7, 2026
@kevmoo kevmoo deleted the yaml_cleanup branch April 7, 2026 19:40
natebosch added a commit that referenced this pull request Apr 8, 2026
Pulled from #2362 and refactored to test for exactly 1 warning.

Add a test utility to check for warnings emitted while parsing and use
it in two test cases that emit warnings.
natebosch added a commit that referenced this pull request Apr 9, 2026
…sts (#2365)

Replaces #2362

Replace a nested call to `deepHashCode` with `deepHashCodeInner` to fix
a stack overflow with self referential iterables.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants