wip: Add OSS-Fuzz integration for ONNX #15382
Open
andife wants to merge 12 commits into
andife is integrating a new project:
- fuzz_shape_inference: deserialize bytes to ModelProto before calling infer_shapes(), which does not accept raw bytes (was a no-op before)
- fuzz_checker: remove redundant exception union (Exception subsumes all)
- fuzz_parser: merge duplicate except branches into one
- fuzz_model_loader: traverse graph nodes/inputs/outputs and run check_model() after load to exercise more code paths
- fuzz_version_converter: replace random target version with model-aware selection (tries version-1, version+1, and latest opset)
- build.sh: forward CFLAGS/CXXFLAGS and set -DONNX_USE_ASAN=ON via CMAKE_ARGS so C++ extensions are sanitizer-instrumented; fix copyright year to 2026; add make_seed_corpus.py invocation
- project.yaml: add undefined (UBSan) sanitizer
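The model-aware target selection described for fuzz_version_converter could look roughly like the sketch below. The helper name `candidate_target_versions`, the deduplication, and the bounds check are assumptions made for illustration; the commit message only states that version-1, version+1, and the latest opset are tried.

```python
def candidate_target_versions(current: int, latest: int) -> list[int]:
    """Return opset versions to try, in order: current-1, current+1, latest.

    Hypothetical helper: the name, the deduplication, and the bounds check
    are assumptions, not taken from fuzz_version_converter.
    """
    candidates: list[int] = []
    for version in (current - 1, current + 1, latest):
        in_range = 1 <= version <= latest
        if in_range and version != current and version not in candidates:
            candidates.append(version)
    return candidates


# In a real harness, `latest` would presumably come from the installed ONNX
# build rather than a constant; the value 21 here is only for illustration.
print(candidate_target_versions(13, 21))  # [12, 14, 21]
```

Compared with a random target, this keeps most conversions within one adapter step of the model's own opset, so fewer inputs are rejected before any conversion code runs.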
- Add FuzzedDataProvider-based structured model construction
- Generate subgraph-bearing ops (If/Loop/Scan) to exercise the recursive visitor
- Sample strict_mode and check_type for broader API coverage
- Add RecursionError guard to keep fuzzer running on known DoS
- Move toggles to trailing byte so raw-bytes path preserves protobuf header
- Update copyright year to 2026 per OSS-Fuzz linter
Signed-off-by: MuhammedHussein17 <muhammedbussnies@gmail.com>
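To illustrate why the toggles move to the trailing byte, here is a minimal sketch of splitting fuzzer input so the payload keeps its leading bytes intact. The function name `split_trailing_toggles` and the bit assignment are assumptions; the commit message does not say which bits map to strict_mode and check_type.

```python
def split_trailing_toggles(data: bytes) -> tuple[bytes, bool, bool]:
    """Split fuzzer input into (payload, strict_mode, check_type).

    The last byte carries the flags, so the payload keeps its original
    leading bytes. The bit assignment is an assumption made for this sketch.
    """
    if not data:
        return b"", False, False
    payload, toggles = data[:-1], data[-1]
    strict_mode = bool(toggles & 0x01)  # assumed: bit 0
    check_type = bool(toggles & 0x02)   # assumed: bit 1
    return payload, strict_mode, check_type


# 0x08 is the tag byte for field 1 (varint) of a protobuf message, which is
# where a serialized ModelProto with ir_version set typically begins.
payload, strict_mode, check_type = split_trailing_toggles(b"\x08\x07\x02")
print(payload, strict_mode, check_type)  # b'\x08\x07' False True
```

Had the flags been read from the front of the input, every seed file would lose its first byte to the toggles and the remainder would no longer start at a valid protobuf tag.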
…store instrument_all
Signed-off-by: MuhammedHussein17 <muhammedbussnies@gmail.com>
- Remove @atheris.instrument_func from TestOneInput: atheris.instrument_all() in main() already covers it; the decorator is redundant and inconsistent with every other ONNX fuzzer in this directory.
- Raise sys.setrecursionlimit from 500 to 1000 (Python default): 500 was too conservative and risked spurious RecursionError suppression inside third-party code (Atheris internals, ONNX nanobind wrapper, protobuf runtime) that has nothing to do with the known shape_inference DoS. 1000 still guards against the Python-level recursion path while eliminating false-positive drops.
- Update stale comment in _build_model that referenced the 500-frame limit.
…fuzzer
Improve fuzz_shape_inference.py with structured fuzzing
Summary
Adds Python atheris fuzz targets for the ONNX library, covering the main parsing, validation, and transformation surfaces of the public API.
Fuzz targets
- fuzz_checker: checker.check_model(..., full_check=True) ([…] bytes natively)
- fuzz_model_loader: load_model_from_string + graph traversal + check_model
- fuzz_parser: parser.parse_model
- fuzz_shape_inference: shape_inference.infer_shapes(..., check_type=True)
- fuzz_version_converter: version_converter.convert_version
Design notes
C++ extension instrumentation. ONNX's protobuf-based checker, shape inference engine, and version converter are implemented in C++. The build uses ONNX's own -DONNX_USE_ASAN=ON cmake option (introduced in the ONNX build system for exactly this purpose) together with the OSS-Fuzz $CFLAGS/$CXXFLAGS environment variables, so both the Python layer (via atheris) and the C++ extensions are instrumented under ASAN and UBSan.
Version converter seed corpus.
make_seed_corpus.py generates a small set of valid ONNX models with edge cases relevant to version conversion (missing inputs, mixed opset versions) so the fuzzer starts with structurally valid inputs rather than from an empty corpus.
API contract differences.
checker.check_model accepts Union[ModelProto, bytes, str, Path] and handles deserialization internally, so raw bytes are passed directly. shape_inference.infer_shapes accepts Union[ModelProto, str, Path] only, so the fuzzer deserializes first and passes the resulting ModelProto.
Testing
Built and ran locally with python infra/helper.py build_fuzzers onnx and python infra/helper.py run_fuzzer onnx <target>.