Design Principles
Note
This page lists the design principles partly summarized by AI and modified by @justinchuby.
Note
When designing and implementing features, the Flax philosophy is also a good resource and food for thought.
The system supports all valid ONNX models and even a subset of invalid models to enable loading and fixing them. This principle ensures comprehensive compatibility while providing tools for model repair.
The architecture prioritizes low memory footprint through memory-mapped external tensors and a unified interface for different tensor types. The system supports zero-copy operations without tensor size limitations, enabling efficient handling of large models.
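The memory-mapping idea can be illustrated with the standard library alone. This is a minimal sketch, not the project's implementation: the temporary file stands in for an external tensor data file, and all names here are illustrative assumptions.

```python
import mmap
import os
import struct
import tempfile

# Write four float32 values (native byte order) to a temporary file that
# stands in for an external tensor data file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(struct.pack("4f", 1.0, 2.0, 3.0, 4.0))
    path = f.name

with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
    raw = memoryview(mapped)   # view over the mapped pages, no copy
    view = raw.cast("f")       # reinterpret the same bytes as float32, still no copy
    first, last, count = view[0], view[-1], len(view)
    # Views must be released before the mapping can be closed.
    view.release()
    raw.release()

os.remove(path)
```

Only the pages that are actually read get loaded by the operating system, which is what keeps the footprint low for large files.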
The system employs protocol-based interfaces to define contracts between components while allowing flexible implementations. This approach enables type safety, extensibility, and performance optimization without requiring inheritance from specific base classes.
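As a rough illustration of a protocol-based contract, the sketch below uses `typing.Protocol`. The names `TensorLike`, `InMemoryTensor`, and `nbytes` are hypothetical and chosen for this example; they are not taken from the library.

```python
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class TensorLike(Protocol):
    """Hypothetical contract: anything with a shape that can produce its bytes."""

    @property
    def shape(self) -> Sequence[int]: ...

    def tobytes(self) -> bytes: ...


class InMemoryTensor:
    """Satisfies TensorLike structurally; it does not inherit from it."""

    def __init__(self, data: bytes, shape: Sequence[int]) -> None:
        self._data = data
        self._shape = tuple(shape)

    @property
    def shape(self) -> Sequence[int]:
        return self._shape

    def tobytes(self) -> bytes:
        return self._data


def nbytes(tensor: TensorLike) -> int:
    """Works with any object that matches the protocol."""
    return len(tensor.tobytes())


tensor = InMemoryTensor(b"\x00" * 8, shape=(2,))
size = nbytes(tensor)
```

A static type checker accepts `InMemoryTensor` wherever `TensorLike` is expected, so alternative implementations (for example, a lazily loaded tensor) can be added without touching a class hierarchy.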
The IR maintains strict separation between the in-memory representation and serialization formats. The core IR module is kept protobuf-free, decoupling the representation from the serialization format.
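The separation can be pictured with a toy example. This is a sketch under stated assumptions: JSON stands in for protobuf, and `SimpleNode`, `serialize_node`, and `deserialize_node` are hypothetical names invented for illustration.

```python
import json
from dataclasses import dataclass, field


# Core layer: plain Python objects that know nothing about the wire format.
@dataclass
class SimpleNode:
    op_type: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


# Serialization layer: the only code that knows how nodes are encoded.
def serialize_node(node: SimpleNode) -> str:
    return json.dumps(
        {"op_type": node.op_type, "input": node.inputs, "output": node.outputs}
    )


def deserialize_node(text: str) -> SimpleNode:
    raw = json.loads(text)
    return SimpleNode(raw["op_type"], list(raw["input"]), list(raw["output"]))


node = SimpleNode("Add", ["x", "y"], ["z"])
round_tripped = deserialize_node(serialize_node(node))
```

Because the core class never imports the serializer, the encoding could be swapped without changing code that builds or transforms nodes.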
The system enables safe concurrent iteration during graph mutations through specialized data structures like DoublyLinkedSet. This allows creating multiple iterators on graphs while performing modifications.
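One way such a structure can tolerate removal during iteration is to mark removed links as erased while leaving their forward pointers intact. The sketch below is a simplified, hypothetical stand-in written for illustration; it is not the library's DoublyLinkedSet, and it assumes hashable, unique elements.

```python
from typing import Generic, Iterator, Optional, TypeVar

T = TypeVar("T")


class _Link(Generic[T]):
    def __init__(self, value: Optional[T]) -> None:
        self.value = value
        self.prev: "_Link[T]" = self
        self.next: "_Link[T]" = self
        self.erased = False


class LinkedSequence(Generic[T]):
    """Hypothetical ordered set backed by a circular doubly linked list."""

    def __init__(self) -> None:
        self._root: _Link[T] = _Link(None)  # sentinel; never erased
        self._links: dict[T, _Link[T]] = {}

    def append(self, value: T) -> None:
        link = _Link(value)
        last = self._root.prev
        link.prev, link.next = last, self._root
        last.next = link
        self._root.prev = link
        self._links[value] = link

    def remove(self, value: T) -> None:
        link = self._links.pop(value)
        link.prev.next = link.next
        link.next.prev = link.prev
        # Keep link.next so an iterator standing on this link can move on.
        link.erased = True

    def __iter__(self) -> Iterator[T]:
        link = self._root.next
        while link is not self._root:
            if not link.erased:
                yield link.value
            link = link.next


nodes = LinkedSequence[str]()
for name in ("a", "b", "c", "d"):
    nodes.append(name)

visited = []
for name in nodes:
    visited.append(name)
    if name == "b":
        nodes.remove("b")  # the element currently being visited
        nodes.remove("c")  # an element that has not been reached yet
```

The loop finishes without raising and skips the element removed ahead of the iterator, whereas removing items from a built-in `list` inside a `for` loop over it would silently skip or repeat elements.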
The design carefully balances immutability for safety with efficient mutation capabilities. Core entities like Node have immutable structure but mutable membership, while Value objects are immutable with usage tracking.
The classes provide Pythonic APIs that still map intuitively to ONNX protobuf concepts. This principle ensures familiar usage patterns while maintaining conceptual alignment with the ONNX specification.
The system emphasizes performant graph manipulation and serialization/deserialization to protobuf, enabling efficient model transformations and analysis passes.
Design the representation without knowledge of a particular opset or operator (except for rare and selective cases, like Constant). Treat the default opset and custom opsets similarly.
The representation should also not assume any particular backend, including the reference runtime, and should be designed without any dependency on them.
The design principles are implemented through a layered architecture with clear separation between the public API layer, core IR layer, type system, infrastructure components, and serialization layer. The metadata system supports both serializable properties and transient analysis data, enabling flexible annotation of IR entities for various use cases.
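The distinction between serializable properties and transient analysis data might look like the following sketch. The class `AnnotatedValue`, its field names, and `to_serializable` are assumptions made for this example rather than the library's actual API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class AnnotatedValue:
    name: str
    # Persisted with the model: string keys and string values only.
    metadata_props: dict[str, str] = field(default_factory=dict)
    # Scratch space for analysis passes: any Python object, never written out.
    meta: dict[str, Any] = field(default_factory=dict)


def to_serializable(value: AnnotatedValue) -> dict[str, Any]:
    """Keep only what should survive a save/load round trip."""
    return {"name": value.name, "metadata_props": dict(value.metadata_props)}


value = AnnotatedValue("x")
value.metadata_props["source"] = "exported"
value.meta["use_count"] = 3  # transient result of a hypothetical analysis pass
saved = to_serializable(value)
```

Keeping the two stores separate lets passes attach arbitrary intermediate results without affecting the saved model.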