6097: Migrate from default Python serialization to orjson #6115
navaro1 wants to merge 10 commits into quantumlib:main
Force-pushed from 5995ab9 to 36e9af7
Is the new output compatible with serialization produced by the builtin json? I.e., after this PR, will data serialized before and after the switch to orjson stay compatible?
From Cirq sync: Doug: We are adding a new dependency to Cirq. Is orjson well maintained, and does it have a good compliance / review process? Especially since we are talking about binary formats.
Based on the list of features and drawbacks in https://github.com/ijl/orjson#orjson, it is stricter than the stdlib.
I have seen that orjson has been used in Zulip for a while. The discussion comparing the alternatives can be found at zulip/zulip#6507.
Almost always yes.
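To make the "almost always" above a bit more concrete, here is a minimal sketch (not code from this PR) that compares the two libraries on a plain payload and on one case where orjson is stricter. The payload and the helper names `stdlib_roundtrip` and `orjson_roundtrip` are made up for illustration, and the orjson part only runs when the package is installed.

```python
import json

try:
    import orjson  # third-party; optional for this sketch
except ImportError:
    orjson = None


def stdlib_roundtrip(obj):
    """Serialize with the standard library and parse the result back."""
    return json.loads(json.dumps(obj))


def orjson_roundtrip(obj):
    """Serialize with orjson (which returns bytes) and parse with the stdlib."""
    return json.loads(orjson.dumps(obj))


# Hypothetical payloads, not real Cirq output.
plain = {"gate": "X", "qubits": [0, 1], "exponent": 0.5}
int_keys = {1: "a"}

stdlib_plain = stdlib_roundtrip(plain)
# The stdlib silently converts non-string keys to strings.
stdlib_int_keys = stdlib_roundtrip(int_keys)

orjson_rejects_int_keys = None
if orjson is not None:
    # For JSON-native values both libraries describe the same data.
    assert orjson_roundtrip(plain) == stdlib_plain
    # orjson refuses non-string keys unless OPT_NON_STR_KEYS is passed.
    try:
        orjson.dumps(int_keys)
        orjson_rejects_int_keys = False
    except TypeError:  # orjson.JSONEncodeError subclasses TypeError
        orjson_rejects_int_keys = True
```

Comparing parsed values rather than raw text sidesteps whitespace differences between the two encoders, which do not affect compatibility of the data itself.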
Force-pushed from 36e9af7 to 70a066f
This commit is part of an exploratory migration from the default Python serialization system to the 'orjson' library. Changes include the addition of the 'orjson' import in the json_serialization.py module and subsequent use of the 'orjson' functions. In addition, 'orjson' has been added to the cirq-core/requirements.txt file.
The commit introduces the following changes:

1. An optional parameter 'enable_contextual_serialization' is added to the 'assert_json_roundtrip_works' function in 'cirq/testing/json.py' and the 'to_json' function in 'cirq/protocols/json_serialization.py'. This flag enables concise serialization of objects using a context map to prevent re-serialization of identical objects.
2. The 'to_json' and 'to_json_bytes' functions in 'cirq/protocols/json_serialization.py' now have an 'indent' parameter. The indentation is set to 2 spaces when the parameter is not None, improving the readability of the output JSON.
3. Updated the function 'assert_json_roundtrip_works' in 'cirq/testing/json.py' to include the new 'enable_contextual_serialization' parameter and the 'indent' parameter in the 'cirq.protocols.to_json' call.
4. Refactored tests in 'cirq/protocols/json_serialization_test.py' to use the new 'enable_contextual_serialization' and 'indent' parameters.
5. Deprecated the 'separators' parameter in the 'to_json' function of 'cirq/protocols/json_serialization.py' as it has no effect.

This enhancement improves the performance of JSON serialization by avoiding redundant serialization and increases readability by allowing pretty printing.
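As a rough illustration of the context-map idea described in this commit message, the sketch below stores each distinct object once and replaces repeats with an index into that table. It is a simplified stand-in, not Cirq's implementation: the function names, the `"context"`/`"refs"` layout, and the identity-based lookup are all assumptions made for the example.

```python
import json


def to_json_with_context(items):
    """Serialize `items`, emitting each distinct object (by identity) once."""
    context = []      # each distinct object, stored a single time
    index_by_id = {}  # id(obj) -> position of obj in `context`
    refs = []         # one context index per entry of `items`
    for obj in items:
        key = index_by_id.get(id(obj))
        if key is None:
            key = len(context)
            index_by_id[id(obj)] = key
            context.append(obj)
        refs.append(key)
    return json.dumps({"context": context, "refs": refs})


def from_json_with_context(text):
    """Rebuild the original list by resolving every reference."""
    doc = json.loads(text)
    return [doc["context"][key] for key in doc["refs"]]


# A hypothetical "circuit": the same moment object repeated 100 times.
moment = {"ops": ["H(q0)", "CNOT(q0, q1)"]}
circuit = [moment] * 100

compact = to_json_with_context(circuit)
expanded = json.dumps(circuit)
restored = from_json_with_context(compact)
```

For inputs with many repeated objects the compact form is much shorter; for inputs with no repeats the extra bookkeeping is pure overhead, which is presumably why the behavior is controlled by a flag.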
- Modified the default values of `indent` and `enable_contextual_serialization` in the `to_json` method. By default, pretty-printing is now enabled and contextual serialization is now turned on. This will provide more consistency with the current implementation.
- Removed the `enable_contextual_serialization` parameter from the `assert_json_roundtrip_works` function, and updated all its occurrences. This refactoring simplifies the function signature and usage.
- Updated the `to_json` docstring to better explain the `indent` and `enable_contextual_serialization` parameters and their impact on performance. Also clarified the return type of the function.
- Updated the tests to reflect these changes. Specifically, removed the `enable_contextual_serialization` parameter in `assert_json_roundtrip_works` calls, and the indentation in `to_json` calls where the default value is now used.
1. Updated default values and added new parameters for the `to_json` and `to_json_gzip` methods.
2. Updated the serialization benchmark `SerializeLargeExpandedCircuits` to consider the impact of the `indent` and `enable_contextual_serialization` configurations.
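The `indent` behavior described in these commit messages (any non-None value means 2 spaces) could be wired up roughly as follows. This is a sketch under assumptions: `dumps_text` is a hypothetical helper, not the `to_json` from this PR, and the stdlib fallback exists only so the example runs without orjson installed.

```python
import json
from typing import Any, Optional

try:
    import orjson  # third-party; optional for this sketch
except ImportError:
    orjson = None


def dumps_text(obj: Any, indent: Optional[int] = 2) -> str:
    """Return JSON text; any non-None `indent` is treated as 2 spaces."""
    if orjson is None:
        # Fallback path: mirror the same "None or 2" rule with the stdlib.
        return json.dumps(obj, indent=None if indent is None else 2)
    # orjson offers a single pretty-printing flag, OPT_INDENT_2.
    option = None if indent is None else orjson.OPT_INDENT_2
    # orjson.dumps returns bytes, so decode to match a str-returning API.
    return orjson.dumps(obj, option=option).decode("utf-8")


sample = {"cirq_type": "Example", "values": [1, 2, 3]}  # made-up payload
pretty = dumps_text(sample)
compact = dumps_text(sample, indent=None)
```

Because orjson has no `separators` equivalent, a wrapper like this has nothing to forward that argument to, which is consistent with the parameter being deprecated above.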
Force-pushed from 70a066f to ed0d4d1
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

Coverage Diff:

| | main | #6115 | +/- |
|---|---|---|---|
| Coverage | 97.84% | 97.84% | |
| Files | 1110 | 1110 | |
| Lines | 96696 | 96709 | +13 |
| Hits | 94612 | 94625 | +13 |
| Misses | 2084 | 2084 | |
xref #6315. We now have a use case where serialization performance is becoming a bottleneck. @suyashdamle to check whether adding the additional dependency on orjson is compliant with Google policy.
qubits and 100 `num_moments` from 1.15±0.01s to 147±1ms). This puts us in the ballpark figure mentioned here: Add benchmarks for transformer primitives and json serialization #5957 (comment)

- `orjson` only supports no indentation or indentation of size 2 (source)
- `orjson` does not support handling `separators: Optional[Tuple[str, str]]` (source)

Performance differences (will be updated with subsequent changes)
Before change (run with `indent=None`, default `json` library)

After changes (`orjson` library)
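For readers who want to get a feel for this kind of before/after comparison locally, a minimal timing sketch is shown below. It is not the benchmark used for the numbers in this PR: the payload builder, its sizes, and the helper names are hypothetical, and the orjson timing is only collected when the package is installed.

```python
import json
import timeit

try:
    import orjson  # third-party; optional for this sketch
except ImportError:
    orjson = None


def make_payload(num_qubits, num_moments):
    """Build a nested list of small dicts loosely shaped like a circuit."""
    return [
        [{"gate": "X", "qubit": q, "moment": m} for q in range(num_qubits)]
        for m in range(num_moments)
    ]


def time_dumps(dumps, payload, repeat=3):
    """Return the best wall-clock time in seconds over `repeat` single runs."""
    return min(timeit.repeat(lambda: dumps(payload), number=1, repeat=repeat))


payload = make_payload(num_qubits=20, num_moments=20)
timings = {"json": time_dumps(json.dumps, payload)}
if orjson is not None:
    timings["orjson"] = time_dumps(orjson.dumps, payload)

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

Absolute numbers from a toy payload like this will not match the figures quoted above; the sketch is only meant to show the shape of the measurement.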