test: Huge portion of unit circuit tests, fuzz and prover-verifier tests for V2 Airbender#302
test: Huge portion of unit circuit tests, fuzz and prover-verifier tests for V2 Airbender#302Fitznik wants to merge 8 commits into
Conversation
## What ❔ <!-- What are the changes this PR brings about? --> <!-- Example: This PR adds a PR template to the repo. --> <!-- (For bigger PRs adding more context is appreciated) --> ## Why ❔ <!-- Why are these changes done? What goal do they contribute to? What are the principles behind them? --> <!-- The `Why` has to be clear to non-Matter Labs entities running their own ZK Chain --> <!-- Example: PR templates ensure PR reviewers, observers, and future iterators are in context about the evolution of repos. --> ## Is this a breaking change? - [ ] Yes - [ ] No ## Checklist <!-- Check your PR fulfills the following items. --> <!-- For draft PRs check the boxes as you complete them. --> - [ ] PR title corresponds to the body of PR (we generate changelog entries from PRs). - [ ] Tests for the changes have been added / updated. - [ ] Documentation comments have been added / updated. - [ ] Code has been formatted.
popzxc
left a comment
There was a problem hiding this comment.
Also CI if failing still -- please format the code
| #[test] | ||
| #[ignore = "heavy prove+verify; run with `cargo test -- --ignored`"] | ||
| #[serial_test::serial] | ||
| fn test_prover_pipeline_add() { |
There was a problem hiding this comment.
Nice. Since these tests mostly check chunks of instructions, maybe we can create test program compiled to RISC-V (with asm blocks)?
We already have a smoke e2e flow for proving and verification, moreover for both 80/100 bit security and cpu/gpu.
So in theory we can replace whatever we run there (dynamic fibonacci AFAIR) with this compiled set of tests, and it will get the best of two worlds: full test matrix & really good coverage.
WDYT?
There was a problem hiding this comment.
To make sure I understand: you mean take small program chunks already in this file (e2e_prove_verify.rs) and stitch them into a single RISC-V binary, then run that through the smoke matrix (80/100 × CPU/GPU) in place of fibonacci — right? Or you want to work around the compliance vector binary, what is in zksync-airbender/prover/src/tests/circuit/compliance_vectors.rs?
There was a problem hiding this comment.
you mean take small program chunks already in this file (e2e_prove_verify.rs) and stitch them into a single RISC-V binary, then run that through the smoke matrix (80/100 × CPU/GPU) in place of fibonacci
This is what I meant, yeah -- I think it will give us better coverage on multiple fronts, while reusing existing infra.
| /// Positive fuzz for ADD/SUB/ADDMOD/SUBMOD/MULMOD (Family 1 R-type). | ||
| #[test] | ||
| #[ignore = "unbounded fuzz loop; run explicitly with --ignored"] | ||
| fn fuzz_add_sub() { |
There was a problem hiding this comment.
Usually fuzzing is not shaped as a test, it's shaped as a dedicated runnable binary.
Right now we have a bunch of heavy tests that are marked ignored -- if someone will try to run cargo test --workspace --ignored -- --test-threads 1, fuzz tests will hang the process.
So I think a dedicated binary will be better here
There was a problem hiding this comment.
Also I'm not sure, but maybe something like AFL will be more efficient, since it's typically structural and kind find regressions better
| let outcome = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| { | ||
| with_csrrw_cs( | ||
| rd_reg, | ||
| rs1_reg, | ||
| UNSUPPORTED_CSR, | ||
| 0, | ||
| // not a non-determinism cycle, so the oracle returns 0 and the rd | ||
| // field is irrelevant - what we want to trip is the CSR table | ||
| // lookup, not value flow. | ||
| 0, | ||
| |mut cs| { | ||
| assert!( | ||
| !cs.is_satisfied(), | ||
| "CSRRW on an unsupported CSR must violate the SpecialCSRProperties \ | ||
| lookup constraint" | ||
| ); | ||
| }, | ||
| ) | ||
| })); | ||
|
|
||
| // The add-time debug constraint checker may reject this intentionally | ||
| // invalid case via a panic before is_satisfied() runs. | ||
| let _ = outcome; |
There was a problem hiding this comment.
So basically we are catching any panic and then dropping the result, meaning that we have no assertion in this test, no? Not sure if I understand the idea behind this test.
What ❔
Adds a test suite for the unrolled-circuits path (per-circuit, fuzzer, and end-to-end) and fixes the debug witness evaluator so is_satisfied doesn't panic on transient unresolved variables.
Debug witness graph fix ( for making test purposes)
In cs/src/cs/witness_placer/cs_debug_evaluator.rs and cs/src/cs/cs_reference.rs:1042:
Per-circuit tests (no prover, no verifier)
New prover/src/tests/circuit/ module. Each family's circuit goes through BasicAssembly::is_satisfied directly with fixed inputs. Shared helpers in prover/src/tests/circuit/mod.rs.
Fuzzer
prover/src/tests/circuit/fuzz.rs. Seeded StdRng, random operands fed through the same circuit harness. Tests: fuzz_add_sub, fuzz_slt, fuzz_shift_binop, fuzz_mul_div, fuzz_i_type, fuzz_u_type, fuzz_load_store_word, fuzz_load_store_subword, fuzz_csrrw, fuzz_jal, fuzz_jalr, fuzz_branches, and fuzz_negative_all_r_type for malformed encodings. Just random inputs to catch what the fixed vectors miss.
End-to-end prover + verifier
execution_utils/tests/e2e_prove_verify.rs — 47 tests, one per opcode. Each one assembles a tiny RV32IM program with lib_rv32_asm, transpiles + runs it, generates a proof, verifies it, and checks x10..x25. Covers: ADD/SUB/ADDI/LUI/AUIPC, SLT/SLTU/SLTI/SLTIU, all six branches, JAL/JALR, the shift+binop set, the M-extension set, all loads/stores, and CSRRW with the non-determinism CSR.