Table of contents
- Recap: emulating invariant testing in a stateless test
- Supporting the full Vat contract, the hand-rolled approach
- Putting it all together with modern halmos
The initial focus of halmos, our open source smart contract symbolic testing tool, has been to make SMT solving work for stateless property tests. These are a good starting point, and an easy transition from stateless fuzz tests, but they’re not the gold standard for smart contract testing. Indeed, the most requested feature by far was support for stateful invariant tests.
In this post, we go over the invariant testing emulation trick, how to improve it with new cheatcodes, and finally how to completely get rid of it in halmos v0.3.x.
Back in 2024, Antonio wrote about emulating stateful invariant testing in halmos using some clever tricks. The core idea was that within a single stateless test, you could chain together multiple calls, retaining state between them. Let’s review Antonio’s technique on the MiniVat example:
import "forge-std/Test.sol"; import {SymTest} from "halmos-cheatcodes/SymTest.sol"; import {MiniVat} from "src/MiniVat.sol"; contract MiniVatTest is Test, SymTest { MiniVat public minivat; function setUp() public { minivat = new MiniVat(); } function check_minivat_n_full_symbolic(bytes4[] memory selectors) public { for (uint256 i = 0; i < selectors.length; ++i) { assumeValidSelector(selectors[i]); assumeSuccessfulCall(address(minivat), calldataFor(selectors[i])); } assertEq( minivat.debt(), minivat.Art() * minivat.rate(), "The Fundamental Equation of DAI" ); } function calldataFor(bytes4 selector) internal view returns (bytes memory) { if (selector == minivat.init.selector) { return abi.encodeWithSelector(selector); } else if (selector == minivat.move.selector) { return abi.encodeWithSelector( selector, svm.createAddress("dst"), svm.createInt256("wad") ); } else if (selector == minivat.frob.selector) { return abi.encodeWithSelector(selector, svm.createInt256("dart")); } else if (selector == minivat.fold.selector) { return abi.encodeWithSelector(selector, svm.createInt256("delta")); } else { revert(); } } }
Let’s look at what happens in detail:
check_minivat_n_full_symbolic(bytes4[] memory selectors)
function, which it identifies as a stateless testMiniVatTest
, halmos deploys it and runs setUp(), retaining the resulting state as the initial test statecheck_minivat_n_full_symbolic
. Because it has bytes4[] memory selectors
as an argument, halmos will instantiate it as a concrete array of symbolic bytes4
valuescalldataFor
is boilerplate to make sure we generate valid symbolic data for each call in the call sequenceassumeValidSelector
is a way to express what target functions we’re interested inaddress(minivat)
(the target contract)assumeSuccessfulCall
rejects calls that revert, which is a way to prune paths during state‑space exploration.This process also contains the essential ingredients of invariant testing:
Running this test yields a result in 2.8s (note that in the original post it used to take 12.6s):
halmos --function check_minivat_n_full_symbolic [⠒] Compiling... No files changed, compilation skipped Running 1 tests for test/MiniVat.t.sol:MiniVatTest Counterexample: halmos_dart_int256_ce68fb9_01 = 0x400000000000000000000000000000000000000000 p_selectors[0]_bytes4_1801362_00 = 0x380dc3ca00000000000000000000000000000000000000000000000000000000 p_selectors[1]_bytes4_616eaf4_00 = 0xe1c7392a00000000000000000000000000000000000000000000000000000000 p_selectors_length_baece6c_00 = 0x02 [FAIL] check_minivat_n_full_symbolic(bytes4[]) (paths: 70, time: 2.75s, bounds: [selectors=[0, 1, 2]]) Symbolic test result: 0 passed; 1 failed; time: 2.77s
This works fine on MiniVat, which is a simplified version of the full Vat contract with just 4 target functions. MiniVat is intentionally simple; the real challenge is breaking the invariant for the full Vat
contract. The full Vat contract has 16 target functions, which means that if we want to support it we will have a lot of boilerplate to write, and the performance will take a big hit.
Below, we’ll show how to improve the hand-rolled version of the stateful testing framework, and finally how it can be simplified with the latest release of halmos.
Halmos now supports some convenient cheatcodes (see a16z/halmos-cheatcodes):
// Create arbitrary symbolic calldata for the given contract address, name, or interface name. // By default, view and pure functions are excluded. // An optional boolean flag can be set to include view and pure functions. function createCalldata( address contractAddress ) external pure returns (bytes memory); function createCalldata( string memory contractOrInterfaceName ) external pure returns (bytes memory); function createCalldata( string memory filename, string memory contractOrInterfaceName ) external pure returns (bytes memory);
This means we can completely get rid of the calldataFor
and assumeValidSelector
helpers!
The code for the full Vat test now looks like this:
import {Test, console} from "forge-std/Test.sol"; import {SymTest} from "halmos-cheatcodes/SymTest.sol"; import {Vat} from "src/Vat.sol"; contract VatTest is Test, SymTest { Vat public vat; bytes32 public ilk = "gems"; function setUp() public { vat = new Vat(); vat.init(ilk); } function check_vat_n_full_symbolic() public { uint256 depth = vm.envOr("INVARIANT_DEPTH", uint256(2)); console.log("generating call sequences with depth", depth); for (uint256 i = 0; i < depth; ++i) { assumeSuccessfulCall( address(vat), svm.createCalldata(address(vat)) // using the new cheatcode ); } assertEq( vat.debt(), vat.vice() + vat.Art(ilk) * vat.rate(ilk), "The Fundamental Equation of DAI" ); } function assumeSuccessfulCall(address target, bytes memory data) public { (bool success, ) = target.call(data); vm.assume(success); } }
This finds a counterexample in 1-2 min:
INVARIANT_DEPTH=2 halmos --function check_vat_n_full_symbolic --early-exit [⠒] Compiling... No files changed, compilation skipped Running 1 tests for test/Vat.t.sol:VatTest [console.log] generating call sequences with depth 2 Counterexample: p_dart_int256_45a168c_124 = 0x200000000000000000000000000000000000000000 p_dink_int256_0cd2e65_123 = 0x00 p_i_bytes32_ead0d7f_119 = 0x00 p_ilk_bytes32_718b270_81 = 0x00 p_rad_uint256_54a455e_46 = 0x8000000000000000000000000000000000000000000000000000000000000000 p_selectors[0]_bytes4_1eee7ca_00 = 0x00 p_selectors[1]_bytes4_018ac94_00 = 0x00 p_selectors_length_14577c1_00 = 0x02 p_u_address_b307c2b_44 = 0x00 p_u_address_b5d362f_120 = 0x00 p_v_address_2f08456_45 = 0x00 p_v_address_3bc206a_121 = 0x00 p_w_address_e6be6d6_122 = 0x00 [FAIL] check_vat_n_full_symbolic(bytes4[]) (paths: 455, time: 61.55s, bounds: [selectors=[0, 1, 2]]) Symbolic test result: 0 passed; 1 failed; time: 61.59s
Let’s take a moment to unpack what svm.createCalldata(address(vat))
is doing:
console.logBytes(data)
in assumeSuccessfulCall
to inspect the calldata bytes that are automatically generated.In AssumeSuccessfulCall, we do (bool success, ) = target.call(data);
and then vm.assume(success)
, which lets us ignore calls that reverted.
Because a revert undoes all changes, rejecting reverted calls let us skip paths that we know don’t affect state. But you can have other calls that don’t revert, and also don’t change state! Ideally we would also want to ignore these. Since v0.2.3 we support snapshot cheatcodes, so we can do something like this:
mapping(uint256 => bool) public visited; function check_vat_n_full_symbolic() public { uint256 depth = vm.envOr("INVARIANT_DEPTH", uint256(2)); console.log("generating call sequences with depth", depth); // record the initial state snapshot visited[vm.snapshotState()] = true; for (uint256 i = 0; i < depth; ++i) { assumeSuccessfulCall( address(vat), svm.createCalldata(address(vat)) ); uint256 snapshot = vm.snapshotState(); // skip if the snapshot has already been visited vm.assume(!visited[snapshot]); // keep track of new snapshots visited[snapshot] = true; } assertEq( vat.debt(), vat.vice() + vat.Art(ilk) * vat.rate(ilk), "The Fundamental Equation of DAI" ); }
In halmos, snapshotState()
returns a hash of the network state (storage, balance, account code) which means we can compare snapshots. If the network state has not changed at all, then the snapshot should also be unchanged.
Unfortunately this does not result in an actual speedup because of the overhead of touching storage in Solidity, but the idea itself is sound. We just need some help from halmos itself. The last section covers how halmos v0.3 helps.
There are a few remaining problems with our current approach that limit its generality:
(address(this)
)We could address these issues by modifying assumeSuccessfulCall
:
function assumeSuccessfulCall(address target, bytes memory data) public { // allow any sender and value for this call address sender = svm.createAddress("sender"); uint256 value = svm.createUint256("value"); vm.deal(sender, value); // also allow any block timestamp to simulate calls in different blocks uint256 timestamp = svm.createUint256("timestamp"); vm.assume(timestamp >= block.timestamp); vm.warp(timestamp); vm.prank(sender); (bool success, ) = target.call{value: value}(data); vm.assume(success); }
With these new modifications, our more general but still hand-rolled stateful testing “framework” can now break the DAI invariant in 3 minutes:
halmos --function check_vat_n_full_symbolic --early-exit [⠒] Compiling... [⠃] Compiling 1 files with Solc 0.8.28 [⠊] Solc 0.8.28 finished in 776.31ms Compiler run successful! Running 1 tests for test/Vat.t.sol:VatTest [console.log] generating call sequences with depth 2 Counterexample: halmos_sender_address_80cde3f_47 = 0x7fa9385be102ac3eac297483dd6233d62b3e1496 halmos_sender_address_97bc225_96 = 0x7fa9385be102ac3eac297483dd6233d62b3e1496 halmos_timestamp_uint256_afd7084_49 = 0x8000000000000000000000000000000000000000000000000000000000000000 halmos_timestamp_uint256_e79e87f_98 = 0x8000000000000000000000000000000000000000000000000000000000000000 halmos_value_uint256_1dc9b21_48 = 0x00 halmos_value_uint256_6e7e9ac_97 = 0x00 p_dart_int256_749817d_32 = 0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff p_dink_int256_a5650cc_31 = 0x00 p_i_bytes32_30b6237_27 = 0x00 p_i_bytes32_8de9b03_62 = 0x00 p_rate__int256_9900444_64 = 0x01 p_u_address_482aaa5_28 = 0x00 p_u_address_e8dc543_63 = 0x00 p_v_address_74d52aa_29 = 0x00 p_w_address_c739174_30 = 0x00 [FAIL] check_vat_n_full_symbolic() (paths: 1091, time: 179.83s, bounds: []) Symbolic test result: 0 passed; 1 failed; time: 179.89s
Up until the release of halmos v0.3.0, this was likely the best way to emulate stateful invariant testing.
Since halmos v0.3.0 supports invariant testing out of the box, we can rewrite the test as follows:
import "forge-std/Test.sol"; import {Vat} from "../src/Vat.sol"; /// @custom:halmos --early-exit --invariant-depth 2 contract VatTest is Test { Vat public vat; bytes32 ilk; function setUp() public { vat = new Vat(); ilk = "gems"; vat.init(ilk); } function invariant_dai() public view { assertEq( vat.debt(), vat.vice() + vat.Art(ilk) * vat.rate(ilk), "The Fundamental Equation of DAI" ); } }
All the helpers and the boilerplate are gone! Also note that there is no need for bounding or clamping. We just explore the state-space with completely unbounded values and let the symbolic engine discover constraints.
Under the hood, halmos implements all the features we previously had to roll out manually:
Vat
, because it was deployed during setUp()
Vat
)--invariant-depth
)To give it a try, upgrade to the latest release:
uv tool install --python 3.13 halmos
if you don’t already have halmos installeduv tool upgrade halmos
if you already have a uv install of halmosThen check out the invariant testing examples. To run the final example in this post:
git clone git@github.com:a16z/halmos.git halmos --root halmos/examples/invariants --function invariant_dai
***
Thanks to aviggiano and igorganich for providing feedback on early versions of this post.
***
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the current or enduring accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein.
This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investments/.
Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.