Modern invariant testing with halmos

karmacomaDaejun Park

The initial focus of halmos, our open source smart contract symbolic testing tool, has been to make SMT solving work for stateless property tests. These are a good starting point, and an easy transition from stateless fuzz tests, but they’re not the gold standard for smart contract testing. Indeed, the most requested feature by far was support for stateful invariant tests.

In this post, we go over the invariant testing emulation trick, how to improve it with new cheatcodes, and finally how to completely get rid of it in halmos v0.3.x.

Recap: emulating invariant testing in a stateless test

Back in 2024, Antonio wrote about emulating stateful invariant testing in halmos using some clever tricks. The core idea was that within a single stateless test, you could chain together multiple calls, retaining state between them. Let’s review Antonio’s technique on the MiniVat example:

import "forge-std/Test.sol";
import {SymTest} from "halmos-cheatcodes/SymTest.sol";

import {MiniVat} from "src/MiniVat.sol";

contract MiniVatTest is Test, SymTest {
    MiniVat public minivat;

    function setUp() public {
        minivat = new MiniVat();
    }

    function check_minivat_n_full_symbolic(bytes4[] memory selectors) public {
        for (uint256 i = 0; i < selectors.length; ++i) {
            assumeValidSelector(selectors[i]);
            assumeSuccessfulCall(address(minivat), calldataFor(selectors[i]));
        }

        assertEq(
            minivat.debt(),
            minivat.Art() * minivat.rate(),
            "The Fundamental Equation of DAI"
        );
    }
 
    function calldataFor(bytes4 selector) internal view returns (bytes memory) {
        if (selector == minivat.init.selector) {
            return abi.encodeWithSelector(selector);
        } else if (selector == minivat.move.selector) {
            return
                abi.encodeWithSelector(
                    selector,
                    svm.createAddress("dst"),
                    svm.createInt256("wad")
                );
         } else if (selector == minivat.frob.selector) {
            return abi.encodeWithSelector(selector, svm.createInt256("dart"));
        } else if (selector == minivat.fold.selector) {
            return abi.encodeWithSelector(selector, svm.createInt256("delta"));
        } else {
            revert();
        }
    }
}

Let’s look at what happens in detail:

  1. halmos sees the check_minivat_n_full_symbolic(bytes4[] memory selectors) function, which it identifies as a stateless test
  2. because there is a test to run in MiniVatTest, halmos deploys it and runs setUp(), retaining the resulting state as the initial test state
  3. then it’s time to run the check_minivat_n_full_symbolic. Because it has bytes4[] memory selectors as an argument, halmos will instantiate it as a concrete array of symbolic bytes4 values
  4. the test then loops over the selectors array to generate a call sequence
  5. the call to calldataFor is boilerplate to make sure we generate valid symbolic data for each call in the call sequence
  6. the call to assumeValidSelector is a way to express what target functions we’re interested in
  7. the test then makes the calls to address(minivat) (the target contract)
  8. assumeSuccessfulCall rejects calls that revert, which is a way to prune paths during state‑space exploration.
  9. finally we can assert the protocol invariant

This process also contains the essential ingredients of invariant testing:

  • an initial test state
  • target contracts
  • target functions
  • call sequence generation
  • state-space exploration
  • invariant assertions

Running this test yields a result in 2.8s (note that in the original post it used to take 12.6s):

halmos --function check_minivat_n_full_symbolic
 [⠒] Compiling...
No files changed, compilation skipped

Running 1 tests for test/MiniVat.t.sol:MiniVatTest 
Counterexample: 
    halmos_dart_int256_ce68fb9_01 = 0x400000000000000000000000000000000000000000
    p_selectors[0]_bytes4_1801362_00 = 0x380dc3ca00000000000000000000000000000000000000000000000000000000
    p_selectors[1]_bytes4_616eaf4_00 = 0xe1c7392a00000000000000000000000000000000000000000000000000000000
    p_selectors_length_baece6c_00 = 0x02

[FAIL] check_minivat_n_full_symbolic(bytes4[]) (paths: 70, time: 2.75s, bounds: [selectors=[0, 1, 2]])
Symbolic test result: 0 passed; 1 failed; time: 2.77s

This works fine on MiniVat, which is a simplified version of the full Vat contract with just 4 target functions. MiniVat is intentionally simple; the real challenge is breaking the invariant for the full Vat contract. The full Vat contract has 16 target functions, which means that if we want to support it we will have a lot of boilerplate to write, and the performance will take a big hit.

Below, we’ll show how to improve the hand-rolled version of the stateful testing framework, and finally how it can be simplified with the latest release of halmos.

💻 if you want to follow along, the full code for the examples is available: https://github.com/0xkarmacoma/halmos-invariants-sandbox

Supporting the full Vat contract, the hand-rolled approach

Using svm.createCalldata(address)

Halmos now supports some convenient cheatcodes (see a16z/halmos-cheatcodes):

// Create arbitrary symbolic calldata for the given contract address, name, or interface name.
// By default, view and pure functions are excluded. 
// An optional boolean flag can be set to include view and pure functions.

function createCalldata(
    address contractAddress
) external pure returns (bytes memory);

function createCalldata(
    string memory contractOrInterfaceName
) external pure returns (bytes memory);

function createCalldata(
    string memory filename, 
    string memory contractOrInterfaceName
) external pure returns (bytes memory);

This means we can completely get rid of the calldataFor and assumeValidSelector helpers!

The code for the full Vat test now looks like this:

import {Test, console} from "forge-std/Test.sol";
import {SymTest} from "halmos-cheatcodes/SymTest.sol";
import {Vat} from "src/Vat.sol";

contract VatTest is Test, SymTest {
    Vat public vat;
    bytes32 public ilk = "gems";

    function setUp() public {
        vat = new Vat();
        vat.init(ilk);
    }

    function check_vat_n_full_symbolic() public {
        uint256  depth = vm.envOr("INVARIANT_DEPTH", uint256(2));
        console.log("generating call sequences with depth", depth);

        for (uint256 i = 0; i < depth; ++i) {
            assumeSuccessfulCall(
                address(vat), 
                svm.createCalldata(address(vat)) // using the new cheatcode
            );
        }

        assertEq(
            vat.debt(),
            vat.vice() + vat.Art(ilk) * vat.rate(ilk),
            "The Fundamental Equation of DAI"
        );
    }

    function assumeSuccessfulCall(address target, bytes memory data) public {
        (bool success, ) = target.call(data);
        vm.assume(success);
    }
}

This finds a counterexample in 1-2 min:

INVARIANT_DEPTH=2 halmos --function check_vat_n_full_symbolic --early-exit
[⠒] Compiling...
No files changed, compilation skipped

Running 1 tests for test/Vat.t.sol:VatTest
[console.log] generating call sequences with depth 2
Counterexample: 
    p_dart_int256_45a168c_124 = 0x200000000000000000000000000000000000000000
    p_dink_int256_0cd2e65_123 = 0x00
    p_i_bytes32_ead0d7f_119 = 0x00
    p_ilk_bytes32_718b270_81 = 0x00
    p_rad_uint256_54a455e_46 = 0x8000000000000000000000000000000000000000000000000000000000000000
    p_selectors[0]_bytes4_1eee7ca_00 = 0x00
    p_selectors[1]_bytes4_018ac94_00 = 0x00
    p_selectors_length_14577c1_00 = 0x02
    p_u_address_b307c2b_44 = 0x00
    p_u_address_b5d362f_120 = 0x00
    p_v_address_2f08456_45 = 0x00
    p_v_address_3bc206a_121 = 0x00
    p_w_address_e6be6d6_122 = 0x00

[FAIL] check_vat_n_full_symbolic(bytes4[]) (paths:  455, time:  61.55s, bounds: [selectors=[0, 1, 2]])
Symbolic test result:  0 passed;  1 failed; time:  61.59s

Let’s take a moment to unpack what svm.createCalldata(address(vat)) is doing:

  • it maps the address back to a contract name based on its deploy code
  • it looks up the ABI for the contract
  • it generates calldata with appropriate symbolic arguments for all public state-mutating functions as well as empty bytes and bytes for the fallback function
💡add console.logBytes(data) in assumeSuccessfulCall to inspect the calldata bytes that are automatically generated.null

Using vm.snapshotState()

In AssumeSuccessfulCall, we do (bool success, ) = target.call(data); and then vm.assume(success), which lets us ignore calls that reverted.

Because a revert undoes all changes, rejecting reverted calls let us skip paths that we know don’t affect state. But you can have other calls that don’t revert, and also don’t change state! Ideally we would also want to ignore these. Since v0.2.3 we support snapshot cheatcodes, so we can do something like this:

mapping(uint256 => bool) public visited;

function check_vat_n_full_symbolic() public {
    uint256 depth = vm.envOr("INVARIANT_DEPTH", uint256(2));
    console.log("generating call sequences with depth", depth);

    // record the initial state snapshot
    visited[vm.snapshotState()] = true;

    for (uint256 i =  0; i < depth; ++i) {
        assumeSuccessfulCall(
            address(vat), 
            svm.createCalldata(address(vat))
        );

        uint256 snapshot = vm.snapshotState();

        // skip if the snapshot has already been visited
        vm.assume(!visited[snapshot]);
    
        // keep track of new snapshots
        visited[snapshot] = true;
    }

    assertEq(
        vat.debt(),
        vat.vice() + vat.Art(ilk) * vat.rate(ilk),
        "The Fundamental Equation of DAI"
    );
}

In halmos, snapshotState() returns a hash of the network state (storage, balance, account code) which means we can compare snapshots. If the network state has not changed at all, then the snapshot should also be unchanged.

Unfortunately this does not result in an actual speedup because of the overhead of touching storage in Solidity, but the idea itself is sound. We just need some help from halmos itself. The last section covers how halmos v0.3 helps.

Enabling arbitrary timestamp, value and sender

There are a few remaining problems with our current approach that limit its generality:

  • the sender is fixed, every call comes from the test address (address(this))
  • the block timestamp never changes
  • we never send any value, so payable functions in the target contract are not properly explored

We could address these issues by modifying assumeSuccessfulCall:

function assumeSuccessfulCall(address target, bytes memory data) public {
    // allow any sender and value for this call
    address sender = svm.createAddress("sender");
    uint256 value = svm.createUint256("value");
    vm.deal(sender, value);

    // also allow any block timestamp to simulate calls in different blocks
    uint256 timestamp = svm.createUint256("timestamp");
    vm.assume(timestamp >= block.timestamp);
    vm.warp(timestamp);

    vm.prank(sender);
    (bool success, ) = target.call{value: value}(data);
    vm.assume(success);
}

With these new modifications, our more general but still hand-rolled stateful testing “framework” can now break the DAI invariant in 3 minutes:

halmos --function check_vat_n_full_symbolic --early-exit
 [⠒] Compiling...
 [⠃] Compiling 1 files with Solc 0.8.28
 [⠊] Solc 0.8.28 finished in 776.31ms
Compiler run successful!

Running 1 tests for test/Vat.t.sol:VatTest
[console.log] generating call sequences with depth 2
Counterexample: 
    halmos_sender_address_80cde3f_47 = 0x7fa9385be102ac3eac297483dd6233d62b3e1496
    halmos_sender_address_97bc225_96 = 0x7fa9385be102ac3eac297483dd6233d62b3e1496
    halmos_timestamp_uint256_afd7084_49 = 0x8000000000000000000000000000000000000000000000000000000000000000
    halmos_timestamp_uint256_e79e87f_98 = 0x8000000000000000000000000000000000000000000000000000000000000000
    halmos_value_uint256_1dc9b21_48 = 0x00
    halmos_value_uint256_6e7e9ac_97 = 0x00
    p_dart_int256_749817d_32 = 0x7fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
    p_dink_int256_a5650cc_31 = 0x00
    p_i_bytes32_30b6237_27 = 0x00
    p_i_bytes32_8de9b03_62 = 0x00
    p_rate__int256_9900444_64 = 0x01
    p_u_address_482aaa5_28 = 0x00
    p_u_address_e8dc543_63 = 0x00
    p_v_address_74d52aa_29 = 0x00
    p_w_address_c739174_30 = 0x00

[FAIL] check_vat_n_full_symbolic() (paths: 1091, time: 179.83s, bounds: [])
Symbolic test result: 0 passed; 1 failed; time: 179.89s

Up until the release of halmos v0.3.0, this was likely the best way to emulate stateful invariant testing.

Putting it all together with modern halmos

Since halmos v0.3.0 supports invariant testing out of the box, we can rewrite the test as follows:

import "forge-std/Test.sol";

import {Vat} from "../src/Vat.sol";

/// @custom:halmos --early-exit --invariant-depth 2
contract VatTest is Test {
    Vat public vat;
    bytes32 ilk;

    function setUp() public {
        vat = new Vat();
        ilk = "gems";

        vat.init(ilk);
    }

    function invariant_dai() public view {
        assertEq(
            vat.debt(),
            vat.vice() + vat.Art(ilk) * vat.rate(ilk),
            "The Fundamental Equation of DAI"
        );
    }
}

All the helpers and the boilerplate are gone! Also note that there is no need for bounding or clamping. We just explore the state-space with completely unbounded values and let the symbolic engine discover constraints.

Under the hood, halmos implements all the features we previously had to roll out manually:

  • target contract selection (in this case just Vat, because it was deployed during setUp()
  • target function selection (all public functions of Vat)
  • symbolic calldata generation for each target function
  • visited state snapshots
  • arbitrary senders, value and block timestamps
  • command line option to set the max call sequence length (--invariant-depth)

To give it a try, upgrade to the latest release:

  1. install uv
  2. run uv tool install --python 3.13 halmos if you don’t already have halmos installed
  3. run uv tool upgrade halmos if you already have a uv install of halmos

Then check out the invariant testing examples. To run the final example in this post:

git clone git@github.com:a16z/halmos.git
halmos --root halmos/examples/invariants --function invariant_dai

***

Thanks to aviggiano and igorganich for providing feedback on early versions of this post.

***

The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the current or enduring accuracy of the information or its appropriateness for a given situation. In addition, this content may include third-party advertisements; a16z has not reviewed such advertisements and does not endorse any advertising content contained therein.

This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments for which the issuer has not provided permission for a16z to disclose publicly as well as unannounced investments in publicly traded digital assets) is available at https://a16z.com/investments/.

Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.