AiGentsy Stack

Provable. Payable. Institutional. The family of institutional graphs for autonomous agent work. Compute savings, including ProofPack Reuse, are byproducts of getting settlement right.

AiGentsy turns agent work into portable proof that can be verified, accepted, and acted on outside the app.

What shipped today: the full public release of the AiGentsy Stack protocol layer. Twelve institutional primitives, four SDKs, fourteen framework integrations, portable governance evidence for governed-compute proof paths, a signed outcome receipt for recorded commercial outcomes, and CUDA benchmark results from controlled-workload validation of the compute primitives. Every number is reproducible. Every null finding is published alongside the wins.

The Thesis

AiGentsy Stack is the settlement protocol for agent work. Compute savings are a byproduct of getting settlement right, not the business.

That is not a slogan. It is what the architecture forces. If you build the primitives required to settle agent work correctly (mandate graph authorization, cryptographic proof of work, attested governance, and policy-driven compute), you end up with a stack that creates the conditions for compute reuse where real overlap exists. The compute wins are not the business. They are evidence the settlement primitives work.

The trust spine

Governed compute upstream. Governed consequence downstream.

HoverStack governs the compute. AiGentsy governs the commercial record that follows. Upstream, governed compute decisions are recorded as structured evidence, and for governed-compute paths, signed governance evidence can travel into the ProofPack. Downstream, lifecycle, acceptance, settlement, outcome receipts, and counterparty reliability all resolve against the same proof spine. One protocol, from compute decision to commercial consequence.

mandate → proof → verification → acceptance → lifecycle → settlement → reliability

Upstream, HoverStack emits governance evidence; downstream, the Acceptance Runtime gates whether agent and LLM outputs become consequence. The runtime is live at /acceptance-runtime/evaluate — it can accept, reject, retry, escalate, allow, block, or hold the downstream action, and exports an offline-verifiable evidence bundle.
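A minimal sketch of how a caller might act on the runtime's decision. The seven decision names come from the paragraph above; the `Decision` enum, the `gate` helper, and the grouping of decisions into proceed / stop / pending are illustrative assumptions, not the runtime's actual response schema.

```python
from enum import Enum


class Decision(Enum):
    """The seven decisions named in the text above."""
    ACCEPT = "accept"
    REJECT = "reject"
    RETRY = "retry"
    ESCALATE = "escalate"
    ALLOW = "allow"
    BLOCK = "block"
    HOLD = "hold"


# Assumed grouping (hypothetical): which decisions release the downstream
# action, which stop it, and which leave it pending.
PROCEED = {Decision.ACCEPT, Decision.ALLOW}
STOP = {Decision.REJECT, Decision.BLOCK}


def gate(decision_value: str) -> str:
    """Map a decision string to what the caller does next."""
    decision = Decision(decision_value)  # raises ValueError on unknown values
    if decision in PROCEED:
        return "proceed"
    if decision in STOP:
        return "stop"
    return "pending"  # retry, escalate, hold: nothing moves downstream yet
```

Failing closed on unknown values is a deliberate choice in this sketch: an unrecognized decision raises instead of silently letting the action through.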

AiGentsy Consequence Layer

The buyer-facing expression of this stack in motion. Five public functions, each anchored to a technical primitive:

  • Recall — reuse prior attested work, evidence shapes, policy paths, and decision templates where safe.
  • Accept — decide whether agent, LLM, or workflow output can become consequence.
  • Prove — package evidence into ProofPacks / signed records.
  • Verify — independently check the record through browser/CLI verification.
  • Settle — allow, block, hold, hand off, or record consequence.

Recall is the buyer-facing expression of HoverStack’s prior-attested-work reuse capability. HoverStack remains broader than Recall alone: compute governance, Decision Envelopes, negative compute, workflow execution, benchmark validation, and attestation paths.

Savings Trace is the presentation surface that makes each function visible: what the gate prevented, reused, shortened, escalated, or verified before consequence moved.

Consequence Memory is the recorded trail of Recall, Accept, Prove, Verify, and Settle — what was attempted, what was allowed or held, what proof was produced, and what can be reused later. HoverStack already learns which prior paths, proof shapes, refusal patterns, evidence gaps, and decision envelopes are reusable; Consequence Memory makes that learning visible and connected to outcomes, rather than introducing a new learning layer.

Measured Compute Reuse

Benchmark-backed reduction from reusable proof paths and negative compute.

AiGentsy separates measured benchmark results from fixture-estimated demo savings. Recall surfaces prior proof paths, evidence shapes, and decision envelopes so repeated work can be referenced instead of recomputed where the benchmark class supports reuse.

  • A100 CUDA exact_reuse — wall-clock reductions of 58.3%–60.6% across 5 tested cohorts (10 to 250 agents), with mean 59.4% and harness-projected ceiling 59.6%.
  • Quality preservation: 1.0 on all cohorts.
  • Validation: MECHANISM_PASS on all 5 cohorts.
  • Material errors: 0 on all cohorts.
  • Evidence basis: aigentsy-ame-runtime/benchmark_results/negative_compute_cuda_a100_swarm_*/exact_reuse/.
  • GH200 v1.7 reference constant: 78% — code-backed in hoverstack/benchmark_data_readout.py:20; original run artifact not present in this repo.

Benchmark artifacts, not production customer savings. AiGentsy records and reuses verified paths; it does not improve model intelligence.

Labels: measured benchmark · reference constant · fixture-estimated · verifier-backed · potential exposure · gated

The trust spine now starts at construction. Agents can be created with accountability built in: proof at handoff, acceptance before consequence, settlement when value moves.

Native AiGentsy agents are born into the stack: they can Recall, Accept, Prove, Verify, and Settle from the first scaffold — consequence-aware by default, settlement-capable by design.

And the spine now carries per-actor evidence end-to-end. Disputes, acceptances, and recorded outcomes can carry independent per-actor Ed25519 signatures — the acting actor signs their own event with their own non-custodial key. The bundle’s key_directory snapshots the public keys backing each signed event for offline verification. Per-actor signing is opt-in: present where actors enrolled keys, platform-attested where the platform recorded an action on behalf of a labelled actor, attribution-only otherwise. The offline verifier’s actor_signatures step validates each signature against its declared key with the actor/key binding enforced — each actor’s own signature, not a single platform signature standing in for everyone.
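A minimal sketch of the actor/key binding check described above, under stated assumptions. The field names (`id`, `actor`, `key_id`, `payload`, `signature`), the shape of `key_directory`, and the result labels are hypothetical. The signature check is passed in as a function so the sketch stays dependency-free; in practice it would wrap an Ed25519 verify call from a library such as PyNaCl or `cryptography`. Platform-attested events are not modeled here.

```python
from typing import Callable, Dict, List

# (public_key, message, signature) -> True if the signature is valid.
VerifyFn = Callable[[bytes, bytes, bytes], bool]


def check_actor_signatures(
    events: List[dict],
    key_directory: Dict[str, dict],
    verify: VerifyFn,
) -> Dict[str, str]:
    """Classify each event by how its actor evidence checks out."""
    results = {}
    for event in events:
        signature = event.get("signature")
        if signature is None:
            # No enrolled key was used: the event only names its actor.
            results[event["id"]] = "attribution-only"
            continue
        entry = key_directory.get(event["key_id"])
        # Binding: the declared key must exist AND belong to the declared actor.
        if entry is None or entry["actor"] != event["actor"]:
            results[event["id"]] = "binding-failed"
            continue
        ok = verify(entry["public_key"], event["payload"], signature)
        results[event["id"]] = "valid" if ok else "signature-invalid"
    return results


def fake_verify(public_key: bytes, message: bytes, signature: bytes) -> bool:
    """Stand-in for demonstration only. This is NOT Ed25519."""
    return signature == b"signed:" + public_key + message
```

The binding check runs before the signature check on purpose: a valid signature from a key that belongs to a different actor must not pass.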

Now public in aigentsy v1.14.0:

pip install aigentsy==1.14.0
aigentsy create-agent my-agent --template settlement-native-mcp

This is the first reference adapter for the Settlement-Native Agent Core. It scaffolds an MCP-ready agent with the canonical prompt, lifecycle checkpoints, local demo, and tests — so proof, verification, acceptance, settlement, and export are present from line one. Local-only by default; no credentials needed to run the demo.

AiGentsy isn’t the agent factory; it’s the accountability standard agent factories can build in.

What This Repo Contains

Over the last 48 hours we validated the full stack on CUDA, made the code public, and published every number with reproducible artifacts. Here is what exists today.

Twelve institutional primitives

01  HoverStack                  Compute governance
02  Mandate Graph               Who was allowed to do what
03  ProofPack / GEP             Proof of work, governance evidence, and acceptance gating
04  Coordination Graph          Multi-agent dependency
05  Value Flow Graph            Allocation and release conditions
06  Trust Profile               Accumulated reliability
07  Lineage Graph               Descent and recursive accountability
08  Offer/Intent Graph          Transaction intent
09  Consequence Graph           Authorized downstream state changes
10  Capability/Resource Graph   What agents can actually do
11  Agreement Graph             Explicit accepted commitments
12  Organization Graph          Durable multi-agent organizations

Each primitive is a signed portable artifact. Offline verifiable. No blockchain required, no gas fees, and no token dependency.

Verification vs Acceptance

Verification proves the artifact held — cryptographic chain, event integrity, signatures, and references are valid. This is computational and binary: the signed bundle either holds or it does not.

Acceptance decides whether the work met the mandate: a human-readable judgment, policy gate, or principal authorization that says yes or no to the verified artifact. A bundle can verify perfectly and still be rejected because the work failed to meet what was authorized.
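The distinction can be sketched as two independent checks. This is an illustration under stated assumptions, not the protocol's bundle format: the `bundle_hash` and `events` fields, the `deliverable` key, and the mandate shape are hypothetical, and real verification also covers the event chain, signatures, and references, which are omitted here.

```python
import hashlib
import json


def compute_bundle_hash(events: list) -> str:
    """Hash a canonical JSON encoding of the events (assumed encoding)."""
    canonical = json.dumps(events, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(bundle: dict) -> bool:
    """Verification: binary and computational. Does the artifact hold?"""
    return compute_bundle_hash(bundle["events"]) == bundle["bundle_hash"]


def accept(bundle: dict, mandate: dict) -> bool:
    """Acceptance: did the work deliver what the mandate authorized?"""
    delivered = {e["deliverable"] for e in bundle["events"] if "deliverable" in e}
    return set(mandate["required_deliverables"]) <= delivered


events = [{"type": "WORK_DONE", "deliverable": "summary"}]
bundle = {"events": events, "bundle_hash": compute_bundle_hash(events)}
mandate = {"required_deliverables": ["summary", "citations"]}

verified = verify(bundle)           # the artifact is intact
accepted = accept(bundle, mandate)  # but "citations" was never delivered
```

Here the bundle verifies and is still rejected, which is the case the paragraph above describes.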

About HoverStack

HoverStack is the compute governance layer above your inference runtime. It sits between agent decisions and model calls, recording governed compute decisions and the evidence behind them. For governed-compute paths, HoverStack can emit a signed Governance Evidence Package, or GEP, that is embedded in the ProofPack as portable governance evidence.

The higher-level decision that led to that compute path is recorded through the Decision Envelope — the governed-compute decision record. Newly assembled governed-compute ProofPacks can surface a compact pointer to that upstream decision record, making the compute path legible downstream without duplicating the full ledger artifact.

ProofPack Reuse is one HoverStack mechanism and is benchmark-validated. EconomicGate, NegativeComputePolicy, WorkflowExecutor, and Shape Memory Decay are additional mechanisms that are implemented and unit-tested but still await workload-specific validation. HoverStack runs alongside your inference stack. It does not replace vLLM, SGLang, or your model serving layer. It governs the decisions about what those layers do.

The rest of what ships

Four SDKs. Python client, JavaScript client, standalone offline verifier on PyPI as aigentsy-verify, and a LangGraph native package.

Fourteen framework integrations. LangChain, LangGraph, LlamaIndex, AutoGen, CrewAI, OpenAI Agents, Vercel AI, MCP server, JavaScript SDK, standalone verifier, LangSmith, Langfuse, n8n/Zapier/Make, and a marketplace adapter. Pick your framework, integrate in minutes.

Conformance surface. Portable test vectors and a settlement conformance suite that external implementers can run against their own implementations.

One-command demo. python examples/hello_e2e.py runs a full settlement against the production runtime. Zero manual setup. Zero API keys.

Fourteen Domain Templates

Beyond the universal protocol, AiGentsy ships kit demos for specific verticals.

State-change kits demonstrate settlement that triggers downstream action: aerospace mission-critical authorization, robotics assembly verification, build/deploy gating, video generation pipelines, and agent-to-agent handoffs.

Payment kits demonstrate settlement that triggers a single authorized payment: research summaries, code generation, document extraction, financial analysis, medical admin, support resolution, procurement, compliance certificates, and data pipeline enrichment.

Each kit demonstrates the full lifecycle — proof, verification, acceptance, settlement — with evidence specific to that domain and the correct consequence type for that work. Same protocol, different consequences.

See them all: aigentsy.com/builders

The Compute Savings

On realistic agent workloads, CUDA validated on Qwen2.5-7B:

Layer                                         Result
v1.3 paraphrased recall and governance        Published reference results
v1.4 proof bundle reuse across agents         32.35% hit rate
v1.5 Wave 1 negative cache and pre-approval   40% refusal hit rate, 20/20 signed attestations
v1.5 Wave 2 mandate-driven routing            100/100 tier correctness, budget enforcement validated
v1.5 Wave 3 delta savings curve               35–47% savings on localized edits
v1.6 delta within reuse                       94.1% prefix alignment, PASS verdict
v1.7 ProofPack Reuse (100-agent GH200)        ~78% wall-clock reduction, ablation confirmed

All of this composes. Exact cache hit means no compute. Near miss means reduced compute. Cold means full compute, but with the negative cache and budget enforcement still doing their work. Every tier produces a valid attestation that external verifiers can check without trusting us.
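The three tiers above can be sketched as a small routing function. The tier names, the `prefix_overlap` input, and the 0.9 threshold are hypothetical choices for illustration; the source does not specify how a near miss is scored.

```python
# Compute cost per tier, as described in the paragraph above.
COMPUTE_FOR_TIER = {
    "exact_hit": "none",     # reuse the prior attested result
    "near_miss": "reduced",  # recompute only the part that differs
    "cold": "full",          # full inference, still budget-enforced
}


def choose_tier(request_key: str, prefix_overlap: float, exact_cache: set,
                near_miss_threshold: float = 0.9) -> str:
    """Pick a tier for one request (threshold is an assumed value)."""
    if request_key in exact_cache:
        return "exact_hit"
    if prefix_overlap >= near_miss_threshold:
        return "near_miss"
    return "cold"


cache = {"req-a"}
tiers = [
    choose_tier("req-a", 1.0, cache),
    choose_tier("req-b", 0.94, cache),
    choose_tier("req-c", 0.10, cache),
]
```

Every branch returns a tier, so every request, including a cold one, ends with a decision that can be attested.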

ProofPack Reuse

When agents encounter inputs that prior attested work already covers, ProofPack Reuse identifies the match and reuses the prior signed result instead of running inference again. The reuse decision is itself signed and auditable, so buyers see exactly when reuse fired and against which prior ProofPack.

v1.7 multi-agent benchmark on GH200: 100 agents, Qwen2.5-7B, mixed_composition workload, 77.8% wall-clock reduction. Prior-Artifact Sufficiency / ProofPack Reuse drove the gain, with full compute dropping from 2,456 to 576 and 1,880 prior artifact zero-compute decisions. Ablation validated as the sole driver.
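As a quick consistency check, the decision counts quoted above can be reproduced with plain arithmetic. The variable names are illustrative. The share computed here counts decisions, which is a different measure from the 77.8% wall-clock figure and is not expected to match it exactly.

```python
full_compute_before = 2456  # full-compute decisions without reuse
full_compute_after = 576    # full-compute decisions with ProofPack Reuse

# Decisions answered from prior artifacts instead of running inference.
zero_compute_decisions = full_compute_before - full_compute_after

# Fraction of full-compute decisions that reuse removed.
share_avoided = zero_compute_decisions / full_compute_before

print(f"{zero_compute_decisions} decisions avoided "
      f"({share_avoided:.1%} of full-compute decisions)")
```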

Separate A100 Negative Compute exact-reuse benchmarks showed approximately 59% wall-clock reduction at 100-agent scale: 59.3% on CUDA tensor workloads and 59.7% on Qwen2.5-7B local LLM inference.

These benchmarks are not the same benchmark. The GH200 result used the multi-agent mixed_composition harness; the A100 results used the Negative Compute exact_reuse harness. They should be cited separately.

ProofPack Reuse reduces compute on overlapping multi-agent workloads in structural validation. Other v1.7 mechanisms (EconomicGate, NegativeComputePolicy, WorkflowExecutor, Shape Memory Decay) are implemented and unit tested but require different workload conditions for benchmark activation.

Why Settlement Is the Real Product

The compute savings are what most infrastructure companies would lead with. We won’t.

The real wedge is settlement for agent work, because settlement has two properties compute optimization does not.

First, when agents do consequential work, settlement is not optional. Somebody has to prove the work happened, validate it under policy, and authorize the next step, whether that next step is money moving or downstream state advancing. That is the gate any consequential agent work has to pass through. Operators deploy settlement infrastructure because they cannot run auditable agent work without it.

Second, settlement is a permanent strategic position. Compute optimization is a treadmill. Today ProofPack Reuse eliminates redundant compute on overlapping workloads; within 18 months, vLLM and SGLang may well ship native improvements that compress some of that advantage. Settlement positions accumulate network effects, switching costs, and trust. Whoever becomes the default settlement layer for agent work owns that position for a decade.

AiGentsy’s compute savings are the dividend of settlement done right. Not the business of compute optimization. Everything we have validated on the compute side is evidence the settlement primitives are production-grade. That is the story.

Signed outcome receipt

When a deal reaches a recorded outcome, AiGentsy emits a portable signed receipt of that outcome. The receipt names the buyer, the seller, the deal, the lifecycle states the work moved through, and the canonical OUTCOME_RECORDED event that closed it. Where available, it also carries the supporting transparency-log anchor for that recorded outcome.

The receipt can also record two governance facts about the recorded outcome: whether the underlying proof qualified as a Governed Commercial Proof — meaning signed governance evidence was cryptographically bound to that proof — and whether acceptance policy explicitly required that proof class. These two annotations are derived from independent canonical sources at different points in the deal lifecycle.

The receipt is designed to be handed to an enterprise counterparty, an auditor, or any third party that needs portable evidence that the protocol recorded a complete commercial outcome. It proves that AiGentsy signed and recorded the encoded protocol facts. It does not prove real-world work quality, legal enforceability, or off-platform truth.

The public route at GET /protocol/deals/{deal_id}/outcome-receipt returns the signed receipt for any deal that has reached OUTCOME_RECORDED, a 409 for deals still in progress, and a 404 for unknown deals.
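A hedged sketch of a client for this route, using only the standard library. The path and the 409 / 404 meanings come from the paragraph above; the `base_url` parameter, the assumption that a successful response is HTTP 200 with a JSON body, and the outcome labels are illustrative.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def interpret_status(status: int) -> str:
    """Map the documented status codes to a caller-facing outcome."""
    if status == 200:  # assumed success code for a returned receipt
        return "receipt"
    if status == 409:
        return "in_progress"   # deal has not reached OUTCOME_RECORDED
    if status == 404:
        return "unknown_deal"
    return "unexpected"


def fetch_outcome_receipt(base_url: str, deal_id: str):
    """Return (outcome, receipt_or_None). base_url is supplied by the caller."""
    safe_id = urllib.parse.quote(deal_id, safe="")
    url = f"{base_url}/protocol/deals/{safe_id}/outcome-receipt"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = json.loads(response.read())
            return interpret_status(response.status), body
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx, so 409 and 404 arrive here.
        return interpret_status(err.code), None
```

A caller that gets `in_progress` can retry later; a returned receipt should still be checked with the offline verifier before it is relied on.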

Counterparty reliability

Counterparty reliability is driven by canonical recorded outcomes. When a deal reaches OUTCOME_RECORDED, the performing agent’s Trust Profile updates from that same protocol event.

Reliability is not a vanity metric, an engagement score, or a popularity rank. It moves only when a real commercial outcome is recorded through the protocol — and the signed receipt for that outcome remains portable.

What This Is Not

AiGentsy is not a model serving runtime. We do not replace vLLM, SGLang, or your inference infrastructure.

AiGentsy is not a payments processor. We use Stripe Connect for settlement; we do not move money ourselves.

AiGentsy is not a blockchain. We use RFC 6962 transparency logs anchored to external timestamp authorities. No tokens, no gas, no consensus latency.

AiGentsy is not an agent framework. We work with LangChain, LangGraph, AutoGen, CrewAI, OpenAI Agents, and others. We are the settlement layer underneath whatever framework you choose.

AiGentsy is not a marketplace. We provide the protocol that marketplaces build on. We are not running an agent marketplace ourselves.

What we are: the settlement protocol for autonomous agent work — the layer that packages proof, records acceptance, and authorizes the next consequence.
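For readers unfamiliar with the RFC 6962 transparency logs mentioned above, here is a minimal sketch of the Merkle tree hash and inclusion check that such logs are built on. It follows the RFC's hashing rules (0x00 prefix for leaves, 0x01 for interior nodes, split at the largest power of two below the tree size). It is a generic illustration, not AiGentsy's log implementation, and the entry contents are made up.

```python
import hashlib
from typing import List


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly less than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves: List[bytes]) -> List[bytes]:
    """Sibling hashes needed to recompute the root from one leaf."""
    if len(leaves) <= 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def verify_inclusion(index: int, size: int, leaf: bytes,
                     path: List[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf and its audit path."""
    if index >= size:
        return False
    fn, sn = index, size - 1
    r = leaf_hash(leaf)
    for p in path:
        if sn == 0:
            return False
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)
            if not (fn & 1):
                # Skip levels where this node has no right sibling.
                while not (fn & 1) and fn != 0:
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root


entries = [f"event-{i}".encode() for i in range(5)]  # made-up log entries
root = merkle_root(entries)
proof = audit_path(3, entries)
```

The proof for one entry is logarithmic in the log size, which is what lets a third party check inclusion without downloading the whole log.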

What Happens When Proof Fails

Tampered bundle. Verification returns FAILED with the specific failure mode identified — bundle hash mismatch, event chain break, signature invalid, or Merkle inclusion violation. Tampering is detectable and provable.

Missing acceptance. Settlement does not fire. Downstream consequence does not advance. Bundle exports as unaccepted with explicit acceptance status preserved.

Unsupported claim. The proof bundle may include evidence, but the mandate or acceptance rule does not authorize the claim. The result is UNCERTAIN or REJECTED with reasoning preserved in the bundle for audit.

Replay attempt. Settlement is exactly once through deal_id idempotency. Repeated attempts return the original result, not double execution.

Failure is not a bug — it is the protocol working correctly. Every failure mode produces auditable evidence that something did not pass. That signed failure is itself a verifiable artifact.
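The replay case above can be sketched with an in-memory ledger keyed by deal_id. The `SettlementLedger` class and its method names are hypothetical; a production system would need a durable store and an atomic check-and-set, which this single-process sketch does not provide.

```python
from typing import Callable, Dict


class SettlementLedger:
    """Remembers the first settlement result per deal_id (illustrative only)."""

    def __init__(self) -> None:
        self._results: Dict[str, dict] = {}
        self.executions = 0  # how many times settlement actually ran

    def settle(self, deal_id: str, execute: Callable[[], dict]) -> dict:
        # Replay: return the original result without executing again.
        if deal_id in self._results:
            return self._results[deal_id]
        result = execute()
        self.executions += 1
        self._results[deal_id] = result
        return result


ledger = SettlementLedger()
first = ledger.settle("deal-123", lambda: {"status": "SETTLED", "attempt": 1})
replay = ledger.settle("deal-123", lambda: {"status": "SETTLED", "attempt": 2})
```

The second call never runs its callback, so `replay` is the same object as `first` and the execution count stays at one.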

What Did Not Work

Three experiments did not clear validation

Shape clustering for coordinated multi-agent batching. The theory was sound: cluster similarly shaped requests and batch them together for GPU efficiency. On CUDA, coordinated batching ran 2-3x slower than naive batching. Implementation overhead killed the theoretical gains. Ceiling reached. Mechanism parked.

Cold start cache rehydration across sessions. Loading prior session cache state into new sessions. Measured impact: -0.01%. The primitive exists in the code. The economics do not. Ceiling reached.

Verifier session snapshot cache wall-clock speedup. The mechanism fires correctly: 99 of 100 signature verifications were avoided via cached policy snapshots. Wall-clock speedup on modern hardware: 0.97x, which is a slight slowdown. Ed25519 is already cheap enough that cache management overhead offsets the savings. The feature ships because it is structurally correct and pays off on constrained hardware, but we do not claim a general speedup.

We published these because we would rather have one fewer claim on the homepage and earn skeptic trust than stack every possible win and invite replication failures. If you are building infrastructure others will depend on, you publish your inconvenient data. Otherwise you are asking for faith, and faith is not a settlement primitive.

What Is Public and Reproducible Today

Everything we have claimed is in the repo. Every benchmark. Every conformance vector. Every CUDA result JSON. Every null finding. The code that produced the numbers. The verifier SDK that validates our proofs offline.

github.com/AiGentsyProtocol/aigentsy-protocol

Run the benchmarks yourself. Read the protocol specs with their evaluation pseudocode. Use examples/hello_e2e.py to settle work against our production runtime in one command. Verify our proof bundles without us in the loop.

This is how infrastructure becomes trusted. Not by announcements, but by download and verify.

What’s Next

We have zero external adopters today. That is the honest state. The validated stack matters, the compute numbers matter, the settlement primitives matter, but none of it matters at scale until agent builders run real work through us.

So here is what I am doing now: looking for the first production deployment partner. That means someone who builds agents, frameworks, agent products, or autonomous workflows, anything where work needs to be proven and settled. We will do the integration work. We will sit alongside your team. We will handle the complexity.

If you run any kind of agent system where cost, auditability, governance, or state-change accountability is starting to bite, let’s talk. Direct email works: w@aigentsy.com. Or file an issue on the repo.

Autonomous agent work is scaling fast. The settlement layer that owns it will look a lot like this. We are looking for the first partner to prove it.

Wade, Founder, AiGentsy