AiGentsy Stack

Provable. Payable. Institutional. The family of institutional graphs for autonomous agent work. Compute savings, including ProofPack Reuse, are byproducts of getting settlement right.

What shipped today: the full public release of the AiGentsy Stack protocol layer. Twelve institutional primitives, four SDKs, fourteen framework integrations, and CUDA-validated benchmark results backing every compute-savings claim. Every number is reproducible. Every null finding is published alongside the wins.

The Thesis

AiGentsy Stack is the settlement protocol for agent work. Compute savings are a byproduct of getting settlement right, not the business.

That is not a slogan. It is what the architecture forces. If you build the primitives required to settle agent work correctly (mandate-graph authorization, cryptographic proof of work, attested governance, policy-driven compute), you end up with a stack that amortizes inference across agents for free. The compute wins are not the business. They are evidence the settlement primitives work.

What This Repo Contains

Over the last 48 hours we validated the full stack on CUDA, shipped the code public, and published every number with reproducible artifacts. Here is what exists today.

Twelve institutional primitives

01. HoverStack: Compute governance
02. Mandate Graph: Who was allowed to do what
03. ProofPack / GEP: Proof of work, acceptance, downstream gating
04. Coordination Graph: Multi-agent dependency
05. Value Flow Graph: Allocation and release conditions
06. Trust Profile: Accumulated reliability
07. Lineage Graph: Descent and recursive accountability
08. Offer/Intent Graph: Transaction intent
09. Consequence Graph: Authorized downstream state changes
10. Capability/Resource Graph: What agents can actually do
11. Agreement Graph: Explicit accepted commitments
12. Organization Graph: Durable multi-agent organizations

Each primitive is a signed portable artifact. Offline verifiable. No blockchain required, no gas fees, and no token dependency.
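To make "signed portable artifact, offline verifiable" concrete, here is a minimal sketch of the idea: canonical serialization, a content digest, and a detached signature checked without any network call. All names are illustrative, and stdlib HMAC stands in for the asymmetric Ed25519 signatures the protocol actually uses, so this shows the shape of the check, not the real SDK.

```python
import hashlib
import hmac
import json

def canonical_bytes(artifact: dict) -> bytes:
    # Deterministic serialization: sorted keys, no whitespace,
    # so the same artifact always hashes to the same digest.
    return json.dumps(artifact, sort_keys=True, separators=(",", ":")).encode()

def sign_artifact(artifact: dict, key: bytes) -> dict:
    # Stand-in signature: HMAC-SHA256 over the canonical bytes.
    # The real protocol uses asymmetric Ed25519 signatures instead.
    body = canonical_bytes(artifact)
    return {
        "artifact": artifact,
        "digest": hashlib.sha256(body).hexdigest(),
        "sig": hmac.new(key, body, hashlib.sha256).hexdigest(),
    }

def verify_artifact(signed: dict, key: bytes) -> bool:
    # Offline check: everything needed to verify travels with the artifact.
    body = canonical_bytes(signed["artifact"])
    return (hashlib.sha256(body).hexdigest() == signed["digest"]
            and hmac.compare_digest(
                hmac.new(key, body, hashlib.sha256).hexdigest(),
                signed["sig"]))

key = b"demo-key"
bundle = sign_artifact({"type": "mandate", "agent": "a1", "scope": ["read"]}, key)
assert verify_artifact(bundle, key)
```

The point of the canonical-bytes step is that any verifier, on any machine, re-derives the exact same digest from the artifact payload alone.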

About HoverStack

HoverStack is the compute governance layer above your inference runtime. It sits between agent decisions and model calls, signing attestations of every consequential compute decision — what got computed, what got reused under prior signed evidence, what got refused under policy, and what got deferred.
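The four decision types above can be sketched as a minimal attestation record. This is a toy model under stated assumptions, not the SDK's actual API; in the real system each record would itself be signed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional
import hashlib
import time

class Decision(Enum):
    COMPUTED = "computed"   # fresh inference ran
    REUSED = "reused"       # covered by prior signed evidence
    REFUSED = "refused"     # blocked under policy
    DEFERRED = "deferred"   # postponed, e.g. until budget clears

@dataclass
class Attestation:
    decision: str
    input_digest: str            # ties the decision to the exact input
    evidence_ref: Optional[str]  # prior ProofPack id when reused
    issued_at: float

def attest(decision: Decision, payload: bytes,
           evidence_ref: Optional[str] = None) -> Attestation:
    # One record per consequential compute decision.
    return Attestation(decision.value,
                       hashlib.sha256(payload).hexdigest(),
                       evidence_ref,
                       time.time())

record = attest(Decision.REUSED, b"summarize: Q3 report", evidence_ref="pp-001")
```

Binding the decision to an input digest is what lets an auditor later ask "what exactly was reused, and against which prior evidence?"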

ProofPack Reuse is one HoverStack mechanism, benchmark-validated. EconomicGate, NegativeComputePolicy, WorkflowExecutor, and Shape Memory Decay are additional mechanisms — implemented and unit tested, awaiting workload-specific validation.

HoverStack runs alongside your inference stack. It does not replace vLLM, SGLang, or your model serving layer. It governs the decisions about what those layers do.

The rest of what ships

Four SDKs. Python client, JavaScript client, standalone offline verifier on PyPI as aigentsy-verify, and a LangGraph native package.

Fourteen framework integrations. LangChain, LangGraph, LlamaIndex, AutoGen, CrewAI, OpenAI Agents, Vercel AI, MCP server, JavaScript SDK, standalone verifier, LangSmith, Langfuse, n8n/Zapier/Make, and a marketplace adapter. Pick your framework, integrate in minutes.

Conformance surface. Portable test vectors and a settlement conformance suite that external implementers can run against their own implementations.
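A conformance run is conceptually simple: pair each input bundle with the verdict a conforming verifier must produce, then diff your implementation against the expected verdicts. The vector format and verifier below are illustrative sketches, not the repo's actual schema.

```python
import json

# Hypothetical vector format: each entry pairs an input bundle with the
# verdict a conforming verifier must produce.
VECTORS = json.loads("""[
  {"name": "valid_bundle",    "bundle": {"sig_ok": true,  "policy_ok": true},  "expect": "PASS"},
  {"name": "bad_signature",   "bundle": {"sig_ok": false, "policy_ok": true},  "expect": "FAIL"},
  {"name": "policy_violation","bundle": {"sig_ok": true,  "policy_ok": false}, "expect": "FAIL"}
]""")

def my_verifier(bundle: dict) -> str:
    # The implementation under test: a conforming verifier passes a bundle
    # only when both the signature and policy checks hold.
    return "PASS" if bundle["sig_ok"] and bundle["policy_ok"] else "FAIL"

failures = [v["name"] for v in VECTORS if my_verifier(v["bundle"]) != v["expect"]]
print("conformant" if not failures else f"failed: {failures}")  # prints "conformant"
```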

One command demo. python examples/hello_e2e.py runs full settlement against the production runtime. Zero manual setup. Zero API keys.

Fourteen Domain Templates

Beyond the universal protocol, AiGentsy ships kit demos for specific verticals.

State-change kits demonstrate settlement that triggers downstream action: aerospace mission-critical authorization, robotics assembly verification, build/deploy gating, video generation pipelines, and agent-to-agent handoffs.

Payment kits demonstrate settlement that triggers a single authorized payment: research summaries, code generation, document extraction, financial analysis, medical admin, support resolution, procurement, compliance certificates, and data pipeline enrichment.

Each kit demonstrates the full lifecycle — proof, verification, acceptance, settlement — with evidence specific to that domain and the correct consequence type for that work. Same protocol, different consequences.
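The lifecycle shared by every kit (proof, verification, acceptance, settlement) can be sketched as a single gate. All names here are hypothetical, a minimal illustration of the control flow rather than the protocol's real interfaces.

```python
from dataclasses import dataclass

@dataclass
class Proof:
    work_id: str
    output_digest: str  # evidence the work produced a specific output

class Policy:
    def __init__(self, allowed_work_ids):
        self.allowed = set(allowed_work_ids)

    def accepts(self, proof: Proof) -> bool:
        return proof.work_id in self.allowed

def settle(proof: Proof, policy: Policy, authorize_next):
    # 1. the proof asserts the work happened;
    # 2. validate it under policy;
    # 3. only then fire the consequence (a payment or a state change).
    if not policy.accepts(proof):
        return {"status": "refused", "work_id": proof.work_id}
    return {"status": "settled", "consequence": authorize_next(proof)}

policy = Policy(["job-1"])
ok = settle(Proof("job-1", "abc123"), policy, lambda p: f"pay:{p.work_id}")
no = settle(Proof("job-2", "def456"), policy, lambda p: f"pay:{p.work_id}")
# ok settles with consequence "pay:job-1"; no is refused, nothing fires
```

Swapping the `authorize_next` callback is what makes the same gate serve both payment kits and state-change kits.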

See them all: aigentsy.com/builders

The Compute Savings

On realistic agent workloads, CUDA-validated on Qwen2.5-7B:

v1.3 paraphrased recall and governance: Published reference results
v1.4 proof bundle reuse across agents: 32.35% hit rate
v1.5 Wave 1, negative cache and pre-approval: 40% refusal hit rate, 20/20 signed attestations
v1.5 Wave 2, mandate-driven routing: 100/100 tier correctness, budget enforcement validated
v1.5 Wave 3, delta savings curve: 35-47% savings on localized edits
v1.6 delta within reuse: 94.1% prefix alignment, PASS verdict
v1.7 ProofPack Reuse (100-agent GH200): ~78% wall-clock reduction, ablation confirmed

All of this composes. Exact cache hit means no compute. Near miss means reduced compute. Cold means full compute, but with the negative cache and budget enforcement still doing their work. Every tier produces a valid attestation that external verifiers can check without trusting us.

ProofPack Reuse

When agents encounter inputs that prior attested work already covers, ProofPack Reuse identifies the match and reuses the prior signed result instead of running inference again. The reuse decision is itself signed and auditable, so buyers see exactly when reuse fired and against which prior ProofPack.

v1.7 multi-agent benchmark on GH200: 100 agents, Qwen2.5-7B, mixed_composition workload, 77.8% wall-clock reduction. Prior-Artifact Sufficiency / ProofPack Reuse drove the gain: full-compute calls dropped from 2,456 to 576, with 1,880 zero-compute decisions served from prior artifacts. Ablation validated it as the sole driver.

Separate A100 Negative Compute exact-reuse benchmarks showed approximately 59% wall-clock reduction at 100-agent scale: 59.3% on CUDA tensor workloads and 59.7% on Qwen2.5-7B local LLM inference.

These benchmarks are not the same benchmark. The GH200 result used the multi-agent mixed_composition harness; the A100 results used the Negative Compute exact_reuse harness. They should be cited separately.

ProofPack Reuse reduces compute on overlapping multi-agent workloads in structural validation. Other v1.7 mechanisms (EconomicGate, NegativeComputePolicy, WorkflowExecutor, Shape Memory Decay) are implemented and unit tested but require different workload conditions for benchmark activation.

Why Settlement Is the Real Product

The compute savings are what most infrastructure companies would lead with. We won’t.

The real wedge is settlement for agent work, because settlement has two properties compute optimization does not.

First, when agents do consequential work, settlement is not optional. Somebody has to prove the work happened, validate it under policy, and authorize the next step, whether that next step is money moving or downstream state advancing. That is the gate any consequential agent work has to pass through. Operators deploy settlement infrastructure because they cannot run auditable agent work without it.

Second, settlement is a permanent strategic position. Compute optimization is a treadmill. Today ProofPack Reuse eliminates redundant compute on overlapping workloads; in 18 months vLLM and SGLang ship native improvements that compress some of that advantage. Settlement positions accumulate network effects, switching costs, and trust. Whoever becomes the default settlement layer for agent work owns that position for a decade.

AiGentsy’s compute savings are the dividend of settlement done right. Not the business of compute optimization. Everything we have validated on the compute side is evidence the settlement primitives are production-grade. That is the story.

What This Is Not

AiGentsy is not a model serving runtime. We do not replace vLLM, SGLang, or your inference infrastructure.

AiGentsy is not a payments processor. We use Stripe Connect for settlement; we do not move money ourselves.

AiGentsy is not a blockchain. We use RFC 6962 transparency logs anchored to external timestamp authorities. No tokens, no gas, no consensus latency.
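For readers unfamiliar with RFC 6962, its Merkle tree construction is small enough to show directly: leaves and interior nodes are domain-separated with one-byte prefixes, and any auditor can recompute the root from the leaves alone. The log entries here are illustrative placeholders.

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962 leaf hash: SHA-256 over a 0x00 prefix plus the entry bytes.
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 interior node: SHA-256 over a 0x01 prefix plus both children.
    return hashlib.sha256(b"\x01" + left + right).digest()

# Root of a two-leaf log. The 0x00/0x01 domain separation prevents an
# interior node from being passed off as a leaf (or vice versa).
root = node_hash(leaf_hash(b"proof-bundle-1"), leaf_hash(b"proof-bundle-2"))
print(root.hex())
```

Anchoring such roots to external timestamp authorities is what gives the log its tamper-evidence without any token or consensus mechanism.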

AiGentsy is not an agent framework. We work with LangChain, LangGraph, AutoGen, CrewAI, OpenAI Agents, and others. We are the settlement layer underneath whatever framework you choose.

AiGentsy is not a marketplace. We provide the protocol that marketplaces build on. We are not running an agent marketplace ourselves.

What we are: the settlement protocol for autonomous agent work — the layer that packages proof, records acceptance, and authorizes the next consequence.

What Did Not Work

Three experiments did not clear validation.

Shape clustering for coordinated multi-agent batching. The theory was sound: cluster similar-shaped requests and batch them together for GPU efficiency. On CUDA, coordinated batching ran 2-3x slower than naive batching; implementation overhead killed the theoretical gains. Ceiling reached. Mechanism parked.

Cold start cache rehydration across sessions. Loading prior session cache state into new sessions. Measured impact: -0.01%. The primitive exists in the code. The economics do not. Ceiling reached.

Verifier session snapshot cache wall-clock speedup. The mechanism fires correctly: 99/100 signature verifications were avoided via cached policy snapshots. Wall-clock speedup on modern hardware: 0.97x. Ed25519 verification is already cheap enough that cache-management overhead offsets the savings. The feature ships because it is structurally correct and pays off on constrained hardware, but we do not claim a general speedup.

We published these because we would rather have one fewer claim on the homepage and earn skeptic trust than stack every possible win and invite replication failures. If you are building infrastructure others will depend on, you publish your inconvenient data. Otherwise you are asking for faith, and faith is not a settlement primitive.

What Is Public and Reproducible Today

Everything we have claimed is in the repo. Every benchmark. Every conformance vector. Every CUDA result JSON. Every null finding. The code that produced the numbers. The verifier SDK that validates our proofs offline.

github.com/AiGentsyProtocol/aigentsy-protocol

Run the benchmarks yourself. Read the protocol specs with their evaluation pseudocode. Use examples/hello_e2e.py to settle work against our production runtime in one command. Verify our proof bundles without us in the loop.

This is how infrastructure becomes trusted. Not by announcements, but by download and verify.

What’s Next

We have zero external adopters today. That is the honest state. The validated stack matters, the compute numbers matter, the settlement primitives matter, but none of it matters at scale until agent builders run real work through us.

So here is what I am doing now. Looking for the first production deployment partner. Someone who builds agents, frameworks, agent products, autonomous workflows, anything where work needs to be proven and settled. We will do the integration work. We will sit alongside your team. We will handle the complexity.

If you run any kind of agent system where cost, auditability, governance, or state-change accountability is starting to bite, let’s talk. Direct email works: w@aigentsy.com. Or file an issue on the repo.

Autonomous agent work is scaling fast. The settlement layer that owns it will look a lot like this. We are looking for the first partner to prove it.

— Wade
Founder, AiGentsy