An agentic lab, run by its own agents

Future labs are run by agents. We’re building one.

We don’t study agentic organizations. We are one.

Kalmantic is a working organization of AI agents and three humans. The agents carry persistent memory and run on a schedule. They do the research, draft the papers, run the briefings, and ship the products.

See the products · Read the research

jusCode is live · Five books on Amazon · 12 agents, 3 humans

What we are

An agentic lab, run by its own agents.

A fleet of agents works alongside three humans. The agents carry persistent memory, run on scheduled heartbeats, and call an inference endpoint we built ourselves, so we are nobody’s tenant.

One engine, several bets. The lab compounds, and each product harvests from the same engine. We don’t study how agents work inside an organization. We are the organization. Every product on this site was built by the same agent fleet we write papers about.

Products we ship

Software built by the agent fleet, then certified to ship. jusCode is live today.

Open source

The benchmarks and tools underneath the work, published in the open on GitHub.

Writing

The research and the books that explain why any of it matters, with the receipts attached.

Products · What we ship

jusCode

Live

A certification for agentic engineers. A developer proves they can direct, correct, and check agent work, and walks away with a credential an employer believes.

$25 an attempt · Live with Upekkha portfolio companies

Go to jusCode

jusFactory

In build

Certify the engineer, then the agent, then the code. The end state is trusted code shipping by default instead of a human re-reading every change. jusCode is rung one. The rest is in build, in public.

The roadmap on which jusCode is the first step

Go to jusFactory · Read the thesis

jusInfer

Private beta

The inference endpoint we run ourselves, so we are nobody’s tenant. Fast, affordable tokens for agents and applications.

Powers the lab and everything we ship

Go to jusInfer

The thesis

The model is the easy part.

Model prices fell about 95% in eighteen months. The model is interchangeable. What is not interchangeable is everything the agent accumulates inside your organization: its memory, its judgment, its record of what it got right and what it got wrong. That is the harness, and it compounds every day an agent runs.

It is also where trust lives. A model cannot vouch for its own output. Something above it has to. That something is what we build.

Research · We publish what breaks

We measure what production AI actually gets wrong, not what it scores on synthetic evals. Two research areas, both open.

Inference

Why AI systems get expensive and slow as usage grows. We ship tools for it. PeakInfer catches cost and performance problems before production. PeakWeights compresses models without losing output quality.

PeakInfer · PeakWeights

Claw + Hermes agent harness

The layer above the model: an agent’s memory, judgment, and accumulated context. We research what makes a harness compound and what makes it brittle.

Claw · Hermes

Underneath both, the benchmarks: legacy code, security tooling, regulated workflows. We test what production breaks on, not what it passes.

View all research · View on GitHub

Writing · We think in public

Authority gets built in the open, over time.

RK on AI, the show. The Substack. Kalmantic Press, five books and counting. Each piece cites the last, and the labs and the chip makers check the work. That is how authority on AI deployment gets built: with the receipts attached.

Watch RK on AI · Kalmantic Press

Books on Amazon

Peak Inference: Infra Economics of AI Inference
What Is Your NemoClaw Strategy?
How to Be an Agentic Operator
The Model Is the Easy Part: Harness Engineering
Agentic Enterprise

Built and run by our own agents.

The lab is open. See what the agents shipped, or read the work that explains why it matters.

See the products · Meet the team