Question 1

What is LLM Workbench?

Accepted Answer

LLM Workbench turns every run of your LLM agent into a tamper-evident, model-agnostic, human-gated bundle of trace events, artifacts, gates, and cost that is signed, exportable, and replayable. Instead of opaque API calls scattered across logs, each run becomes a self-contained record you own.

Question 2

How is this different from LangSmith, Langfuse, or Helicone?

Accepted Answer

Those are hosted observability dashboards, so your telemetry lives in their database. LLM Workbench is protocol-first: each run is a self-contained, cryptographically signed bundle (with a SHA-256 integrity hash) that you can export, verify, and replay anywhere. Human approval gates and run replay/fork are first-class, not add-ons.

Question 3

What's a "run bundle"?

Accepted Answer

One portable artifact capturing a whole run — the workflow, every trace event (model I/O, tool calls, gate decisions), the artifacts produced, the rule set, token usage and cost — plus an integrity hash so you can prove it wasn't altered.
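
To make the integrity hash concrete, here is a minimal sketch of sealing and verifying a bundle with SHA-256 in Node.js. The `RunBundle` shape, the field names, and the `seal`/`verify` helpers are assumptions made for illustration; the actual bundle schema and hashing rules may differ. A bare hash only shows that the content changed, and a real signature would additionally need a key.

```typescript
import { createHash } from "node:crypto";

// Hypothetical bundle shape for illustration; the real schema may differ.
interface TraceEvent {
  step: number;
  kind: "model_io" | "tool_call" | "gate_decision";
  payload: unknown;
}

interface RunBundle {
  workflow: string;
  events: TraceEvent[];
  artifacts: string[];
  usage: { inputTokens: number; outputTokens: number; costUsd: number };
  integrity?: string; // hex SHA-256 over everything except this field
}

// Serialize with sorted object keys so identical content always yields
// the same string, regardless of property insertion order.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj)
      .filter((k) => obj[k] !== undefined)
      .sort();
    return `{${keys
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
      .join(",")}}`;
  }
  return JSON.stringify(value);
}

// Hash the bundle content, excluding any existing integrity field.
function hashBundle(bundle: RunBundle): string {
  const { integrity: _ignored, ...content } = bundle;
  return createHash("sha256").update(canonicalize(content)).digest("hex");
}

// Attach the hash to a copy of the bundle.
function seal(bundle: RunBundle): RunBundle {
  return { ...bundle, integrity: hashBundle(bundle) };
}

// Recompute the hash and compare it with the stored one.
function verify(bundle: RunBundle): boolean {
  return bundle.integrity === hashBundle(bundle);
}
```

Changing any field after sealing (for example the `workflow` name) makes `verify` return `false`, which is the property the answer above describes.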

Question 4

How do I add it to my code?

Accepted Answer

One import. Swap `generateText` for `tracedGenerateText` from `@llm-workbench/ai-sdk`, pass a session handle, and every call emits trace events, spans, artifacts, and cost automatically — your returned result is unchanged.
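
The exact signature of `tracedGenerateText` is not given here, so the following does not show the library's API. It is a self-contained sketch of the wrapper pattern the answer describes: a call is wrapped, trace events are appended to a session, and the result comes back unchanged. The names `Session`, `traced`, and `fakeGenerate` are hypothetical.

```typescript
// Hypothetical types; none of these names come from @llm-workbench/ai-sdk.
interface TraceEvent {
  type: "call_start" | "call_end";
  at: number; // epoch milliseconds
  data: unknown;
}

interface Session {
  events: TraceEvent[];
}

type GenerateFn<A, R> = (args: A) => Promise<R>;

// Wrap any async generate function so each call appends trace events to
// the session and then returns the original result untouched.
function traced<A, R>(
  generate: GenerateFn<A, R>,
  session: Session
): GenerateFn<A, R> {
  return async (args: A): Promise<R> => {
    session.events.push({ type: "call_start", at: Date.now(), data: args });
    const result = await generate(args);
    session.events.push({ type: "call_end", at: Date.now(), data: result });
    return result;
  };
}

// Stand-in for a model call so the sketch runs without network access.
async function fakeGenerate(args: { prompt: string }): Promise<{ text: string }> {
  return { text: `echo: ${args.prompt}` };
}
```

With this shape, `traced(fakeGenerate, session)` is a drop-in replacement for `fakeGenerate`: callers see the same return value, and the session accumulates one start and one end event per call.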

Question 5

Which models and providers does it support?

Accepted Answer

Model-agnostic — anything you call through the Vercel AI SDK (OpenAI, Anthropic, others). The bundle records provider/model per step, so one run can span multiple models with a single unified trace.
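
Because provider and model are recorded per step, a multi-model run can be summarized after the fact. The sketch below groups usage by `provider/model`; the `StepUsage` shape and the `usageByModel` helper are assumptions for illustration, not the bundle's actual schema.

```typescript
// Hypothetical per-step usage record; field names are assumptions.
interface StepUsage {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

interface Totals {
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Sum tokens and cost for each provider/model pair seen in the run.
function usageByModel(steps: StepUsage[]): Map<string, Totals> {
  const totals = new Map<string, Totals>();
  for (const s of steps) {
    const key = `${s.provider}/${s.model}`;
    const t = totals.get(key) ?? { inputTokens: 0, outputTokens: 0, costUsd: 0 };
    t.inputTokens += s.inputTokens;
    t.outputTokens += s.outputTokens;
    t.costUsd += s.costUsd;
    totals.set(key, t);
  }
  return totals;
}
```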

Question 6

What are "human gates"?

Accepted Answer

Policy-defined pause points (PAUSE_BEFORE, PAUSE_AFTER, CHECKPOINT) where a run halts for a human to approve, reject, or edit before continuing — and the decision is recorded in the bundle.
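
As a rough illustration of how a gate could behave, the sketch below runs one step behind a `PAUSE_AFTER` gate, waits for a reviewer, records the decision, and then continues or stops. Only the three gate names come from the answer above; `Decision`, `Reviewer`, `GateRecord`, and `runGatedStep` are hypothetical and do not reflect the real policy engine.

```typescript
type GateKind = "PAUSE_BEFORE" | "PAUSE_AFTER" | "CHECKPOINT";

// The three outcomes a human can choose at a gate.
type Decision =
  | { action: "approve" }
  | { action: "reject"; reason: string }
  | { action: "edit"; value: string };

// What gets written to the run record for each gate.
interface GateRecord {
  step: string;
  kind: GateKind;
  decision: Decision;
}

// A reviewer inspects the pending value and returns a decision.
type Reviewer = (step: string, kind: GateKind, value: string) => Promise<Decision>;

// Produce the step output, pause for review, log the decision, then
// continue with the approved or edited value (or stop on rejection).
async function runGatedStep(
  step: string,
  produce: () => Promise<string>,
  review: Reviewer,
  log: GateRecord[]
): Promise<string> {
  const output = await produce();
  const decision = await review(step, "PAUSE_AFTER", output);
  log.push({ step, kind: "PAUSE_AFTER", decision });
  switch (decision.action) {
    case "approve":
      return output;
    case "edit":
      return decision.value;
    case "reject":
      throw new Error(`Step "${step}" rejected: ${decision.reason}`);
  }
}
```

The decision is logged before the outcome is acted on, so a rejection still leaves a record behind.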

Question 7

Can I replay or fork a run?

Accepted Answer

Yes — the signed bundle lets you replay a run deterministically, or fork from any step to explore a different path, with full lineage tracked.
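
One way to picture forking with lineage: the new run copies the source steps up to the fork point and stores a pointer back to its parent. The `Run` shape and the `forkRun` and `lineage` helpers below are assumptions for illustration; the bundle's actual lineage format is not described here.

```typescript
interface Step {
  index: number;
  output: string;
}

interface Run {
  id: string;
  // Present only on forks: which run this came from, and at which step.
  parent?: { runId: string; atStep: number };
  steps: Step[];
}

// Copy the steps up to and including `atStep`, and record the origin.
function forkRun(source: Run, atStep: number, newId: string): Run {
  if (atStep < 0 || atStep >= source.steps.length) {
    throw new RangeError(`Step ${atStep} is outside run ${source.id}`);
  }
  return {
    id: newId,
    parent: { runId: source.id, atStep },
    steps: source.steps.slice(0, atStep + 1).map((s) => ({ ...s })),
  };
}

// Walk parent links to recover the lineage, oldest ancestor first.
function lineage(run: Run, byId: Map<string, Run>): string[] {
  const chain: string[] = [];
  let current: Run | undefined = run;
  while (current) {
    chain.unshift(current.id);
    current = current.parent ? byId.get(current.parent.runId) : undefined;
  }
  return chain;
}
```

The fork's steps are copies, so continuing the fork down a different path never mutates the original run.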

Question 8

Where does my data go? Is it private?

Accepted Answer

The public demo runs entirely in your browser — no account, no persistence. Authenticated runs persist to your own database, and because every run is an exportable bundle, you're never locked in.

Question 9

Is it open source? Is it a product?

Accepted Answer

Both. LLM Workbench is open source under the MIT license — the full source is on GitHub and the core libraries are on npm under the @llm-workbench scope. You can also use it instantly via the live demo and playground, no setup required.

Question 10

How do I try it?

Accepted Answer

Hit "View a demo run" at /runs/demo — no sign-up, it rotates through seeded agent runs. Sign in to open the playground and build your own.

Frequently asked questions