Raindrop Workshop: The Local AI Agent Debugger That Hit 473 Stars

Marcus Rivera
May 18, 2026

Anyone who has actually shipped an AI agent into production knows the dirty secret of the field: debugging an agent is mostly archaeology. You hand it a task, it runs for forty-five seconds, something goes sideways, and now you're scrolling through OpenTelemetry traces in some hosted dashboard trying to figure out which of seventeen tool calls produced the bad output. By the time you find it, you've lost the thread.

Raindrop, an AI observability startup, just open-sourced a tool that takes a different swing at this. It's called Workshop, it's MIT-licensed, and it runs entirely on your laptop. As of this writing, the repo at raindrop-ai/workshop has 473 stars, and the v0.1.6 release shipped on May 14, 2026.

The pitch is short: give your coding agent the power to write and run agent evals. Underneath that one-liner is a more interesting argument about where agent observability actually belongs.

What it does

Workshop is a local daemon plus a Vite-built web UI. You install it, point your agent at it, and every token, every tool call, every decision your agent makes streams into a dashboard at localhost:5899 in real time. No polling. No refreshing. No cloud account.

The technical shape is deliberately small:

  • A local SQLite database at ~/.raindrop/raindrop_workshop.db (managed via Drizzle ORM) holds every trace.
  • TypeScript on Bun runs the daemon; the repo is 92.1% TypeScript, plus a thin shell installer.
  • A single port, configurable via RAINDROP_WORKSHOP_PORT, exposes both HTTP and WebSocket.
  • MIT license, no telemetry phone-home, no required account.
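
To make those moving parts concrete, here is a minimal sketch of how a script might work out where Workshop lives on a given machine. The helper name workshop_locations is hypothetical, and treating 5899 as the fallback when RAINDROP_WORKSHOP_PORT is unset is an assumption based on the dashboard address quoted above, not something taken from the Workshop source.

```python
import os
from pathlib import Path

# Assumption: 5899 is the port used when RAINDROP_WORKSHOP_PORT is not set.
DEFAULT_PORT = 5899


def workshop_locations(env=None):
    """Return the dashboard URL, WebSocket URL, and trace DB path (hypothetical helper)."""
    env = os.environ if env is None else env
    port = int(env.get("RAINDROP_WORKSHOP_PORT", DEFAULT_PORT))
    return {
        "http": f"http://localhost:{port}",
        # Same port serves WebSocket; any URL path is not specified in the article.
        "ws": f"ws://localhost:{port}",
        "db": Path.home() / ".raindrop" / "raindrop_workshop.db",
    }
```

Calling workshop_locations() with no arguments reads the real environment; passing a dict makes it easy to check what a different port setting would produce.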

Install is one command:

curl -fsSL https://raindrop.sh/install | bash

Or, if you'd rather not pipe a script straight into bash, build from source:

git clone https://github.com/raindrop-ai/workshop.git
cd workshop
bun install
bun run dev

That's it. The daemon starts, the UI opens, and traces start flowing.

The interesting move: agents debug agents

Plenty of local trace viewers exist. Workshop's actual bet is different. Once your traces live in a SQLite file on disk, your coding agent — Claude Code, Cursor, Codex, Devin, or OpenCode — can read them directly. Not via an SDK. Not via a paid API. Just by opening the database.
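
"Just by opening the database" is meant literally. The article doesn't document Workshop's schema, so the sketch below assumes nothing about table names: it opens a SQLite file read-only and reports whatever tables and columns it finds. The function name list_tables is hypothetical; the sqlite3 calls are from the Python standard library.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Open a SQLite file read-only and return {table_name: [column, ...]}."""
    # mode=ro keeps an inspecting script from modifying the trace store.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            quoted = table.replace('"', '""')
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()
```

Pointed at ~/.raindrop/raindrop_workshop.db, this is roughly the first thing a coding agent would need to do before writing any real query.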

That unlocks a workflow Raindrop calls the self-healing eval loop:

  1. Your agent does something wrong in production.
  2. The trace lands in your local Workshop DB.
  3. You run /instrument-agent inside Claude Code.
  4. Claude reads the trace, identifies the failing assertion, writes an eval that reproduces it, runs your agent against the eval, sees the failure, patches the code, and re-runs — until every assertion passes.
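
Step 4 is a loop with a stopping condition, and it helps to see its shape written down. This is a sketch of the control flow only, not Raindrop's implementation: run_eval and apply_patch are hypothetical callables standing in for "run the agent against the eval" and "let the coding agent edit the code", and the cap on attempts is an assumption added so the loop can't spin forever.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EvalResult:
    passed: bool
    failures: List[str] = field(default_factory=list)


def heal_until_green(
    run_eval: Callable[[], EvalResult],
    apply_patch: Callable[[List[str]], None],
    max_attempts: int = 5,
) -> int:
    """Re-run the eval, patching after each failure.

    Returns the number of patches applied before the eval passed.
    Raises RuntimeError if it still fails after max_attempts patches.
    """
    for patches_applied in range(max_attempts + 1):
        result = run_eval()
        if result.passed:
            return patches_applied
        if patches_applied == max_attempts:
            break  # out of budget; do not patch again
        apply_patch(result.failures)
    raise RuntimeError(
        f"eval still failing after {max_attempts} patches: {result.failures}"
    )
```

The attempt budget is the part worth keeping in any real version: an agent that can patch and re-run indefinitely needs a point at which it hands the failure back to a human.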

In other words: the eval isn't something you painstakingly hand-author after the fact. The eval is something your coding agent derives from the trace and then refuses to let you forget. The /setup-agent-replay command goes further, scaffolding an HTTP endpoint that replays a production trace against your local agent code, so a captured failure becomes a fixture you can keep re-running.
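
The article doesn't show what /setup-agent-replay actually generates, so treat the following as an illustration of the idea rather than its output. It assumes a trace can be reduced to a JSON object with an "input" and a recorded "output" (both field names are hypothetical), and run_agent is a placeholder for your own agent's entry point.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_agent(user_input: str) -> str:
    """Stand-in for your real agent entry point (hypothetical)."""
    return user_input.strip().lower()


def replay(trace: dict) -> dict:
    """Re-run a captured input and compare it with the recorded output."""
    new_output = run_agent(trace["input"])
    recorded = trace.get("output")
    return {
        "output": new_output,
        "recorded_output": recorded,
        "matches": new_output == recorded,
    }


class ReplayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/replay":  # hypothetical route name
            self.send_error(404)
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            result = replay(json.loads(self.rfile.read(length)))
        except (ValueError, KeyError, TypeError, AttributeError):
            self.send_error(400, "expected a JSON object with an 'input' string")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 0) -> HTTPServer:
    """Bind to loopback only; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), ReplayHandler)
```

Calling make_server(8787).serve_forever() would start it on a port of your choosing. Keeping replay as a plain function, separate from the HTTP handler, means the same captured trace can also be run directly from a test without a server in the way.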

This is the inversion most agent shops haven't made yet. Observability data, in the cloud-SaaS model, is read by humans. Workshop assumes the primary reader is a coding agent.

Compatibility surface

Workshop covers an unusually broad set of stacks for a v0.1 project. From the README:

  • Languages: TypeScript, Python, Go, Rust
  • Agent SDKs: Vercel AI SDK, OpenAI Agents SDK, Anthropic SDK, Claude Agent SDK, LangChain, LangGraph, CrewAI, Mastra, Pydantic AI, DSPy, Google ADK, Strands, Agno, Deep Agents
  • Providers: AWS Bedrock, Azure OpenAI, Vertex AI
  • Coding agents: Claude Code, Codex, Devin, Cursor, OpenCode

The provider list matters because it covers the three clouds where enterprises actually run their inference. The SDK list matters because it includes most of the frameworks currently competing to become the AI-agent default. Workshop takes no side in the framework war, which is the right call for a piece of infrastructure.

How it compares to what was already out there

Tool              | Hosted?            | License             | Primary consumer of traces
LangSmith         | Cloud-hosted       | Proprietary         | Humans (dashboard)
Langfuse          | Self-host or cloud | MIT                 | Humans (dashboard)
Arize Phoenix     | Self-host or cloud | Elastic License 2.0 | Humans (dashboard)
Raindrop Workshop | Local only         | MIT                 | Coding agent (via SQLite)

Langfuse and Phoenix already cover the self-hostable agent observability niche, Langfuse under MIT and Phoenix under the source-available Elastic License 2.0. What Workshop adds is the explicit assumption that the consumer of your traces is another agent, plus the practical decision to skip the cloud tier entirely. There's no signup, no upgrade prompt, no rate-limited free plan. That's a clean fit with the way coding agents have started touching the local filesystem.

What's missing

Workshop is v0.1.6. The repo has 21 commits at the time of writing and a single primary contributor visible on the contributors page. That's worth keeping in mind before you bet a production observability pipeline on it.

A few practical gaps:

  • No multi-user view — by design. Workshop is a single-developer tool. If your team needs shared trace storage, Raindrop's hosted product is the obvious upgrade path.
  • No alerting or anomaly detection. You're scanning the UI or pointing your coding agent at the DB. That's the workflow.
  • Schema is young. A SQLite trace store is great for local hacking, but the schema will evolve fast. Anyone writing custom queries against raindrop_workshop.db should pin a release.
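
Pinning a release is one defence against a moving schema; failing loudly is another. Here is a minimal sketch of a guard that custom query scripts could run first. The table and column names in EXPECTED are hypothetical placeholders for whatever your own queries depend on, not Workshop's real schema.

```python
import sqlite3

# Hypothetical: the tables and columns your own queries rely on.
EXPECTED = {"spans": {"id", "trace_id", "name"}}


def missing_columns(conn, expected):
    """Return {table: set of missing columns} for anything the DB no longer provides."""
    problems = {}
    for table, wanted in expected.items():
        quoted = table.replace('"', '""')
        # An absent table yields no rows, so every wanted column counts as missing.
        present = {
            row[1] for row in conn.execute(f'PRAGMA table_info("{quoted}")')
        }
        missing = set(wanted) - present
        if missing:
            problems[table] = missing
    return problems
```

An empty result means the queries should still run; anything else is a signal to check the release notes before trusting the numbers.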

None of these are dealbreakers. They're the standard rough edges of a v0.x release that just dropped.

The strategic read

Raindrop is doing the same thing PostHog did with product analytics and Sentry did with error tracking: lead with a generous open-source local tool, then upsell to the hosted product when you outgrow it. The hosted Raindrop product handles team observability, retention, and dashboards. Workshop is the loss leader that gets you in the door.

But the more interesting subtext is that Workshop is what agent observability looks like when you assume the user has a coding agent in front of them. The traces aren't decorative. They're inputs to the next coding step. That assumption is going to age very well.

The Bottom Line

If you're shipping an AI agent and you don't have a debugger in front of you, Raindrop Workshop is the lowest-friction option I've seen. One install command, no cloud account, MIT-licensed, and, crucially, designed so your coding agent can read the traces as easily as you can. The 473 stars are early (this dropped four days ago), but the architecture is the right shape: local-first, framework-agnostic, and built around the assumption that your editor is now an agent too.

If you already use Claude Code or Cursor in your daily loop, install Workshop tonight and run /instrument-agent on whichever agent project is currently haunting you. The point isn't to replace LangSmith. The point is that your evals shouldn't need a human to write them anymore.