# Chucky: Vercel for Claude Code

TL;DR: We built a managed platform for deploying Claude Code as a production API. One command to deploy, JWT tokens with embedded budgets for per-user billing, and a "cloud brain, local hands" architecture where Claude reasons in the cloud but tools execute locally. No infrastructure to manage. No billing code to write.
## The Runtime Gap Nobody's Talking About
Here's the weird state of AI infrastructure in 2025:
- Model providers (Anthropic, OpenAI) give you raw intelligence via APIs
- Frameworks (LangChain, CrewAI) help you build agent logic
- But where does a built agent actually run in production?
This is the "Runtime Gap." Claude Code runs in your terminal. Moving that to a multi-user production service requires containerization, sandboxing, state management, auth, and—the big one—billing.
We call this the "Stochastic Billing" problem. An agent enters a reasoning loop. It might burn 5,000 tokens ($0.08) or 500,000 tokens ($8.00) depending on the task. You can't price your SaaS at $20/month if one power user can burn $50 of compute in a day.
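To make that spread concrete, here's a back-of-the-envelope cost function. The flat $16-per-million-token rate is an assumption for illustration; real Anthropic pricing varies by model and by input vs. output tokens.

```typescript
// Hypothetical flat rate for illustration only; real pricing
// differs by model and by input vs. output tokens.
const DOLLARS_PER_MILLION_TOKENS = 16;

function estimateCost(tokens: number): number {
  return (tokens / 1_000_000) * DOLLARS_PER_MILLION_TOKENS;
}

console.log(estimateCost(5_000));   // ≈ $0.08, a quick task
console.log(estimateCost(500_000)); // ≈ $8.00, a long reasoning loop
```

A 100x spread per request is what makes flat-rate SaaS pricing so dangerous here.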
Every developer forum, every GitHub issue, every Discord server we lurked in had the same complaint: "How do I bill my users for AI costs without losing money or building Stripe metering from scratch?"
So we built Chucky.
## What It Actually Is
Chucky is the deployment layer for Claude Code. Think Vercel, but for AI agents instead of frontends.
```bash
npx @chucky.cloud/cli deploy
```
That's it. You get a production endpoint. Your workspace, your MCP servers, your tools—all deployed.
Then you generate tokens with embedded budgets:
```js
const token = createToken({
  userId: "user_123",
  budget: { aiDollars: 5, window: "day" }
});
```
When that user hits their limit, the agent stops. No surprise bills. No metering code. No Stripe webhooks parsing token counts at 3am.
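Under the hood, a budget like this can ride along as a claim in the token payload, so the server can check the cap before forwarding anything to the model. Here's a minimal sketch of an HMAC-signed token with an embedded budget claim — the claim names and this `createToken` body are assumptions for illustration, not Chucky's actual implementation:

```typescript
import { createHmac } from "node:crypto";

// Hypothetical claim shape; Chucky's real token format may differ.
interface BudgetClaims {
  userId: string;
  budget: { aiDollars: number; window: "day" | "month" };
}

const b64url = (s: string) => Buffer.from(s).toString("base64url");

function createToken(claims: BudgetClaims, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify(claims));
  const sig = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  return `${header}.${payload}.${sig}`;
}

const token = createToken(
  { userId: "user_123", budget: { aiDollars: 5, window: "day" } },
  "dev-secret"
);

// The server decodes the payload and enforces the cap per request.
const claims = JSON.parse(
  Buffer.from(token.split(".")[1], "base64url").toString()
);
console.log(claims.budget.aiDollars); // 5
```

Because the budget is signed into the token, the enforcement point never needs a database lookup to know a user's cap.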
This is the feature nobody else has. We checked. Vercel AI SDK, LangChain, Modal, AWS Bedrock, OpenAI Assistants—none of them have built-in end-user billing. It's the single most requested feature across every platform, and everyone punts on it.
## Cloud Brain, Local Hands
The architecture is what makes this interesting. Claude's reasoning happens in our cloud (secure, sandboxed, scalable). But tool execution happens wherever you want:
In the browser (for extensions and web apps):
```js
const domTool = browserTool({
  name: "click_element",
  execute: async ({ selector }) => {
    document.querySelector(selector)?.click();
  }
});
```
On your server (for backend integrations):
```js
const dbTool = tool({
  name: "query",
  execute: async ({ sql }) => db.query(sql)
});
```
On your machine (for CLI automation)—this is where it gets interesting.
## Possession Mode
One command unlocks AI-powered automation for your entire workflow. Claude reasons in the cloud. Your machine executes.
```bash
$ chucky prompt "Find all TODO comments and create GitHub issues" \
    --allow-possession

# Claude reasons in the cloud...
# Then executes on your machine:
→ HostGlob searching for TODO comments...
→ HostGrep found 12 TODOs across 8 files
→ HostBash gh issue create --title "Implement caching"...
→ HostBash gh issue create --title "Add error handling"...
...
✓ Created 12 GitHub issues from TODO comments
```
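Conceptually, the local side of possession mode is a dispatcher: the cloud brain emits a tool name plus arguments, and the local runtime looks up the matching `execute` handler and returns the result. A sketch under assumed message shapes (this is not Chucky's wire protocol):

```typescript
// Hypothetical tool-call message from the cloud brain.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

type Tool = { name: string; execute: (args: any) => Promise<unknown> };

// Local registry of tools; execution never leaves this process.
function makeDispatcher(tools: Tool[]) {
  const registry = new Map<string, Tool>(tools.map((t) => [t.name, t]));
  return async (call: ToolCall) => {
    const tool = registry.get(call.name);
    if (!tool) throw new Error(`Unknown tool: ${call.name}`);
    return tool.execute(call.args);
  };
}

const dispatch = makeDispatcher([
  { name: "echo", execute: async ({ text }) => text },
]);
dispatch({ name: "echo", args: { text: "hi" } }).then(console.log); // "hi"
```

Results flow back to the cloud, tokens stay in the cloud, and your files never do.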
Six host tools give Claude controlled access to your local environment:
| Tool | What it does |
|---|---|
| HostBash | Run shell commands |
| HostRead | Read local files |
| HostWrite | Write local files |
| HostEdit | Edit files in place |
| HostGlob | Find files by pattern |
| HostGrep | Search file contents |
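If handing HostBash to a cloud model makes you nervous, you can gate commands client-side before they run. The allowlist guard below is a hypothetical sketch, not a documented Chucky feature:

```typescript
// Hypothetical client-side guard for HostBash-style commands:
// only commands whose binary is allowlisted get executed.
const ALLOWED_BINARIES = new Set(["gh", "git", "ls", "grep"]);

function isCommandAllowed(command: string): boolean {
  const binary = command.trim().split(/\s+/)[0];
  return ALLOWED_BINARIES.has(binary);
}

console.log(isCommandAllowed('gh issue create --title "Implement caching"')); // true
console.log(isCommandAllowed("rm -rf /")); // false
```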
More examples:
```bash
# Fix TypeScript errors across your codebase
chucky prompt "Fix all TypeScript errors" --allow-possession

# Add to your CI pipeline
chucky prompt "Review changes in this PR"
```
We haven't found another platform that lets a cloud-hosted model drive tools on your local machine this way. We think this pattern of cloud reasoning with local execution is where AI-powered development is headed.
## Why This, Why Now
Claude holds 42% of the code generation market—more than double OpenAI. It's the first model that actually works well enough for autonomous coding tasks.
But there's no managed way to offer Claude Code capabilities to your users. You either:
- Build everything yourself (auth, billing, sandboxing, infra)
- Use a framework like LangChain and still build the infra
- Don't ship
We talked to agencies spending weeks on "Stripe hacks" for AI billing. Indie devs who built cool AI features but couldn't monetize them. Startups where 30% of engineering time went to "undifferentiated heavy lifting"—metering, auth, infra.
The window for this is ~12-18 months before Anthropic or Vercel potentially enters. We're moving fast.
## Who This Is For
**Agencies:** Deploy AI agents for clients, set per-client budgets, get paid automatically. Transform project fees into recurring revenue.

**Indie developers:** Ship an AI product in an afternoon. The free tier doesn't require a credit card. If it works, you can monetize immediately.

**Startup teams:** Add Claude-powered features without hiring ML engineers or learning Kubernetes. The SDK works in browser, Node, and Python.
## The Honest Assessment
What we're not:
- Multi-model. We're Claude-only for now. If you need GPT-4 or Gemini, look elsewhere.
- Battle-tested at massive scale. We're new. There will be bugs.
- A framework. We're opinionated and zero-config. If you want flexibility, use LangChain.
What we are:
- The fastest path from "Claude Code works on my machine" to "Claude Code works for my users"
- The only platform with built-in per-user billing (seriously, we checked everyone)
- Obsessed with developer experience in the Vercel/Stripe tradition
## Pricing
| Tier | Price | Compute |
|---|---|---|
| Hobby | $5/mo | ~139 hours |
| Starter | $29/mo | ~972 hours |
| Pro | $99/mo | ~4,167 hours |
AI costs (tokens to Anthropic) pass through at cost. No markup. Overage is $0.04/hour.
The key feature: hard caps. Set a monthly maximum. When you hit it, we stop charging. No "Replit billing shock" stories.
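The hard-cap behavior can be modeled as a simple meter that refuses work once the cap would be exceeded. This is a conceptual sketch, not Chucky's billing code:

```typescript
// Hypothetical hard-cap meter: record spend, refuse any charge
// that would push total spend past the cap.
class SpendCap {
  private spent = 0;
  constructor(private readonly capDollars: number) {}

  // Returns false (and records nothing) if the charge would exceed the cap.
  tryCharge(dollars: number): boolean {
    if (this.spent + dollars > this.capDollars) return false;
    this.spent += dollars;
    return true;
  }

  remaining(): number {
    return this.capDollars - this.spent;
  }
}

const cap = new SpendCap(5);
console.log(cap.tryCharge(4.5)); // true
console.log(cap.tryCharge(1.0)); // false: would exceed the $5 cap
console.log(cap.remaining());    // 0.5
```

The important design choice is checking *before* spending: a cap that only alerts after the fact is exactly how billing-shock stories happen.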
## Try It
```bash
npm install @chucky.cloud/sdk
npx @chucky.cloud/cli deploy
```
60 seconds to your first deployment. The free tier is BYOK (bring your own Anthropic key).
- Docs: docs.chucky.cloud
- GitHub: github.com/chucky-cloud
- Discord: discord.gg/chucky
We've been building this for 6 months. The billing problem is real—we've felt it ourselves on every AI project we've shipped. Happy to answer questions about architecture, pricing, or why we think the "Runtime Gap" is the next big infrastructure opportunity.