— For engineering

Prompts deserve a deploy pipeline.

Treat prompts like the code they really are: version control, evals on every change, staged rollouts, one-click rollback, and per-prompt observability. Call any prompt by stable ID — swap models without redeploying.

  • <200ms p95 fetch latency. Globally cached prompts.
  • 0 redeploys to swap models. Change the model, not your code.
  • 100% of changes evaluated. Regression gates block bad ships.
— What you get

Built for the way your team actually ships.

Versioning
Git-style branching for prompts.

Every edit is an immutable version. Diff, branch, merge, tag. Pin a version to production while you iterate on staging.

  • Immutable version history
  • Branches + tags + diffs
  • Per-environment pins
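
For illustration, a sketch of what tagging and per-environment pinning could look like in code; the package name and method shapes are assumptions, not a documented API:

  // Hypothetical client and method names: assumptions throughout.
  import { prompsy } from "@prompsy/sdk";

  // Tag version 14, then pin it to production while staging tracks the branch tip.
  await prompsy.prompts.tag("my-prompt", { version: 14, tag: "v14-stable" });
  await prompsy.prompts.pin("my-prompt", { environment: "production", version: 14 });
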
Evals
CI for prompts.

Define test cases once. Every change runs the suite. Regressions block the merge — same workflow as your code repo.

  • Pass/fail + scored evals
  • LLM-judge or programmatic
  • Block-on-regression gates
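
As a sketch, one way a suite with a block-on-regression gate might be defined; the defineEvalSuite helper and its options are assumptions:

  // Hypothetical eval-suite shape: helper and option names are assumptions.
  import { defineEvalSuite } from "@prompsy/evals";

  export default defineEvalSuite("my-prompt", {
    cases: [
      { vars: { ticket: "Refund not received" }, expect: { category: "billing" } },
    ],
    judge: "exact-match",         // programmatic; an LLM judge is the other option
    gate: { minPassRate: 0.95 },  // changes below this pass rate block the merge
  });
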
API + SDKs
Stable IDs. Typed clients.

Call any prompt with prompsy.run('my-prompt'). TypeScript, Python, and Go SDKs with full type safety on variables.

  • TS, Python, Go SDKs
  • Streaming + structured output
  • Generated types per prompt
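
A minimal example of that call; beyond the prompsy.run('my-prompt') shape shown above, the variables object and result fields are assumptions:

  // The run() signature follows the copy above; everything else is assumed.
  import { prompsy } from "@prompsy/sdk";

  const result = await prompsy.run("my-prompt", {
    variables: { customerName: "Ada", ticketBody: "Order never arrived" }, // type-checked per prompt
  });
  console.log(result.output);
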
Flows
Multi-step chains, in production.

Conditional branches, retries with backoff, human approval gates, webhook fan-out. Versioned and observable end-to-end.
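
For illustration, one way such a flow might be declared; defineFlow and the step and approval helpers are assumptions:

  // Hypothetical Flows API: helper names and options are assumptions.
  import { defineFlow, step, approval } from "@prompsy/flows";

  export default defineFlow("triage-and-reply", [
    step("classify", { prompt: "classify-ticket", retries: { max: 3, backoff: "exponential" } }),
    approval("review-reply", { notify: "support-leads" }),          // human approval gate
    step("notify", { webhook: "https://example.com/hooks/reply" }), // webhook fan-out
  ]);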

Observability
Per-prompt traces, costs, and errors.

Every run logged with model, latency, tokens, and output. Tail in real time, query historical runs, or ship to Datadog or OTel.
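
As a sketch, a hypothetical query over those per-run fields:

  // Hypothetical traces endpoint: query fields mirror the ones named above.
  import { prompsy } from "@prompsy/sdk";

  const runs = await prompsy.traces.query({
    prompt: "my-prompt",
    since: "24h",
    where: { status: "error" },
  });
  for (const run of runs) {
    console.log(run.model, run.latencyMs, run.tokens, run.costUsd);
  }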

Governance
BYOK, retention, and audit.

Bring your own keys. Per-prompt PII redaction. Configurable retention. Audit log of every read, write, and run.
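
For illustration, a hypothetical per-prompt governance config; every field name here is an assumption:

  // Hypothetical config API: BYOK reference, redaction, and retention.
  import { prompsy } from "@prompsy/sdk";

  await prompsy.prompts.configure("my-prompt", {
    provider: { keyRef: "vault://openai-prod" }, // your key, referenced rather than stored
    redaction: { pii: ["email", "phone"] },
    retention: { logs: "30d" },
  });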

How shipping works

From draft prompt to production endpoint — the same week.

01
Draft

Write the prompt in Prompsy. Define typed variables and structured output schema.
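
As a sketch, a draft with typed variables and a structured output schema; the definePrompt helper is an assumption, with zod as one plausible way to express the types:

  // Hypothetical definePrompt helper; zod schemas stand in for the typed
  // variables and output schema described above.
  import { definePrompt } from "@prompsy/sdk";
  import { z } from "zod";

  export default definePrompt("summarize-ticket", {
    variables: z.object({ ticketBody: z.string() }),
    output: z.object({ summary: z.string(), priority: z.enum(["low", "high"]) }),
    template: "Summarize this support ticket: {{ticketBody}}",
  });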

02
Eval

Add test cases. Run on every change. Set regression thresholds for ship gates.

03
Stage

Promote to staging via tag. Run a model bakeoff and dial in latency vs. cost vs. quality.
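
For illustration, a hypothetical weighted-variants config for a staging bakeoff; the field names and model IDs are illustrative assumptions:

  // Hypothetical variants API: split staging traffic across candidate models.
  import { prompsy } from "@prompsy/sdk";

  await prompsy.prompts.configure("my-prompt", {
    environment: "staging",
    variants: [
      { model: "gpt-4o-mini", weight: 0.5 },
      { model: "claude-sonnet-4-5", weight: 0.5 },
    ],
  });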

04
Ship + watch

Pin the version to prod. Stream traces. Roll back in one click if anything dips.

"We were managing prompts in YAML files in a monorepo. Every model change was a deploy. With Prompsy we ship a prompt update without touching our codebase — and we can prove it was tested."
Devon Ortiz
Staff Engineer, Linear
— Questions

The things people ask.

Can we self-host?

Enterprise plans support BYOC (bring your own cloud) on AWS, GCP, or Azure. The runtime, eval workers, and database all run in your VPC.

How does rollback actually work?

Every promoted version stays immutable. Rollback is a single API call (or a click) that swaps the prod pin to the previous version. New requests pick it up within seconds across our edge cache.
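
As a sketch, what that single call might look like; the rollback method name is an assumption:

  // Hypothetical rollback endpoint: swaps the prod pin to the previous version.
  import { prompsy } from "@prompsy/sdk";

  await prompsy.prompts.rollback("my-prompt", { environment: "production" });
  // New requests pick up the previous pin as the edge cache refreshes.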

What about caching and rate limits?

Built-in semantic + exact-match caching with TTL controls. Per-prompt and per-org rate limits. Costs are tracked per call and rolled up by team and prompt.
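
For illustration, hypothetical cache and rate-limit settings; the field names are assumptions:

  // Hypothetical config API: caching mode, TTL, and per-prompt rate limit.
  import { prompsy } from "@prompsy/sdk";

  await prompsy.prompts.configure("my-prompt", {
    cache: { mode: "semantic", ttlSeconds: 300 }, // or "exact"
    rateLimit: { perMinute: 600 },
  });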

Do evals run on every PR?

Yes. The Prompsy GitHub Action runs your eval suite on every PR that touches a prompt and posts results inline. Configure ship gates for required pass rates.

Which models are supported?

OpenAI, Anthropic, Google, Mistral, Together, Replicate, plus any OpenAI-compatible endpoint. Add your private model with a base URL and an API key.
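
As a sketch, registering a private OpenAI-compatible endpoint; the models.add method is an assumption:

  // Hypothetical models endpoint: base URL plus API key, as described above.
  import { prompsy } from "@prompsy/sdk";

  await prompsy.models.add({
    name: "my-private-llm",
    baseUrl: "https://llm.internal.example.com/v1",
    apiKey: process.env.PRIVATE_LLM_KEY,
  });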

Ready to give your team real prompt leverage?