
Reliable agents are mostly code

Agents that work in production are 80% deterministic code. The LLM is a small, scoped component — not the whole workflow.

3 min read

The failure mode of most agent projects is handing too much of the workflow to the model. 'Let the agent figure it out' sounds good in a demo and falls apart in production, where the same edge cases recur and reliability matters more than cleverness.

Decompose the task as a graph first. Most nodes are deterministic code — API calls, validators, state updates. A few nodes are genuinely LLM-shaped: classification, routing, summarization, authoring. Keep the LLM scope narrow, and give each LLM node a typed output so downstream code can trust it.
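A minimal sketch of a typed LLM node output. The task (billing/support triage), the `Route` and `Classification` types, and `parse_classification` are all illustrative assumptions, not a real library; the model call itself is stubbed out, because only the output contract matters here.

```python
from dataclasses import dataclass
from enum import Enum
import json

class Route(Enum):
    BILLING = "billing"
    SUPPORT = "support"
    OTHER = "other"

@dataclass(frozen=True)
class Classification:
    route: Route
    confidence: float

def parse_classification(raw: str) -> Classification:
    """Validate the model's raw JSON into a typed value.

    Raises on anything malformed, so downstream code never
    sees an unchecked string.
    """
    data = json.loads(raw)
    route = Route(data["route"])            # ValueError on unknown labels
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Classification(route=route, confidence=confidence)

# In production this string would come from the LLM call.
result = parse_classification('{"route": "billing", "confidence": 0.92}')
```

The point of the boundary: the rest of the graph consumes `Classification`, never the raw model text, so a bad completion fails loudly at the node instead of silently corrupting state downstream.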

Add guardrails where risk is high. Dry-run modes for destructive actions, human-in-the-loop approval gates on writes, scoped credentials so a bug can't exceed the user's actual permissions. Idempotent tool calls so retries are safe.

Trace every run end-to-end. Input, steps, tool calls, retries, final state, cost. When an agent misbehaves, you need to replay and diff against working runs — not guess from logs. Every failure becomes a test case that can't regress.
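A minimal shape for such a trace, assuming a flat list of timestamped steps plus accumulated cost; `RunTrace` and its fields are illustrative, and production systems would typically emit this to a tracing backend rather than build JSON by hand.

```python
import json
import time
from dataclasses import asdict, dataclass, field

@dataclass
class RunTrace:
    run_id: str
    input: str
    steps: list = field(default_factory=list)
    cost_usd: float = 0.0

    def record(self, step: str, detail: str, cost_usd: float = 0.0) -> None:
        """Append one timestamped step and accumulate its cost."""
        self.steps.append({"ts": time.time(), "step": step, "detail": detail})
        self.cost_usd += cost_usd

    def to_json(self) -> str:
        """Serialize the whole run for storage, replay, and diffing."""
        return json.dumps(asdict(self))

trace = RunTrace(run_id="run_001", input="cancel my subscription")
trace.record("classify", "route=billing", cost_usd=0.0004)
trace.record("tool_call", "billing.cancel(dry_run=True)")
serialized = trace.to_json()
```

Because each run serializes to one record, a misbehaving run can be diffed step-by-step against a known-good one, and its input replayed as a regression test.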
