
Which working habits will your teams keep, and which will they need to unlearn as language models get more capable? Early LLMs helped developers with trivial autocomplete. Then models began to write function bodies from docstrings. Now, with agentic systems and multi-agent workflows, we are being invited to change the very structure of work: from single requests and replies to continuous, delegated programmes that can run for hours or days.
From tab-completion to agent teams: a short, useful history
It helps to put the change into three simple phases. First came the assistive era: models as sophisticated autocompletion. Developers accepted suggestions and retained full control. Next, models began to take larger slices of work — generating entire function bodies, tests, or documentation from prompts. Pair-programming with tools like GitHub Copilot became a normal part of the flow.
Now we’re entering an agentic era where multiple agents can coordinate to execute a project. Anthropic’s engineering write-up Building a C compiler with Claude is a useful demonstration: models are not just answering single questions — they’re running persistent, structured processes. Open frameworks such as AutoGen are lowering the barrier for teams to experiment with multi-agent orchestration. The practical implication is clear: products no longer need to assume a single-turn interaction model.
What this means for product and engineering leaders
This evolution changes the constraints that shape product roadmaps and engineering practices. Consider three consequences:
- New mental models: Teams must move from thinking in discrete API calls to thinking in processes and stateful workflows, what I call programme-aware design. Agent workflows have state, retries, and long-running side effects; your UX, logging and error models must reflect that (a minimal sketch of such a run record follows this list).
- Different success metrics: Traditional latency and precision remain important, but we also need to measure process-level metrics: orchestration success rate, state convergence, cost per run, and the frequency of human overrides.
- Safety and governance: Delegating work to agents multiplies risk surfaces. You must design safe fallbacks, escalation paths and clear boundaries where human authority takes over.
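Here is that minimal sketch of a per-run record, in Python. The field names, statuses and thresholds are illustrative assumptions, not the API of any particular framework; the point is that a programme-aware design keeps this state somewhere your logging, UX and error handling can see it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class RunStatus(Enum):
    RUNNING = "running"
    WAITING_FOR_HUMAN = "waiting_for_human"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class AgentRun:
    """Illustrative per-run record for a long-running agent workflow."""
    run_id: str
    status: RunStatus = RunStatus.RUNNING
    retries: int = 0                    # how many times steps have been retried
    cost_usd: float = 0.0               # accumulated model and tool spend
    human_overrides: int = 0            # times a person corrected the agent
    pending_side_effects: list[str] = field(default_factory=list)  # e.g. "email queued"
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def needs_escalation(self, max_retries: int = 3) -> bool:
        # Hand control back to a human once the agent stops converging.
        return self.retries >= max_retries or self.status is RunStatus.WAITING_FOR_HUMAN
```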
Three practical changes to your ways of working
For leaders who want to convert the potential of agentic systems into reliable product outcomes, here are three concrete shifts to implement now.
1. Design for processes, not just prompts
Encourage product teams to map workflows as state machines. Break down tasks into checkpoints where humans must review, where the agent can retry, and where external systems are updated. This is not academic — it determines whether an agent can complete a purchase flow, a code-refactor, or a content-moderation pipeline without creating silent failures.
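One way to do this is to write the workflow down as an explicit transition table before wiring up any agents. The sketch below uses a simplified code-refactor flow; the states, checkpoint types and transitions are assumptions for illustration, not a prescription.

```python
from enum import Enum, auto


class Checkpoint(Enum):
    AGENT_RETRY = auto()      # the agent may retry autonomously
    HUMAN_REVIEW = auto()     # a person must approve before continuing
    EXTERNAL_UPDATE = auto()  # an external system is modified (hard to undo)


# Illustrative refactor workflow expressed as a state machine:
# each state maps to its checkpoint type and the states it may move to.
WORKFLOW = {
    "draft_change": (Checkpoint.AGENT_RETRY,     ["run_tests"]),
    "run_tests":    (Checkpoint.AGENT_RETRY,     ["draft_change", "await_review"]),
    "await_review": (Checkpoint.HUMAN_REVIEW,    ["merge_change", "draft_change"]),
    "merge_change": (Checkpoint.EXTERNAL_UPDATE, ["done"]),
    "done":         (None,                       []),
}


def allowed(current: str, proposed: str) -> bool:
    """Reject transitions the map does not permit, so failures are loud, not silent."""
    _, next_states = WORKFLOW[current]
    return proposed in next_states
```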
2. Create small, controlled pilot sandboxes
Run experiments that have clear exit conditions and cost limits. Use synthetic datasets and instrumentation so you can observe behaviour across many runs. The Anthropic example of building something as complex as a compiler is inspiring, but you should first try pilots that return tangible business value: automated triage, customer follow-up drafts, or repeatable internal ops tasks.
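A pilot harness of this kind can be very small. The sketch below assumes a hypothetical `run_agent_step` callable and purely illustrative budget numbers; what matters is that the exit conditions are explicit and every step is logged.

```python
import time

MAX_COST_USD = 25.0      # hard budget for the whole pilot (illustrative)
MAX_WALL_CLOCK_S = 3600  # stop after an hour regardless of progress
MAX_STEPS = 200          # cap on agent iterations


def run_pilot(run_agent_step, task):
    """Run one sandboxed pilot with explicit exit conditions.

    `run_agent_step` is a hypothetical callable that advances the agent one
    step and returns (done, step_cost_usd, observation); swap in your own.
    """
    spent, start, log = 0.0, time.monotonic(), []
    for step in range(MAX_STEPS):
        done, cost, observation = run_agent_step(task)
        spent += cost
        log.append({"step": step, "cost": cost, "observation": observation})
        if done:
            return {"outcome": "completed", "cost": spent, "log": log}
        if spent > MAX_COST_USD or time.monotonic() - start > MAX_WALL_CLOCK_S:
            return {"outcome": "budget_exceeded", "cost": spent, "log": log}
    return {"outcome": "step_limit_reached", "cost": spent, "log": log}
```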
3. Measure and govern at the programme level
Introduce metrics such as time-to-convergence, frequency of human intervention, and unintended side effects per run. Governance is not only about guardrails; it’s about observability. If you cannot explain what an agentic workflow did last night, you cannot trust it with customer-facing responsibilities.
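These programme-level numbers can be computed from ordinary run logs. The record shape below is an assumption for illustration; adapt the field names to whatever your own instrumentation emits.

```python
from statistics import mean


def programme_metrics(runs):
    """Aggregate programme-level metrics from per-run records.

    Each run is assumed to be a dict like:
    {"converged": True, "duration_s": 412.0, "human_interventions": 1, "side_effects": 0}
    """
    converged = [r for r in runs if r["converged"]]
    return {
        "runs": len(runs),
        "convergence_rate": len(converged) / len(runs) if runs else 0.0,
        "mean_time_to_convergence_s": mean(r["duration_s"] for r in converged) if converged else None,
        "human_intervention_rate": mean(r["human_interventions"] for r in runs) if runs else 0.0,
        "unintended_side_effects_per_run": mean(r["side_effects"] for r in runs) if runs else 0.0,
    }
```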
Organisational changes that matter
Adopting agentic ways of working is as much about organisation as technology. Consider three organisational moves that accelerate adoption responsibly:
- Create cross-functional squads with orchestration expertise (product, engineering, SRE, safety) who can treat agents like a critical infra component.
- Invest in shared libraries and patterns for prompts, state management, and rollback procedures to avoid costly one-off implementations (a rollback pattern is sketched after this list).
- Train people to monitor and interpret agent reasoning traces; this skill will be as important as reading CPU profiles.
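As one example of such a shared pattern, here is a minimal, illustrative rollback ledger: every side effect an agent performs registers a compensating action, and a failed run can be unwound in reverse order. The class and method names are assumptions, not an existing library.

```python
class RollbackLedger:
    """Illustrative shared pattern: register an undo action for every side effect."""

    def __init__(self):
        self._undo_stack = []

    def record(self, description, undo_fn):
        # Call this immediately after the agent performs a side effect.
        self._undo_stack.append((description, undo_fn))

    def unwind(self):
        # Roll back in reverse order; collect failures instead of stopping.
        failures = []
        while self._undo_stack:
            description, undo_fn = self._undo_stack.pop()
            try:
                undo_fn()
            except Exception as exc:  # report rollback failures rather than masking them
                failures.append((description, exc))
        return failures
```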
Example: what real teams are doing
Large labs and startups are already building practical agent-driven systems. Anthropic’s engineering piece demonstrates scale and technique, while open frameworks like AutoGen show how to coordinate agents programmatically. Meanwhile, teams using tools such as GitHub Copilot are layering agentic orchestration on top of developer workflows to automate testing, refactoring and release-note generation. These are not futuristic experiments; they are templates for delivering measurable product outcomes.
Where leaders should put their energy
Focus on three priorities: build observability into agent workflows, create safe human-in-the-loop patterns, and fund repeatable pilots that demonstrate business value. Resist the temptation to hand over whole product areas to opaque automation before you can answer basic questions: what did the agent do, why did it decide that, and how do we recover?
We’ve seen similar inflection points in previous waves of technology: mobile, cloud, and e‑commerce each demanded new organisational practices. The arrival of agentic LLMs is another such inflection. The upside is real: greater leverage, speed and scale. The risk is also real: brittle automations, unclear accountability and poorly governed behaviours.
If you are a product or engineering leader, treat this as a programme of change, not a feature to bolt on. Start with small, instrumented pilots; codify safe patterns; and reorganise around process-aware delivery. Do that, and you’ll move from being constrained by tooling to being empowered by it. That is how bold ambitions become deliverable outcomes.