Industrialising development in the age of agentic AI

The emergence of vibe coding in 2025 has revolutionised software development, cutting the time required to build functional prototypes of applications and services by several orders of magnitude. An interface that previously took experienced developers days or even weeks to build can now be created in a matter of minutes. The demos are impressive, the POCs are multiplying, and the enthusiasm generated by these solutions is spreading across all sectors. Gartner predicts that 40% of enterprise applications will incorporate task-specific AI agents by the end of 2026, up from less than 5% in 2025 [1].

However, as soon as the question of moving to production arises, it becomes difficult to build solutions that are truly sustainable over the long term. While the most technically mature organisations manage to industrialise these tools, the majority see their prototypes stagnate. In the absence of a structured framework, AI creates as many risks as it solves problems: security vulnerabilities, regulatory non-compliance, unmaintainable code, cost overruns. Prototypes accumulate without ever crossing the threshold to production, leaving teams with a multitude of artefacts of no real value.

The emergence of agentic development, characterised by agents capable of autonomously handling long and complex tasks, pushes the boundaries of what can be automated even further. Both the major technology players of the agentic era and scientific research keep extending the limits of the field. As recently as May, the Chinese company Z.ai succeeded in developing an open-source model capable of continuously resolving complex, multi-step development tasks while drastically reducing training costs. Its latest model, GLM-5, relies on a reinforcement learning infrastructure to maintain high performance on long-duration, high-complexity tasks [2].

The technical prowess of the models merely shifts the difficulty. The real question is no longer whether an agent can solve a complex task, but how organisations can benefit from it. Without an industrial framework, this transformation produces more technical debt than real value.

eleven has had the opportunity to support its clients from very early on in integrating solutions such as Cursor or Lovable, and in developing bespoke solutions. This early adoption has allowed us to test their concrete limitations and, above all, to forge methods, safeguards and convictions about what actually works in production conditions. In the space of a few months, the landscape has changed profoundly: in 2025 we moved from exploratory vibe coding, where AI served mainly as an occasional assistant to the developer, to true agentic engineering, in which agents are granted increasing autonomy over entire sections of the production chain. This change in scale imposes a new requirement: what could once remain within a prototyping logic must now be built on rigorous foundations in controlled environments, or technical debt will explode as autonomy increases. Our conviction rests on three pillars: a shared technical foundation to guarantee homogeneity, structured agent pipelines, and a vision covering the full lifecycle to extend AI beyond code generation. The real question is therefore no longer “Can AI accelerate development?”, but rather: “How do we transform this acceleration into a lasting, manageable competitive advantage at scale?”

I. Before the code, the real issues: Costs, Compliance, Sustainability

1. Traceability put to the test by AI Agents

As soon as an AI agent intervenes in the development chain, new questions arise: who is responsible? Is the action traceable? Are the company’s information security policies being respected? Are the decisions auditable?

Without governance defined from the project scoping phase, the organisation exposes itself to both regulatory risks (non-compliance with GDPR or the European AI Act) and reputational risks. Google’s 2025 DORA report finds that AI adoption is correlated with an increase of approximately 10% in code instability, and that 30% of developers report having little or no confidence in AI-generated code [3].

2. Controlling the Costs of AI Agents

Beyond the occasional overruns of a poorly configured agent (infinite loops, excessive token consumption), it is the cost trajectory that is beginning to worry technical leadership, given the pricing of the latest solutions available on the market. Multiplying licences at €100 per month per developer, and stacking subscriptions for Cursor, Copilot, Claude Code and other specialised tools, rapidly leads to significant budgets: for a team of 50 developers, a single €100 licence each already comes to €60,000 a year.

Controlling these costs requires a policy of allocation by role and level of criticality, consumption caps per project, and tracking of the value generated. According to Gartner, more than 40% of agentic AI projects will be abandoned before reaching production by 2027, precisely due to the costs and complexity inherent in large-scale deployment [4].
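As a rough illustration of consumption caps per project and allocation by role, the sketch below tracks agent spend against a monthly cap and refuses any cost that would exceed it. The class name, the project name, the roles and all amounts are hypothetical; real metering depends on the tools and pricing in use.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectBudget:
    """Tracks agent spend for one project against a monthly cap (hypothetical values)."""
    name: str
    monthly_cap_eur: float
    spent_eur: float = 0.0
    # Spend broken down by role, to support allocation by role.
    by_role: dict[str, float] = field(default_factory=dict)

    def can_spend(self, cost_eur: float) -> bool:
        """Return True if the requested cost still fits under the cap."""
        return self.spent_eur + cost_eur <= self.monthly_cap_eur

    def record(self, role: str, cost_eur: float) -> None:
        """Record a cost, or raise if it would exceed the cap."""
        if not self.can_spend(cost_eur):
            raise RuntimeError(f"Budget cap reached for project {self.name!r}")
        self.spent_eur += cost_eur
        self.by_role[role] = self.by_role.get(role, 0.0) + cost_eur


budget = ProjectBudget(name="billing-portal", monthly_cap_eur=500.0)
budget.record("backend-developer", 120.0)
budget.record("qa-engineer", 80.0)
print(budget.spent_eur)          # 200.0
print(budget.can_spend(350.0))   # False: 200 + 350 exceeds the 500 cap
```

Raising an error at the cap is only one possible choice; an organisation might prefer to alert a project lead and let work continue.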

3. The Disposable Code Trap

Without a common framework, each project reinvents its own architecture, conventions and dependencies. The consequences are multiple: a developer moving between projects must familiarise themselves with a new organisation each time, components are rarely reused from one project to another, and every library update becomes time-consuming. Added to this are more complex maintenance, degraded code readability and slower development cycles. The technical debt thus accumulated is estimated to cost the American economy more than $2.41 trillion per year, according to a recent MIT Sloan study [5].

The challenge is therefore to build a coherent application estate, capable of outlasting the teams and tools that shaped it, while remaining sufficiently modular to evolve over time without requiring a complete overhaul.

II. What breaks when moving from POC to production

1. Reducing invisible technical debt

AI-generated code frequently produces functional results quickly, but it is not systematically readable, maintainable, or compliant with internal standards. This technical debt accumulates silently and manifests in three forms:

  • Architectural heterogeneity: without an imposed framework, each agent produces a different structure. Multiplying approaches makes code difficult to understand from one project to another.
  • “Black box” code: the code works, but no one understands why. Each modification becomes costly and risky.
  • Drift from standards: by default, AI does not know the naming conventions, the team’s patterns or the company’s internal libraries. It produces “generic” code that gradually departs from established rules.

2. Defining the right level of agent autonomy

Current AI agents no longer merely suggest code: they can modify files, execute commands, interact with APIs and deploy to production. The risk does not lie in this power itself, but in the fact that the permissions granted by default are intrinsically broad. Without explicit configuration, an agent often has write access across the entire repository, or even to sensitive environments. Precisely defining its rights therefore becomes the first security measure, around three questions:

  • What level of autonomy should be granted? Creating a test file can be automated without risk. Modifying a public API or pushing code to production requires human validation. Each threshold must be defined according to the criticality of the action.
  • What scope of action should be permitted? Which files, services and databases can the agent touch? Here again, the principle of least privilege applies: the agent only accesses what it needs.
  • How to ensure proper separation between agents? Most mature development cycles rely on at least three separate environments: Development, Pre-Production and Production. An agent with overly broad access can, on a bad instruction or by error, cross the boundaries between these environments and, for example, modify production data when it should only have operated in development. The separation applied to agents must be at minimum aligned with what is already in place for environments. Any right granted beyond this scope must be the subject of an explicit decision, traced and time-limited.
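The three questions above can be sketched as a small permission check: a minimal Python example, assuming a hypothetical `AgentPolicy` that lists the environments an agent may act in, the paths it may write to, and the paths that require human validation. The path patterns are illustrative, and time-limited grants are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"                    # the agent may proceed on its own
    NEEDS_APPROVAL = "needs_approval"  # a human must validate first
    DENY = "deny"                      # outside the agent's scope


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy: what one agent may touch, and where."""
    environments: frozenset[str]          # e.g. {"development"}
    writable_paths: tuple[str, ...]       # glob patterns the agent may write to
    approval_paths: tuple[str, ...] = ()  # writable only after human validation

    def check_write(self, environment: str, path: str) -> Decision:
        # Separation between environments comes first.
        if environment not in self.environments:
            return Decision.DENY
        # Critical paths require human validation.
        if any(fnmatchcase(path, p) for p in self.approval_paths):
            return Decision.NEEDS_APPROVAL
        # Least privilege: anything not explicitly listed is denied.
        if any(fnmatchcase(path, p) for p in self.writable_paths):
            return Decision.ALLOW
        return Decision.DENY


policy = AgentPolicy(
    environments=frozenset({"development"}),
    writable_paths=("tests/*", "src/*"),
    approval_paths=("src/api/public/*",),
)

print(policy.check_write("development", "tests/test_invoice.py"))     # Decision.ALLOW
print(policy.check_write("development", "src/api/public/routes.py"))  # Decision.NEEDS_APPROVAL
print(policy.check_write("production", "tests/test_invoice.py"))      # Decision.DENY
```

Denying by default means a forgotten rule blocks the agent rather than silently widening its access.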

III. Industrialisation levers: How to Scale Up?

1. A “Boilerplate” as foundation

The first lever is a shared technical foundation (or “boilerplate”): a preconfigured project template in which certain parts are locked and cannot be modified by the agent. This foundation imposes the architecture, conventions and dependencies from the outset.

The benefits are threefold:

  • Homogeneity: all projects share the same architecture and conventions, regardless of the AI tool used.
  • Efficiency: the agent focuses on business logic instead of reinventing everything for each project. Fewer tokens consumed, fewer errors.
  • Compliance: security standards and best practices are embedded from the start. Critical files are write-protected.

This boilerplate must be adapted to each company: simple and generic if the use cases are varied, more specialised if the business needs are well identified. It can also be progressively enriched, by deriving specialised branches that inherit from the shared foundation while embedding configurations and components specific to each project type.
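One simple way to enforce the locked parts of a boilerplate is to check the files an agent has changed against a list of protected patterns before accepting its work. The sketch below assumes a hypothetical list of locked paths; the file names are illustrative, and each organisation would define its own.

```python
from fnmatch import fnmatchcase

# Hypothetical boilerplate files that the agent must never modify.
LOCKED_PATTERNS = (
    ".github/workflows/*",
    "config/security/*",
    "pyproject.toml",
)


def locked_files_touched(changed_files: list[str]) -> list[str]:
    """Return the changed files that match a locked pattern."""
    return [
        path
        for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in LOCKED_PATTERNS)
    ]


# Example: files changed by an agent, for instance taken from a diff.
changed = ["src/orders/service.py", "config/security/cors.yaml", "pyproject.toml"]
violations = locked_files_touched(changed)
print(violations)  # ['config/security/cors.yaml', 'pyproject.toml']
```

A check like this could run as a pre-merge step, rejecting any change set for which the returned list is not empty.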

2. Agentic pipelines adapted to the entire product lifecycle

Rather than a single chatbot that does everything, the industrial approach relies on pipelines of specialised agents: each agent plays a precise role, and automatic checkpoints (quality gates) are interspersed between steps to validate quality. These pipelines are deployed across each target environment (development, staging, production), ensuring consistent end-to-end coverage. Agents can now be accessed directly within a team: on Cursor, for example, they can be used in a shared workspace, enabling multi-agent collaboration between several members.
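The idea of specialised agents separated by quality gates can be sketched in a few lines. In this minimal example the agents are stubbed as plain functions and the artefact passed between steps is a dictionary; the step names, the gate rules and the ticket identifier are assumptions for illustration, not a description of any particular tool.

```python
from dataclasses import dataclass
from typing import Callable

# In this sketch, an artefact is a plain dict passed from step to step.
Artefact = dict


@dataclass
class Step:
    name: str
    run: Callable[[Artefact], Artefact]  # the specialised agent (stubbed here)
    gate: Callable[[Artefact], bool]     # quality gate checked after the step


def run_pipeline(steps: list[Step], artefact: Artefact) -> Artefact:
    """Run each step in order and stop at the first failed quality gate."""
    for step in steps:
        artefact = step.run(artefact)
        if not step.gate(artefact):
            raise RuntimeError(f"Quality gate failed after step {step.name!r}")
    return artefact


# Stub agents: real ones would call a model or a tool.
def write_code(a: Artefact) -> Artefact:
    return {**a, "code": "def add(x, y): return x + y"}


def write_tests(a: Artefact) -> Artefact:
    return {**a, "tests_passed": 12, "tests_failed": 0}


steps = [
    Step("code", write_code, gate=lambda a: bool(a.get("code"))),
    Step("test", write_tests, gate=lambda a: a["tests_failed"] == 0),
]

result = run_pipeline(steps, {"ticket": "DEMO-1"})
print(sorted(result))  # ['code', 'tests_failed', 'tests_passed', 'ticket']
```

Because a failed gate stops the pipeline, a defect introduced by one agent is caught before the next agent builds on it.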

3. What cannot be measured cannot be industrialised

To maintain quality over time, a unified monitoring system is crucial. It can cover the following dimensions:

  • Time-to-deploy: measure the time elapsed between an idea and its deployment to production, and identify bottlenecks.
  • Resource consumption: track cost per project and compare the efficiency of the different agents used.
  • Continuous security: audit continuously, detect vulnerabilities and measure remediation time.
  • Deliverable quality: user satisfaction, bug rate, test coverage, compliance with standards.
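To make the first of these dimensions concrete, the sketch below computes a median time-to-deploy from a handful of records. The record structure, the project name and the timestamps are invented for illustration; in practice the data would come from ticketing and deployment tools.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: when a change was requested and when it reached production.
deployments = [
    {"project": "portal", "requested": "2025-03-03T09:00", "deployed": "2025-03-04T09:00"},
    {"project": "portal", "requested": "2025-03-05T09:00", "deployed": "2025-03-08T09:00"},
    {"project": "portal", "requested": "2025-03-10T09:00", "deployed": "2025-03-12T09:00"},
]


def time_to_deploy_hours(record: dict) -> float:
    """Hours elapsed between the request and the production deployment."""
    requested = datetime.fromisoformat(record["requested"])
    deployed = datetime.fromisoformat(record["deployed"])
    return (deployed - requested).total_seconds() / 3600


durations = [time_to_deploy_hours(r) for r in deployments]
print(median(durations))  # 48.0
```

The median is used here rather than the mean so that one unusually slow deployment does not dominate the figure; the same pattern would apply to remediation time for vulnerabilities.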

Without an industrial framework, AI creates more debt than value

Moving from vibe coding to industrialised agentic engineering is the prerequisite for AI to become a genuine performance driver, rather than a new source of complexity. There is no magic recipe: the architecture and infrastructure must be designed according to the resources, ambitions and context specific to each company. The organisations that will benefit most from this revolution will invest in three complementary pillars: a shared technical foundation guaranteeing consistency of practices and component reusability, structured agent pipelines enabling end-to-end orchestration of their interventions, and a vision covering the entire software lifecycle so as to extend the value of AI well beyond the mere generation of code.

Beyond the tools, the true differentiator does not reside in the choice of one platform or another, but in the quality of the overall architecture and the rigour applied to the data that feeds the agents. Without this requirement, even the most powerful models will produce unusable results.

To remain competitive, organisations must now prepare and invest: developing internal competencies or seeking external support, adopting new capabilities and evolving their architecture.

Sources

[1] Gartner, Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026, Up from Less Than 5% in 2025, press release, 26 August 2025

[2] Z.ai, GLM-5: From Vibe Coding to Agentic Engineering, arXiv preprint, 2026

[3] Google Cloud, 2025 DORA Report: State of AI-assisted Software Development, September 2025

[4] Gartner, Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027, press release, 25 June 2025

[5] K. Schelfaut and P. P. Shukla, How to Manage Tech Debt in the AI Era, MIT Sloan Management Review, 2025
