From Autocomplete to Autonomy: The Evolution of AI Coding

Story by Kahini Shah
09/16/2025

Of all the early applications of enterprise AI, code generation has been the fastest to scale. At Google, over a quarter of new code now comes from AI. Tools like Lovable and Cursor, the first and second fastest-growing SaaS applications in history, have already changed the rhythm of engineering work, allowing developers to spend less time on tasks like writing boilerplate code.

Coding tools are evolving from co-pilots that work synchronously alongside developers to autonomous agents that complete tasks asynchronously on their own. The ambitious goal is to give them a high-level problem, like migrating a legacy codebase or building a new feature from scratch, that they can complete from start to finish. Agents will break down the task into a plan, write the code, execute it, test and debug, and refine based on the results.
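To make that loop concrete, here is a minimal sketch of the write, execute, test, and refine cycle. It is an illustration under stated assumptions, not how any particular product works: `stub_model`, `run_tests`, and `agent_loop` are hypothetical names, the "model" is a hard-coded stand-in that returns a buggy draft first and a fixed one after seeing feedback, and the planning step is left out for brevity.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_tests(source: str) -> Optional[str]:
    """Execute candidate source and check it.

    Returns None when the check passes, otherwise an error message
    that can be fed back to the model as debugging context.
    """
    namespace: Dict[str, object] = {}
    try:
        exec(source, namespace)  # execute the generated code
        add = namespace["add"]
        assert add(2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def stub_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model call (an assumption, not a real API).

    The first draft contains a bug; once feedback is supplied,
    the revision is correct.
    """
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def agent_loop(
    task: str,
    model: Callable[[str, Optional[str]], str],
    max_steps: int = 3,
) -> Tuple[Optional[str], List[Optional[str]]]:
    """Write code, run it, and refine on failure, up to max_steps attempts."""
    feedback: Optional[str] = None
    history: List[Optional[str]] = []
    for _ in range(max_steps):
        source = model(task, feedback)  # write (or rewrite) the code
        feedback = run_tests(source)    # execute and test it
        history.append(feedback)
        if feedback is None:            # tests pass: the task is done
            return source, history
    return None, history                # give up after max_steps attempts


if __name__ == "__main__":
    final_source, attempts = agent_loop("implement add(a, b)", stub_model)
    print(attempts)
    print(final_source)
```

The cap on attempts is a deliberate design choice in this sketch: an agent working asynchronously needs a point at which it stops retrying and hands the failure back to a human.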

The scale of the shift

The estimated total market size for software developers in the U.S. is between $300 and $400 billion. Goldman Sachs estimates that more than 60 percent of software profit pools could shift toward agent-driven development by the end of the decade.

Going agent-first requires developers to change their habits and place more trust in the system. Agentic solutions like OpenHands and Claude Code shift the work out of the IDE or chat window into the terminal, Linear, and GitHub. Instead of watching code update line by line, developers interact asynchronously while the agent makes edits directly. It’s a fundamental change in the developer workflow. As a result, we need new human-agent platforms to help developers manage, understand, and correct code written by software agents.

This shift also creates new bottlenecks. As AI generates more code, the burden shifts to review. Code review is no longer just a formality; it becomes the trust gateway. Developers will need tools that can assess security, maintainability, and compliance with the same rigor that goes into producing code.

Applying agents to real work

Lovable is an early proof point. It reached $100 million in annual recurring revenue (ARR) in just eight months, surpassing Cursor, the previous record holder. Lovable allows users with varying skill sets to create websites. It shows the value of building practical, user-friendly applications on top of foundation models.

We think Lovable is just the start. We can imagine enterprise apps that let product managers prototype software directly in their company’s design systems and coding standards. For small and midsize businesses, agents could power all-in-one platforms to launch websites, run marketing campaigns, and manage operations. We can also imagine agentic workflow builders. Since any digital workflow can be represented as code, agents could act as the backend to make processes more resilient and fault-tolerant.

Code modernization and maintenance are pain points agents could help address. Decades-old systems written in languages like COBOL still run mission-critical functions in industries like banking and government, but few engineers want or even know how to maintain them. Agentic systems could help modernize these codebases. Just as LLMs can translate between human languages, they could translate between coding languages with the right training data. Similar challenges exist in proprietary languages. Large enterprise applications run on their own languages. Salesforce uses Apex. SAP uses ABAP. These platforms serve millions of users, but they require niche skills with limited developer interest. AI could let developers write in mainstream languages and automatically translate the code into proprietary ones.

Agents continue to gain capabilities

The pace of progress is dramatic. Improvements are captured by benchmarks like SWE-Bench, which tests coding agents’ ability to generate code to resolve GitHub issues, and LiveCodeBench, which expands beyond code generation to test capabilities like self-repair while ensuring test cases are unseen by the model during training.

Models continue to gain agentic capabilities. Increased pretraining leads to superior generalization across coding tasks. Reinforcement learning with verifiable rewards allows models to learn by checking their own outputs against objective (or verifiable) evaluations like unit tests or execution traces. With test-time compute scaling, a model can explore several solutions, compare the results, and refine its answer during inference.
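A toy sketch can show how two of these ideas fit together: a verifiable reward computed from unit tests, and a simple form of test-time scaling that samples several candidates and keeps the best one. Everything here is an illustrative assumption rather than any lab's actual method. The task (absolute value), the test cases, and the names `verifiable_reward` and `best_of_n` are hypothetical, and the candidate functions are hard-coded in place of model samples.

```python
from typing import Callable, List, Tuple

# Hypothetical unit tests for an "absolute value" task: (input, expected output).
TEST_CASES: List[Tuple[int, int]] = [(-3, 3), (0, 0), (5, 5)]


def verifiable_reward(candidate: Callable[[int], int]) -> float:
    """Fraction of test cases passed; a raised exception counts as a failure."""
    passed = 0
    for arg, expected in TEST_CASES:
        try:
            if candidate(arg) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(TEST_CASES)


def best_of_n(candidates: List[Callable[[int], int]]) -> Tuple[int, float]:
    """Score every candidate and return (index, reward) of the best one.

    Ties are broken in favor of the earliest candidate.
    """
    scored = [(verifiable_reward(c), i) for i, c in enumerate(candidates)]
    best_score, best_index = max(scored, key=lambda pair: (pair[0], -pair[1]))
    return best_index, best_score


# Stand-ins for several sampled solutions (assumed, not model output).
CANDIDATES: List[Callable[[int], int]] = [
    lambda x: x,                    # wrong for negative inputs
    lambda x: -x,                   # wrong for positive inputs
    lambda x: x if x >= 0 else -x,  # correct
]

if __name__ == "__main__":
    print(best_of_n(CANDIDATES))
```

In reinforcement learning the same reward would serve as a training signal rather than a selection rule. This sketch covers only the inference-time selection step.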

Despite this progress, real limits remain. Reliability is uneven. Agents struggle when faced with out-of-distribution inputs. Long-horizon planning and task decomposition remain significant challenges, and agents often falter on multi-step workflows. And reasoning across large repositories remains a weakness.

What’s more, acquiring more data remains one of the biggest needs. Interaction and trajectory data—things like user plans, edits to generated code, and the verified code that ultimately gets committed—are especially valuable. This data can be fed back into models through fine-tuning or reinforcement learning to steadily improve performance. Reinforcement learning environments add another layer by simulating diverse, real-world development challenges, helping agents become more robust. And while it remains difficult to scale, demand continues to grow for expert-labeled and human-generated datasets.

A different kind of leap

Decades ago, Jürgen Schmidhuber proposed the Gödel Machine, a hypothetical program that could rewrite its own code once it proves that the modification would improve performance. Projects like Sakana’s Darwin Gödel Machine and DeepMind’s AlphaEvolve are beginning to explore this idea, with early results showing that self-improvement can enhance agents’ capabilities and surface more efficient algorithms.

The spectrum of AI for coding coupled with human ingenuity is vast. On one end there is search, autocomplete, and speeding up simple routine work. At the far end lies the dream of AI software engineers and ML researchers capable of writing complex code and discovering new model architectures fully autonomously.

At Obvious, we’re betting on the founders bold enough to build in this continuously evolving landscape.

If you’re building in this space, get in touch. We’d love to speak with you.

Author

Kahini Shah

Kahini brings to Obvious product and software experience, and a passion for building data and AI-enabled solutions in healthcare, fintech, and enterprise.