Speed Has Already Been Solved — Then Why Are Failures Increasing?

We have long been concerned with the problem of development productivity. The goal across the industry has been to write code faster and deliver more features with fewer people. In recent years, this problem seems almost solved. AI coding tools have dramatically increased the speed of code generation, reducing tasks that once took days to hours—or even minutes. Many organizations welcomed this shift, and individual developer productivity has visibly improved. On the surface, it appears that we have entered an era where we no longer need to worry about “speed.” But at that exact point, something unusual is happening. Speed has increased, yet systems are failing more frequently.

This paradox is not a coincidence. What we are witnessing is closer to a structural outcome. Since the adoption of AI, both the volume of code and the frequency of changes have increased rapidly, but system stability has not improved at the same pace. In fact, the opposite is happening: the frequency and impact of failures are growing. The key point here is that this cannot be explained simply by saying “there are more bugs.” The issue is not the quality of the code itself, but the fact that the way code is created and deployed has fundamentally changed. In other words, while we can now produce code faster, our ability to control its impact on the system has not evolved at the same speed.

At this point, we need to ask an important question. If productivity increases, shouldn’t quality improve as well? The time saved could go into more testing and more reviews, and systems should become more stable as a result. Yet reality is moving in the opposite direction. Without understanding this contradiction, it is impossible to fully grasp the development environment after the introduction of AI. What we are facing is not a simple technical issue, but a structural problem caused by a broken balance in the development process. And this is not limited to a specific organization or incident; it is becoming increasingly visible across the entire industry.

In this context, recent incidents at Amazon should not be seen as isolated accidents, but as signals that compress and reveal the current transformation. They are not just problems of a single company, but examples of how software development is evolving after the adoption of AI. This article begins from that point—examining what remains as a problem after speed has been solved, and why systems are becoming more unstable.

The Incident Is Not Simple — Structural Signals from Amazon Outages

The two recent outages at Amazon are too significant to be dismissed as mere “bad luck.” One incident involved a shopping service outage lasting approximately six hours, and another affected internal AWS tools for about thirteen hours. On the surface, both appear to be ordinary problems introduced during software changes. However, internal analyses consistently mention two key terms: “AI-assisted code changes” and “high blast radius.” These terms suggest a structural shift beyond simple bugs or mistakes.

The phrase “high blast radius” is particularly significant. Bugs have always existed, and deployment mistakes are nothing new. However, past incidents were usually limited in scope—affecting specific features or subsets of users. In contrast, these recent cases involved changes that propagated across entire systems. This indicates not just a decline in code quality, but a fundamental shift in how changes spread within systems.

This shift is deeply connected to the nature of AI coding tools. AI can generate code quickly, but understanding the broader system-level impact of that code remains a human responsibility. The problem lies in the separation of these capabilities. Code is generated rapidly, but the process of understanding and controlling its impact remains slow and limited. As this imbalance accumulates, small changes increasingly lead to disproportionately large consequences.

Another critical detail in these incidents is the internal acknowledgment that “new GenAI usage patterns are not yet fully validated.” This goes beyond a tool-level issue—it reflects how organizations are adopting new technologies. Previously, clear best practices and safety mechanisms existed. With AI, however, these standards are still emerging while adoption accelerates. As a result, organizations are producing code in new ways without having established systems to fully control it.

If we interpret these incidents as mere failures, we miss their deeper meaning. Instead, they should be read as signals of how development processes are changing after AI adoption. Code generation speed has increased, but the ability to control its system-wide impact has not kept up. The result is not just more failures, but larger failures. This naturally leads to the next question: how are organizations attempting to address this problem?

The Organization’s Instinctive Response — Senior Approval as a Safety Mechanism

In response to these challenges, Amazon’s approach is both intuitive and symbolic. They introduced a policy requiring prior approval from senior engineers for AI-assisted code changes made by junior and mid-level engineers. On the surface, this appears reasonable. It is natural to expect that experienced engineers can prevent risky changes through final review. Indeed, many organizations have historically relied on similar mechanisms, with code review serving as a primary tool for ensuring stability.

However, a deeper look reveals that this is less about improving quality and more about redefining responsibility. When issues arise from AI-generated code, it is not easy to determine who is accountable. AI itself bears no legal responsibility—it is merely a tool. Ultimately, organizations must assign responsibility back to humans, and this is where “senior approval” emerges as a mechanism. It is not so much a technical solution as it is a practical approach to risk management.

This policy also carries another implication: it is a deliberate attempt to slow things down. AI was introduced to accelerate development, yet to control the resulting risks, organizations are now reintroducing constraints that reduce speed. This creates an interesting paradox. Technology pushes for acceleration, but organizations cannot fully absorb that speed and instead reimpose human validation to regulate it. Inevitably, this introduces bottlenecks, with the burden often concentrated on senior engineers.

The key point here is not whether this measure solves the problem, but why such a measure becomes necessary in the first place. Organizations inherently resist uncertainty. In large-scale systems, even a small mistake can lead to massive consequences, so some form of control mechanism is always required. AI coding tools dramatically increase productivity, but they also introduce unpredictability. And organizations, unable to tolerate that unpredictability, revert to human-centered control structures.

Ultimately, this policy is not just about “strengthening reviews.” It is an attempt to restore balance after the disruption caused by AI adoption, and to constrain risk to a manageable level. However, whether this approach is a fundamental solution remains questionable. At this point, we are led to a deeper question: is the code review mechanism itself still valid in this new environment?

Misconceptions About Code Review — Reviews Do Not Guarantee Quality

Code review has long been regarded as one of the most important mechanisms for ensuring software quality. Many organizations have used reviews to catch bugs early, correct design issues, and maintain a consistent level of code quality across teams. Naturally, this led to the belief that “strengthening reviews improves quality.” Amazon’s decision to enforce senior approval is also based on this assumption. However, this premise is far more fragile than it appears. Code review is not fundamentally a mechanism that guarantees quality, but rather a process that reveals it.

For reviews to function effectively, one critical condition must be met: the reviewer must fully understand the context of the code. They need to know what problem the code is solving, what role it plays within the system, and how it interacts with the existing structure. Only then can meaningful feedback be provided. However, AI-generated code often fails to make this context explicit. While the code may appear to function plausibly on the surface, the reasoning behind its implementation and the assumptions it relies on often remain unclear. In such cases, the reviewer is not simply reading code—they are reconstructing its intent.

This process is more costly than it seems. When code quality is low or design consistency is lacking, understanding it becomes significantly more difficult. The saying “bad code takes longer to understand than to fix” exists for a reason. AI-generated code shares similar characteristics. It is produced quickly, but its internal intent is opaque. As a result, reviews become increasingly burdensome, and reviewers spend more time interpreting code than improving it. At this point, reviews no longer function as tools for improving quality—they become sources of bottlenecks.

Ultimately, the issue is not whether reviews are strengthened. It is whether the review mechanism itself is still suited to the current environment. Before the introduction of AI, there was a relative balance between the difficulty of writing code and understanding it. Today, code generation has become extremely easy, while the difficulty of understanding and validating it remains unchanged. In this imbalance, strengthening reviews is more likely to expose bottlenecks than to resolve the problem. And these bottlenecks naturally lead to the next stage—we must examine where bottlenecks are shifting within the development process as a whole.

The Shift of Bottlenecks — From Writing Code to Understanding Code

In traditional development processes, writing code was the most resource-intensive stage. Developers had to understand requirements, design solutions, and implement code themselves. This process required time and deep concentration. While reviews and testing were important, code writing dominated the overall workflow. As a result, most productivity improvements focused on this stage. The emergence of better languages, frameworks, and tools reflects this focus, and AI coding tools have pushed this optimization to an extreme.

The moment code writing became easy, the bottleneck began to shift elsewhere. The question is no longer “how to write code,” but “can we understand what this code does?” AI can generate code rapidly, but understanding that code remains a human responsibility. More importantly, this understanding has become more difficult than before. Human-written code tends to reflect consistent intent and style, but AI-generated code often mixes patterns and can be unnecessarily complex or overly generalized. This makes comprehension significantly harder.

This shift is transforming the role of developers. Previously, the key skill was the ability to write good code. Now, the more critical skill is the ability to quickly understand and evaluate code. However, this capability does not scale easily. While code generation speed can increase exponentially, human comprehension cannot. This imbalance creates a new bottleneck. As code accumulates faster, the burden of reviewing and understanding it grows even larger.

As a result, the development process loses its balance. On one side, code is generated explosively; on the other, validation and understanding struggle to keep up. In this state, strengthening reviews does not solve the fundamental issue. Instead, systems become increasingly unstable, and small mistakes are more likely to escalate into major failures. At this point, we are faced with a deeper question: why does increased productivity immediately translate into increased risk?

The Collision Between Speed and Control — Why Productivity Becomes Risk

In general, increased productivity is seen as a positive signal. If more work can be done in less time, organizations can grow faster and create more value. However, this principle does not always hold in software systems. Especially when technologies like AI coding tools dramatically accelerate productivity, the effect is not limited to faster development. Instead, it can disrupt the balance of the entire system. The reason lies in the fact that productivity and control operate at different speeds.

Faster code generation means faster changes to the system. More changes are introduced in shorter periods, and the system’s state evolves more rapidly. The problem is that the processes responsible for validating and controlling these changes do not keep up at the same pace. Testing, review, and monitoring systems are still largely human-driven and inherently limited in speed. As a result, systems are updated more quickly than they can be properly validated, allowing unverified changes to reach production. This leads to the accumulation of risk.
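The accumulation described above can be illustrated with simple arithmetic. The following is a minimal sketch, not a model of any real organization: the function name and all numbers are hypothetical, and it assumes validation capacity is a fixed number of changes per day.

```python
def unvalidated_backlog(changes_per_day, validated_per_day, days):
    """Return the backlog of unvalidated changes at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        # New changes arrive; validation removes at most its daily capacity.
        backlog = max(0, backlog + changes_per_day - validated_per_day)
        history.append(backlog)
    return history


# Hypothetical numbers: validation capacity stays fixed at 20 changes per day.
before_ai = unvalidated_backlog(changes_per_day=18, validated_per_day=20, days=5)
after_ai = unvalidated_backlog(changes_per_day=50, validated_per_day=20, days=5)

print(before_ai)  # [0, 0, 0, 0, 0]
print(after_ai)   # [30, 60, 90, 120, 150]
```

As long as generation stays below validation capacity, the backlog stays at zero. Once generation exceeds that capacity, the backlog grows every day, and each waiting change must either be delayed or shipped without full validation.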

In such an environment, even small issues can escalate easily. Changes deployed without sufficient validation can impact systems in unexpected ways, and these effects can spread rapidly. In modern distributed systems, a single change can cascade across multiple components, triggering chain reactions. What matters here is not the quality of any individual piece of code, but how changes propagate through the system. AI can produce plausible code at a local level, but it does not account for these global effects. This gap is precisely where large-scale failures emerge.

Ultimately, increased productivity inherently carries risk. The faster we move and the more changes we introduce, the more uncertainty the system must absorb. If that uncertainty is not properly controlled, productivity itself becomes a threat to system stability. Amazon’s decision to reintroduce senior approval as a way to slow things down reflects this realization. However, simply reducing speed does not resolve the issue. We must now take the next step and examine how the limitations of AI itself connect to this problem.

A More Fundamental Problem — AI Does Not Understand “Why”

As discussed in the previous section, the imbalance between productivity and control is not merely a process issue. Beneath it lies a more fundamental limitation: AI can produce the “form” of code, but it does not understand the “reason” the code should exist. We often say that AI writes code well, but that statement is only true in a very limited sense. AI can generate code that appears plausible within a given context, but it cannot judge what that code means within the broader system. This difference is larger than it seems, and in real systems, it leads to decisive consequences.

Software is not simply a collection of functioning code. Every piece of code carries design intent, constraints, and relationships with the rest of the system. Even modifying a single function requires considering where it is called, how data flows through it, and what happens if it fails. However, AI produces locally optimal solutions without fully understanding this structural context. As a result, code may appear correct at the unit level but behave unexpectedly at the system level. This is precisely where the problem of “high blast radius” emerges.
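One way to make “blast radius” concrete is to treat the system as a dependency graph and ask which components directly or transitively depend on the one being changed. The sketch below assumes such a graph is available; the service names and the `blast_radius` helper are hypothetical and only illustrate the idea.

```python
from collections import deque


def blast_radius(depends_on, changed):
    """Return every component that directly or transitively depends on `changed`."""
    # Invert the graph: for each component, record who depends on it.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Breadth-first walk from the changed component through its dependents.
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    affected.discard(changed)  # only relevant if the graph contains a cycle
    return affected


# Hypothetical service graph: component -> components it depends on.
services = {
    "checkout": ["cart", "pricing", "auth"],
    "cart": ["pricing", "auth"],
    "search": ["catalog"],
    "pricing": ["config"],
    "catalog": ["config"],
    "auth": ["config"],
    "config": [],
    "banner": [],
}

print(sorted(blast_radius(services, "banner")))
# []
print(sorted(blast_radius(services, "config")))
# ['auth', 'cart', 'catalog', 'checkout', 'pricing', 'search']
```

Two changes of the same size can look equally harmless in a diff, yet a change to `banner` affects nothing else while a change to `config` reaches six other components. That difference is invisible at the unit level.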

What makes this issue more difficult is that humans do not intuitively recognize this limitation. AI-generated code is often clean, syntactically correct, and maintains a certain level of quality. Naturally, we conclude that “this is good enough.” But that judgment is based on surface-level completeness, not on system-level impact. As this misconception accumulates, more changes are introduced into systems without sufficient understanding. Ultimately, the problem is not the performance of AI itself, but the false confidence its output gives to humans.

At this point, we must accept an important fact. AI is a code generation tool, not a design reasoning tool. These two roles are fundamentally different. No matter how capable AI becomes at generating code, if it cannot understand design and context, its output can destabilize entire systems at any time. If we fail to acknowledge this limitation, we will continue to repeat the same problems. And any solution chosen while ignoring this limitation will inevitably create new problems.

The Wrong Approach — The Limits of Strengthening Validation

The response many organizations are currently adopting is relatively straightforward: increase reviews, add more approval steps, and enforce stricter validation. Amazon’s senior approval policy is a clear example of this approach. On the surface, this seems reasonable. If problems occur, add more checkpoints to filter them out. However, this approach does not address the root cause. Instead, it merely shifts the problem and concentrates the burden at specific points.

The biggest limitation of strengthened validation is that it targets the “result.” It attempts to identify issues in already generated code, leaving untouched the problems that occurred during the generation process itself. With AI-generated code, where intent and context are often unclear, it is extremely difficult to fully understand and validate the code during review. Reviewers end up examining only parts of the code within limited time, making it difficult to assess system-wide impact. In this situation, strengthening validation is more likely to provide psychological reassurance than to deliver real improvements in quality.

This approach also introduces new bottlenecks. If every change requires senior approval, the workload inevitably concentrates on senior engineers. They must review an increasing volume of code, leading to fatigue and making it harder to maintain consistent judgment quality. As a result, the system loses balance between “rapidly generated code” and “slowly reviewed code.” Despite increased productivity, overall development speed may stagnate or even become unstable.

More fundamentally, this approach misdefines the problem. The issue is not the outcome of the code, but the way the code is produced. Focusing solely on validation is closer to managing symptoms than addressing causes. While it may offer short-term relief, it introduces greater complexity and cost in the long term. We must change the question. Instead of asking, “How can we validate better?” we should ask, “How can we generate more safely?”

The Necessary Shift — Control at the Generation Stage

At this point, the direction becomes clear. If the core problem lies in the generation process, then the solution must also begin there. This means designing how code is produced, rather than merely validating what has already been created. This is not just a matter of using tools differently—it requires restructuring the entire development process. AI must no longer be treated as a simple productivity tool, but as a controllable component within a system.

The first step is to ensure that only understandable code is generated. The scope of AI-generated code should be constrained, changes should be kept small, and generation should occur within clearly defined contexts. This reduces AI’s freedom but increases predictability. At the same time, developers themselves must adapt. Instead of passively accepting AI output, they must actively review and understand it. This process—self-review—must become a fundamental step. It is not a superficial checklist, but a level of understanding where one can explain the intent and impact of the code.
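The idea of constraining scope and keeping changes small can be expressed as a check that runs before a change is even submitted for review. This is a minimal sketch under stated assumptions: the `FileChange` type, the `scope_violations` function, the limits, and the paths are all hypothetical, and a real setup would derive the change list from the version control system.

```python
from dataclasses import dataclass


@dataclass
class FileChange:
    path: str
    lines_changed: int


def scope_violations(changes, allowed_prefixes, max_files=5, max_lines=200):
    """Return a list of reasons the change set exceeds its declared scope."""
    violations = []
    if len(changes) > max_files:
        violations.append(f"touches {len(changes)} files (limit {max_files})")
    total = sum(c.lines_changed for c in changes)
    if total > max_lines:
        violations.append(f"changes {total} lines (limit {max_lines})")
    for c in changes:
        # str.startswith accepts a tuple of candidate prefixes.
        if not c.path.startswith(tuple(allowed_prefixes)):
            violations.append(f"{c.path} is outside the declared scope")
    return violations


# Hypothetical change sets for a task declared as limited to "billing/".
small = [
    FileChange("billing/invoice.py", 40),
    FileChange("billing/tests/test_invoice.py", 60),
]
wide = small + [FileChange("shared/config.py", 150)]

print(scope_violations(small, ["billing/"]))
# []
print(scope_violations(wide, ["billing/"]))
# ['changes 250 lines (limit 200)', 'shared/config.py is outside the declared scope']
```

A check like this does not judge whether the code is correct. It only keeps each generated change within a size and area that a person can realistically understand and explain.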

This approach may sacrifice some productivity in the short term. However, it significantly improves system stability and predictability in the long run. What matters is not how quickly code is produced, but how safely it can be integrated into the system. In the AI era, development should no longer be defined as “producing more,” but as “producing in a controlled way.”

Ultimately, we return to the starting point. AI has undeniably transformed the speed of development. But if we accept that speed without adjustment, systems cannot bear its weight. What is needed is not to slow down, but to change how we manage speed itself. Only when this shift occurs can we truly say that we are using AI effectively.

Conclusion — The Bottleneck in the AI Era Is Not Code, but Judgment

If we follow the discussion so far, we arrive at a natural conclusion. We are no longer living in an era where code-writing speed is the primary concern. AI has already solved this problem to a significant extent, and that speed will continue to increase. However, a new problem has emerged in its place: the ability to understand code, assess its impact, and take responsibility for it. This ability is difficult to automate and even harder to scale in the short term. As a result, the bottleneck in the development process is shifting from code creation to judgment.

This shift is not merely a change in roles—it is redefining what development itself means. In the past, the key question was “how well can you write code?” Now, it has become “how well can you understand and choose?” AI can rapidly generate countless possibilities, but deciding which one is correct remains a human responsibility. And this decision is not limited to the code level—it directly affects the stability of the entire system. As a result, developers are no longer simply producers of code, but are moving toward roles that interpret meaning and integrate it into systems.

This change is uncomfortable, because while machines have taken over speed, responsibility still remains with humans. Even if AI generates the code, the consequences of that code must ultimately be handled by people. As a result, organizations reintroduce structures of control and approval, and developers are required to perform more review and judgment. In this process, we come to realize an important truth: technology can raise productivity, but how that productivity is managed is ultimately a human choice. And if that choice is wrong, productivity quickly turns into risk.

Now let us return to the original question of this article. Why has speed increased, yet failures occur more frequently? The answer is simple. We have solved the problem of code generation, but left the problems of code understanding and judgment unresolved. These two are not independent; as one is solved, the importance of the other grows. As AI continues to advance, this imbalance will become more pronounced, and development processes must be redesigned accordingly.

At this point, the question is no longer whether we should use AI. We already are, and we inevitably will continue to do so. The real question is this: how will we control this speed, and by what standards will we make decisions on top of it? If we fail to answer this, we will continue to fail faster and on a larger scale. But if we can answer it, AI can become more than just a productivity tool—it can contribute to building stable systems.

Ultimately, development in the AI era is no longer a code-centric activity. It is about judgment, responsibility, and control. And at this point, the role of the developer does not disappear; it becomes clearer. The developer is no longer simply someone who writes code, but someone who understands systems and makes decisions. That is the role developers will take on in the future.