The software industry has long operated on a set of relatively stable rules. Code is written by humans, the copyright belongs to the person who wrote it (or to their employer), and licenses determine how that code can be used and distributed. Open source licenses such as the GPL, MIT, and Apache were all built on these assumptions. Human developers write code, other human developers read and modify it, and in that process, rights and responsibilities are clearly defined and passed along. This structure has supported the software ecosystem for decades, and the open source movement itself has grown on top of it.
However, in recent years, this assumption has begun to gradually break down. A new entity has entered the process of writing code: AI based on large language models. Tools like GitHub Copilot, Claude Code, and Cursor have already become deeply embedded in many developers’ workflows. Developers no longer always write code line by line themselves; instead, they describe functionality to AI and then modify or review the generated results. At first, this shift appeared to be simply a productivity boost. But over time, more fundamental questions have begun to emerge.
Who owns the code generated by AI? Is that code protected by copyright at all? If it is, does the right belong to the human developer, or does it belong to no one? And if AI generates code based on open source training data, should that output be subject to the licenses of that training data? These questions are not merely technical debates. They challenge the entire framework of software copyright and open source licensing that we have long taken for granted.
This series begins with those questions. A recent controversy in the Python ecosystem, the AI-driven rewrite of the chardet library, demonstrates how real these issues have become. The project maintainers rewrote the existing code using AI and, in the process, changed the license from LGPL to MIT. On the surface, it may look like a simple rewrite, but the case quickly sparked intense debate within the community. The core issue was whether AI-rewritten code is truly new code or a derivative work of the original, and whether relicensing it in this way is legally permissible.
This controversy is not confined to a single project. If AI-assisted rewriting allows licenses to be changed freely, the protective function of copyleft licenses could be fundamentally weakened. On the other hand, if AI-generated code is considered to have no copyright at all, an open source license, which depends on copyright to be enforceable, may have nothing to attach to in the first place. In that case, we could find ourselves in a legal void with no precedent.
In this series, we will follow these issues step by step. We will first examine how to determine the license of code rewritten using AI, and whether code generated by models trained on GPL-licensed code should also be subject to the GPL. We will then explore whether clean-room implementation, a historically important practice for avoiding copyright infringement in software, still holds in the age of AI, and whether AI-generated code can be recognized as a copyrightable work at all. Finally, we will consider whether copyleft licenses can retain their meaning in the AI era, and whether AI models themselves could become a new link in the software supply chain.
This is an area where no clear conclusions yet exist. Case law is still limited, and the technology itself is evolving rapidly. But one thing is certain: we have already crossed the threshold into a new era. When the way code is written changes, the rules governing ownership and sharing of that code must change as well. And this transformation is happening quietly, yet unmistakably.
This series is an attempt to trace those changes calmly, from the point where AI and open source intersect and raise new questions. These questions are not limited to developers. They are also about how the future software ecosystem will collaborate and under what rules knowledge will be shared.