Authored by Gautam Mehra, a student of Savitribai Phule Pune University
Abstract
This article reviews the emerging legal framework governing generative artificial intelligence (AI) and its overlap with copyright law. As generative AI systems increasingly produce content of a kind traditionally shielded by intellectual property protection, they pose novel challenges to existing legal regimes. The discussion centers on recent case law, such as Getty Images, Inc. v. Stability AI, Inc. and Doe v. GitHub, Inc., to illustrate the complex legal issues involved. The article examines the philosophical foundations of intellectual property rights and the technical realities of AI training methods, and it suggests regulatory approaches that balance innovation with creator protection. It explains the current state of play as legislatures and courts navigate these new issues and makes forward-looking suggestions for meeting the regulatory challenges that generative AI poses for copyright law.
Introduction: The Collision of AI and Copyright
The arrival of generative artificial intelligence has accelerated a paradigm shift in content creation. Where human imagination was previously the exclusive source of original works, advanced AI models now produce text, images, code, and other creative works at scale and with rising sophistication. This technological shift has created a legal dilemma: how can current copyright structures, established to safeguard human creative work, be applied to content produced by or with machines?
The law is only starting to grapple with this question, and nascent case law is revealing tensions between intellectual property protection and technological innovation. This article analyzes the state of copyright law as it applies to generative AI, considers landmark legal controversies that shed light on the central issues, and suggests responses to the regulatory challenges in this fast-moving area.
The Technical Basis of Generative AI
One must first grasp the technical foundations of generative AI systems to examine the legal implications accurately. Contemporary generative AI models, especially those built on deep learning and large language models (LLMs), operate by processing enormous datasets during a “training” process. These datasets frequently consist of millions of human-generated works—from articles and books to images and code—which the AI analyzes to identify patterns and relationships.
When a user provides input to a generative AI system, the system draws on this training to generate new material that reflects the patterns learned from the training data. This raises fundamental questions about whether these systems make a transformative use of copyrighted materials or infringe the rights of the original authors.
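To make the training and generation process described above concrete, the following is a deliberately simplified sketch in Python. It is not any vendor’s actual system: real models use deep neural networks rather than word counts, and the toy corpus, function names, and parameters here are assumptions chosen purely for illustration.

```python
# A minimal, illustrative "generative model": a bigram text generator.
# Assumption: the toy corpus below stands in for the millions of (often
# copyrighted) works used to train real systems.
import random
from collections import defaultdict

corpus = [
    "the quick brown fox jumps over the lazy dog",
    "the lazy dog sleeps while the quick fox runs",
    "a brown dog jumps over a quick fox",
]

# "Training": record which words tend to follow which in the corpus.
transitions = defaultdict(list)
for document in corpus:
    words = document.split()
    for current_word, next_word in zip(words, words[1:]):
        transitions[current_word].append(next_word)

def generate(prompt_word: str, length: int = 8) -> str:
    """Produce new text by sampling from the patterns learned during training."""
    output = [prompt_word]
    for _ in range(length):
        followers = transitions.get(output[-1])
        if not followers:  # no learned continuation for this word
            break
        output.append(random.choice(followers))
    return " ".join(output)

# "Inference": a user prompt yields novel text whose structure is derived
# entirely from the training data.
print(generate("the"))
```

Production systems replace these simple counts with billions of learned neural network parameters, but the underlying relationship is analogous: what the model can generate is shaped by the patterns extracted from its training data, which is why the provenance of that data is legally significant.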
The Philosophical Foundation of Copyright in the AI Era
Copyright law has traditionally been defended through several philosophical models, each of which is challenged by generative AI technologies.
The Lockean labor theory, which holds that creators earn ownership rights through their labor, becomes problematic when much of the “labor” is performed by an AI system. Likewise, the Hegelian personality-based justification, which treats creative works as an extension of their creator’s personality, raises the question of whether and how such protection could apply to AI-generated work.
American copyright law is chiefly rooted in utilitarianism: Article I, Section 8, Clause 8 of the Constitution gives Congress the authority to “promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.” This utilitarian basis stresses balancing incentives for creation against the public benefit of access to creative works.
The advent of generative AI upends these conventional justifications by introducing a new category of creator: the machine. This raises fundamental questions about whether copyright protection is warranted for AI-generated works and, if so, who the rightful owner is: the developer of the AI, the user who initiated the creation, or perhaps no one at all.
Key Legal Disputes Shaping the Field
Getty Images, Inc. v. Stability AI, Inc.
A leading case illustrating the conflict between generative AI and copyright law is Getty Images, Inc. v. Stability AI, Inc. In this still-pending case, Getty Images claims that Stability AI, the creator of the Stable Diffusion image synthesis model, used around 12 million images owned by Getty Images, along with related metadata, to train its model without permission or payment.
At its center are several significant claims:
- Unauthorized use of intellectual property for AI training
- Direct competition arising from this unauthorized use
- Trademark infringement through the creation of images bearing altered Getty watermarks
- Reputational harm from AI-generated images of questionable content carrying Getty’s watermark
This case departs from standard copyright infringement actions based on “substantial similarity” between works. Instead, it centers on the interaction between AI training techniques and pre-existing intellectual property protections, and on the downstream impact of AI-generated content that contains fragments of protected works.
The monetary stakes are considerable: Getty Images seeks $1.8 trillion in damages, underscoring the economic significance of this nascent legal frontier. The case is a key test of whether existing copyright regimes can adequately address AI training practices.
Doe v. GitHub, Inc.
Another major legal battle is Doe v. GitHub, Inc., a class action in which anonymous developers sued Microsoft, GitHub, and OpenAI. The plaintiffs allege that the companies violated Section 1202 of the Digital Millennium Copyright Act (DMCA) by using their code without permission to build the AI tools Codex and Copilot.
The developers maintain that the firms did not adhere to open-source licensing agreements, thereby infringing their intellectual property rights. OpenAI and Microsoft, for their part, maintain that the plaintiffs have failed to support their assertions with concrete examples of harm or of infringement of copyrighted materials.
One notable development was GitHub’s announcement of measures to attribute code generated by Copilot, potentially reducing legal hurdles around the technology. Nevertheless, the case raises broader issues about the role of AI within open-source communities and the legal obligations of businesses developing AI systems that learn from open-source code.
Critical Legal Issues at Stake
Ownership and Attribution
One of the key legal issues with generative AI is ownership. If an AI generates content independently, conventional notions of authorship break down. Existing intellectual property legislation generally assumes a human author, so it is unclear whether AI-generated works are eligible for copyright protection and, if they are, who owns those rights.
Several potential models of ownership have arisen:
- Developer ownership: Rights belong to the organizations that developed the AI system
- User ownership: Rights are owned by the users who prompt the AI to create specific content
- Joint ownership: Developers and users jointly own rights
- Public domain: AI-generated content is given no copyright protection
Each model presents distinct legal and practical challenges, and courts and policymakers have yet to reach a consensus.
Training Data and Fair Use
One crucial concern is whether using copyrighted material for training AI systems is fair use. The doctrine of fair use allows limited use of copyrighted material without permission for criticism, comment, news reporting, teaching, scholarship, or research.
Supporters of treating AI training as fair use argue that:
- The training process is transformative
- The training data is used for a purpose different from that for which it was created
- AI training does not compete directly with the market for the original content
Opponents argue that:
- Blanket ingestion of copyrighted material without permission or payment inherently erodes creators’ rights
- AI-generated content can substitute functionally for human-created content, destabilizing creative markets
- Commercial AI systems gain directly from unauthorized exploitation of other people’s intellectual property
Courts have not yet conclusively determined whether AI training constitutes fair use; cases such as Getty Images, Inc. v. Stability AI, Inc. may establish key precedents.
Liability for Copyright Infringement
Another legal hurdle is establishing liability when AI systems create content that may infringe existing copyrights. Multiple parties could potentially be held responsible:
- AI developers: For developing systems that can create infringing works
- End users: For instructing AI to generate potentially infringing works
- Platform providers: For distributing or hosting AI-generated content
The application of doctrines like contributory infringement, vicarious liability, and safe harbor provisions under the Digital Millennium Copyright Act remains unclear when dealing with AI-generated content.
Regulatory Approaches and Recommendations
Ethical Guidelines and Industry Standards
One foundational approach to addressing generative AI’s copyright issues is the adoption of robust ethical guidelines and industry standards. These help set clear expectations around acceptable uses, attribution, and intellectual property concerns.
Some of the essential components of such guidelines could be:
- Transparency requirements on training data sources
- Opt-out policies for creators who do not want their works to be used in training datasets (a brief sketch of how such a policy might be implemented follows this list)
- AI-generated content attribution standards
- Best practices for licensing training data
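One way the transparency and opt-out components listed above could be operationalized is sketched below. This is a hypothetical illustration only: the registry structure, item fields, and domain names are invented for the example and do not correspond to any existing standard or service.

```python
# Hypothetical sketch: honoring creator opt-outs and tracking provenance before
# training. The registry structure ("domains", "work_ids") and the item fields
# ("work_id", "source_url") are invented for illustration.
from urllib.parse import urlparse

def filter_training_items(items: list[dict], registry: dict) -> list[dict]:
    """Drop items whose source domain or work ID appears in the opt-out registry."""
    kept = []
    for item in items:
        domain = urlparse(item["source_url"]).netloc
        if domain in registry["domains"] or item["work_id"] in registry["work_ids"]:
            continue  # honor the creator's opt-out
        kept.append(item)
    return kept

# In practice the registry would be loaded from a published, machine-readable
# source; an in-memory dictionary stands in for it here.
registry = {"domains": {"example-photos.test"}, "work_ids": {"W-123"}}
crawl = [
    {"work_id": "W-123", "source_url": "https://example-photos.test/a.jpg"},
    {"work_id": "W-456", "source_url": "https://open-archive.test/b.txt"},
]

kept_items = filter_training_items(crawl, registry)
print(kept_items)  # only the non-opted-out item remains, with its source recorded
```

Retaining the source reference for each kept item also supports the transparency and attribution components above, since a developer can then disclose where its training data came from.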
Industry self-regulation may supplement legislative measures, enabling quicker response to changing technologies.
Hybrid Collaboration Models
Instead of perceiving AI as a substitute for human imagination, regulatory systems should promote hybrid collaboration models that treat AI as an augmentative tool for human creative processes. This approach acknowledges the complementary strengths of AI and human imagination and opens the door to new creative paradigms that uphold intellectual property rights while promoting innovation.
Legal frameworks might encourage such cooperation through:
- Copyright protection models for human-AI collaborative works
- Clear attribution guidelines in collaborative situations
- Licensing frameworks for AI-supported creative processes
International Harmonization
Given the global character of AI development and deployment, international harmonization of copyright practices is imperative. Without coordination, conflicting national laws could create compliance difficulties and invite regulatory arbitrage.
Encouraging cooperation among states to create harmonized standards and agreements would provide more uniform protection of intellectual property rights internationally. This may include:
- International agreements dealing with AI and copyright
- Uniform methods of AI training data licensing
- Harmonized definitions of authorship and ownership for AI-created work
Public Engagement and Stakeholder Involvement
Successful regulation demands input from a range of stakeholders, including creators, technology companies, legal experts, and the public. By engaging diverse viewpoints in conversations about AI-generated content, policymakers can ensure that regulatory strategies align with societal values and respond to the concerns of those most affected.
Public input should be central to developing an AI environment that prioritizes creator rights while supporting technological advancement. This may include:
- Public consultations on suggested policies
- Multi-stakeholder working groups to produce policy recommendations
- Regular evaluation of regulatory effects on creative industries
The Path Forward: Balancing Innovation and Protection
While courts and lawmakers wrestle with these new questions, several principles can inform the creation of sound legal frameworks for generative AI and copyright:
- Technological neutrality: The law should target outcomes, not particular technologies, so that it can adapt as AI systems change
- Proportionality: Legal measures should respond proportionately to real harms without unduly restricting innovation
- Transparency: AI developers must be transparent about the sources of their training data and how it is used
- Creator compensation: Mechanisms for appropriately compensating creators whose works are used to train AI should be investigated
By adhering to these principles, lawmakers can create a legal environment in which generative AI can continue to develop while the rights and interests of human creators are upheld.
Conclusion
The intersection of generative AI and copyright law presents one of the most significant legal conundrums of the digital era. As artificial intelligence systems increasingly produce creative content of a kind once reserved to human authors and protected by intellectual property law, they upend legal orders built around human creation.
Getty Images, Inc. v. Stability AI, Inc. and Doe v. GitHub, Inc. demonstrate the subtle issues at stake, ranging from the treatment of training data to copyright in AI-generated output. These disputes illustrate the balancing act between technology-driven innovation and protection for authors that will define the development of copyright law over the next decade.
Overcoming these challenges will necessitate careful regulatory strategies that reconcile competing interests—fostering AI innovation while safeguarding the rights of human creators. This could include setting ethical standards, encouraging hybrid collaboration models, seeking international harmonization, and ensuring strong stakeholder participation in policy-making.
The law governing generative AI will continue to evolve as courts issue rulings and lawmakers enact new legislation. What appears certain is that conventional copyright principles will need to be fundamentally reworked to address the distinctive challenges presented by AI-created content. The outcome of this transformation will define not just the future of AI innovation but also the character of human creativity in an increasingly AI-enabled world.
FAQ
Q1: Is AI-generated content copyrightable?
A: Existing U.S. copyright law typically requires human authorship for copyright protection. The U.S. Copyright Office has indicated that it will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author. However, works involving significant human creative input that use AI as a tool may qualify for copyright protection. The law remains in flux on this topic.
Q2: Who owns the copyright in material generated with generative AI?
A: This is legally unclear. Possible candidates for copyright ownership include the developer of the AI, the user who prompted the AI, both jointly, or no one, in which case the work falls into the public domain. Courts have not yet established clear precedent on this question, and ownership may depend on how much human creative contribution was involved and on the particular AI system being used.
Q3: Is it legal to train AI models with copyrighted materials?
A: This is a debated question of law. Some believe training AI on copyrighted works is fair use since it is transformative and doesn’t compete directly with the original works. Others feel using copyrighted materials without permission or compensation violates creators’ rights. Cases like Getty Images v. Stability AI will likely offer judicial insight.
Q4: What are the legal risks for companies when employing generative AI?
A: Companies using generative AI face several potential legal risks, including copyright infringement claims if the AI generates content that substantially resembles protected works, trademark infringement if the AI reproduces protected marks, and potential liability for using AI trained on unauthorized datasets. Companies should implement appropriate review processes and consider obtaining legal advice before deploying generative AI in commercial contexts.
REFERENCES
- Napitupulu, P. A., Sinaga, C. A. F., & Hasugian, A. L. P. (2023). The Implication of Generative Artificial Intelligence towards Intellectual Property Rights. West Science Law and Human Rights, 1(04), 274-284.
- Getty Images, Inc. v. Stability AI, Inc., as cited in Napitupulu et al. (2023).
- Doe v. GitHub, Inc., as cited in Napitupulu et al. (2023).
- U.S. Constitution, Article I, Section 8, Clause 8, as cited in Napitupulu et al. (2023).
- Digital Millennium Copyright Act (DMCA), Section 1202, as referenced in Napitupulu et al. (2023).
- Bleistein v. Donaldson Lithographing Co. (1903), as cited in Napitupulu et al. (2023).