Previous Documents Reveal OpenAI's Stance on U.S. AI Copyright Laws
Created on October 5|Last edited on October 5
This document, submitted by OpenAI to the United States Patent and Trademark Office (USPTO) in 2019, predates the wave of technological and commercial advances that followed. The submission lays out the legal and policy dimensions of using copyrighted material to train Artificial Intelligence (AI) systems.
At the time of this submission, OpenAI was on the cusp of major changes, both in the capabilities of AI technology and in its own move toward commercialization. This makes the viewpoints presented in the document especially intriguing. They offer a snapshot of the legal and ethical considerations just before AI technologies, including OpenAI's, exploded onto the global stage in terms of both power and market presence. The submission addresses three key points:
1. Fair Use Under Current Law: OpenAI argues that the training of AI systems using copyrighted material falls under the fair use doctrine as defined in 17 U.S.C. § 107. They emphasize the "transformative" nature of the AI training process, which is a crucial factor considered in legal assessments of fair use.
2. Policy Considerations Support Fair Use: OpenAI states that the policy objectives underlying the fair use doctrine, which include the promotion of science and arts, are furthered when AI systems are trained on existing works. They argue that the transformative use of copyrighted material for training AI aligns well with these objectives.
3. Need for Legal Clarity: OpenAI notes that the current legal ambiguity surrounding the copyright implications of training AI systems imposes a burden on AI developers. They argue for the need for authoritative resolution to reduce the costs and uncertainties faced by developers.
The Four Factors of Fair Use
The document delves into the four-factor test for fair use, specifically focusing on its implications for AI training. The "four-factor test for fair use" is a legal framework used to determine whether a particular use of copyrighted material is considered "fair use" under U.S. copyright law, and thereby exempt from infringement liability.
1. Purpose and Character of Use: AI training is inherently transformative, argues the document. While the copyrighted works are primarily intended for human consumption, their use in AI training serves an entirely different goal: to help the model understand patterns in human-generated media. Furthermore, commercial intent does not necessarily preclude fair use, as courts have ruled in the past.
2. Nature of the Copyrighted Work: This factor considers whether the copyrighted work is fictional or non-fictional, but it rarely influences the fair use determination. For AI, which could be trained on various forms of media, this factor becomes even less significant.
3. Amount and Substantiality: While AI systems might use nearly the entire content of the copyrighted works, the document argues that the key question is not the quantity copied but whether the copies are made available to the public as a competing substitute. If the corpora used in training aren't publicly accessible, this leans in favor of fair use.
The reasoning behind this is that if the corpora are kept private and not shared with the public, then they are not serving as a substitute for the original copyrighted material. In other words, people can't use the corpora as an alternative to purchasing or legally obtaining the original works, so the market for those original works is not negatively impacted.
4. Effect on Market Value: The document posits that AI training should not impact the market value of copyrighted works, since the training copies are consumed by machines and not humans.
Analogous Legal Cases
The document cites prior cases to bolster its argument. The well-known Authors Guild v. Google case found that Google’s digital scanning of millions of books for its searchable database was fair use. In Authors Guild v. HathiTrust, HathiTrust had similarly scanned whole copyrighted books into a searchable database, and that use also passed the fair use test. In that case, the emphasis was on the transformative nature of the use, aligning with the first factor of the fair use test.
A Stronger Case for AI
Finally, the document asserts that AI training is even more transformative than the uses in these prior cases. Rather than merely increasing access to copyrighted works, AI training uses the data to generate entirely new content.
In conclusion, the document not only builds a compelling case for viewing AI training through the lens of the fair use doctrine but also sets the stage for a fascinating legal frontier. It underlines the challenges and opportunities posed by the collision of artificial intelligence and intellectual property law, highlighting the complexities tied to diverse forms of media and an ever-changing legal landscape.
Given the transformative role that AI is playing in society, it becomes difficult to envision a future where its progression is stifled by stringent copyright laws. Such an outcome could dampen innovation, constrain scientific discoveries, and limit the technology's potential to tackle some of humanity's most pressing issues. Therefore, the discussion around fair use and AI training is not merely academic; it's a pivotal legal quandary that could dictate the pace at which we advance into a new era of machine intelligence.
Tags: ML News