High Noon Showdown: AI Content Theft

Paul Gerbino, President of Creative Licensing International, details the profound moral and ethical issues surrounding AI, content scraping, and copyright infringement. At stake is the entire value proposition of media. This feature first appeared on the CLI blog and is republished here with kind permission.

The internet, once hailed as a utopia of information sharing, is facing a copyright crisis fueled by the rise of Artificial Intelligence (AI). At the heart of the issue lies a fundamental disagreement: are the vast troves of online content truly “freeware,” as some AI leaders suggest, or protected by intellectual property laws requiring permissions and royalties?

In our last issue of the Content Licensing Brief, we published “A Patriotic View of Copyright.” In it, Marie Griffin wrote about the constant tension between intellectual property rights and the free flow of information that has existed since the founding of the USA. She explained that copyright laws attempt to navigate this complexity.

Then I read three articles: the first from Search Engine Land, “Microsoft AI CEO: Web content is ‘freeware’”; the second from Forbes, “AI Startup Perplexity Is Directly Ripping Off Content From News Outlets”; and the third from Gizmodo, “Perplexity Is Reportedly Letting Its AI Break a Basic Rule of the Internet.” They reinforced the reality that moral and ethical relativism is at work when it comes to copyright.

This relativism gained traction in my mind with comments by Microsoft AI CEO Mustafa Suleyman, who declared web content “freeware.” This statement stands in stark contrast to Microsoft’s vigorous defense of its own copyrighted software. Adding fuel to the fire is Perplexity, an AI startup, which reportedly disregards the Robots Exclusion Protocol (REP) – a cornerstone of web etiquette – to scrape content for training its AI models, even when website owners explicitly forbid such access.

Destroying Intellectual Property

These developments have left content creators scrambling. Ricky Sutton, of Future Media, likened the situation to “the greatest destruction of human intellectual property since the Library of Alexandria.” While perhaps an exaggeration, it underscores a genuine concern for the livelihood of those who generate the content on which AI thrives.

The crux of the problem lies in a historical misunderstanding. Media companies, by offering “open web access,” inadvertently fostered the perception that online content is free. While readily available, this content was never truly free. It cost money to create, curate, and host – a cost often subsidized by advertising. However, weak enforcement of copyright notices and terms of use – and the media’s own mistake of referring to their web content as “free” – created a gray area that AI companies are now exploiting.

Even a search on Bing reveals the legal complexities involved. Copyright law remains unsettled regarding AI. Training AI models often involves copying copyrighted material, potentially infringing upon intellectual property rights. Further complicating matters, AI outputs can sometimes resemble existing works, blurring the lines of fair use.

Some believe existing copyright laws are sufficient to combat content theft by AI. A recent Supreme Court case involving Andy Warhol’s artwork demonstrates the Court’s willingness to uphold copyright, even in cases involving artistic transformation. However, a separate case involving comedian Sarah Silverman suggests the courts may be hesitant to apply copyright law to certain AI outputs.

The difficulty lies in proving “substantial similarity” between original works and AI outputs. Traditional copyright claims hinge on direct comparisons between works, which become murky with AI-generated content. This ambiguity necessitates a reevaluation of the legal landscape.

Holding AI to Account

Should copyright laws be rewritten to address the challenges posed by AI? Can regulatory bodies interpret existing laws to provide better protection for content creators? Recent Supreme Court rulings limiting agency interpretation of laws make the latter seem unlikely.

One thing is clear: the current trajectory is unsustainable. Content creators cannot continue to be exploited to fuel the development of AI models that will ultimately generate profit for others. Publishers, media companies, and all creators of original content are the lifeblood of the AI industry. AI companies should be held accountable by paying a fair price for the very words, images, and sounds that power their advancements.

The future of AI innovation hinges on establishing a balance between technological progress and respect for intellectual property. We must foster a collaborative environment where creators and AI developers can work together to ensure a future where both can thrive. Copyright, and the investment of creators, must be respected.

Paul Gerbino
President, Creative Licensing International

Creative Licensing International (CLI) is an innovative consultancy in the information industry, founded in the fall of 2014 with the tagline “License, Monetize, Innovate.” CLI helps clients drive incremental revenue, increase brand reach, and leverage the monetary value of all forms of content across various delivery options. Clients include content creators and publishers in the B2B, B2C, and STM markets as well as the scholarly and academic market.
