
OPINION

The Problem With Suing Gen AI Companies for Copyright Infringement


Microsoft’s deployment of OpenAI’s ChatGPT technology has been extremely popular, and its surprise rollout of this advanced generative AI system caught companies like Google and Apple napping.

Google is responding aggressively to the threat and spinning up its own generative AI solution, Bard, but both Google and OpenAI are facing class-action lawsuits alleging copyright violations tied to the massive amounts of data used to train these systems.

The plaintiffs in these suits likely don’t understand the implications for their careers should they be successful. I’m not talking about repercussions from Microsoft, Google, or others, but about the possibility that the way they themselves were trained could also fall within any related ruling and result in their being sued in the future by the people from whom they learned.

Let’s explore suing generative AI companies this week, and we’ll close with my Product of the Week, a new laptop from HP that may be perfect for you if you travel a lot for work.

Litigation Is Dangerous

Unfortunately, I’ve had a lot of litigation experience. I was assigned to IBM legal for a time in contracts, managed my own litigation for a couple of decades, and have been selected as an expert witness on several occasions. I also trained to be a lawyer before shifting to a very different career path.

I’ve learned that litigation isn’t anything like it is portrayed on TV. Both sides enter the courtroom with opposing views of reality, and the judge and/or jury listen to both sides before picking the most compelling argument as the winner. The winning side, which may have been in the wrong, feels vindicated, and the losing side generally feels cheated.

The outcome can have unintended and dire consequences for the losing side that can be far worse than if they had left the entire thing alone in the first place or settled without a trial. Appeals generally cost around $40,000 and are rarely successful. The initial trial costs can range from over $10,000 to hundreds of thousands of dollars before judgment, and the judgments can be very costly on top of that.

So, before suing someone, you need not only to make an honest assessment of whether you are likely to win but also to weigh the potential unintended consequences of winning or losing. This is where I think the people suing the generative AI platforms are in trouble: not only are they unlikely to win, but should they win, the result may cost them their careers.

Let me explain.

How Generative AI Is Trained

Generative AI is trained by looking at massive amounts of data and the patterns within it. That training is then distilled into what we call an inference model: a vastly smaller data set, with the individual contributors effectively removed, that serves as the foundation for the AI to operate.

Put another way, AIs observe digitized data at a massive scale that renders individual contributors unidentifiable. This observation process forms an amalgamated body of knowledge that constitutes the AI’s brain.

Depending on the size of the resulting data set, it should be impossible, absent the transparency tools that do exist in some of the latest AIs, to trace the AI’s behavior back to any individual who intentionally or unintentionally supplied the training data.

For instance, learning how to be a comedian might require a training set of many comedians’ audio and video broadcasts. Based on feedback from the audience or a training operator, the AI would then learn which jokes were and were not funny. It would then derive its comedy routine from what it learned without relying exclusively on any one contributor.
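To make the training-versus-inference distinction concrete, here is a minimal, purely illustrative Python sketch. It is not how OpenAI, Google, or anyone else actually builds these systems, and the contributor names and jokes are invented. It simply folds text from several sources into one blended statistical model and then generates from that model, at which point no single contributor is directly identifiable in the output.

import random
from collections import Counter, defaultdict

# Hypothetical "contributors" -- stand-ins for the thousands of sources a real
# generative AI would ingest. The names and text are invented for illustration.
contributors = {
    "comedian_a": "why did the chicken cross the road to get to the other side",
    "comedian_b": "i told my wife she was drawing her eyebrows too high she looked surprised",
    "comedian_c": "i used to play piano by ear now i use my hands",
}

# "Training": fold every contributor's words into one aggregate bigram model.
# Per-contributor structure is discarded; only blended word-transition counts remain.
model = defaultdict(Counter)
for text in contributors.values():
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1

# "Inference": generate new text from the blended statistics alone.
def generate(start_word: str, length: int = 10) -> str:
    words = [start_word]
    for _ in range(length):
        options = model.get(words[-1])
        if not options:
            break
        words.append(random.choices(list(options), weights=list(options.values()))[0])
    return " ".join(words)

print(generate("i"))
# The aggregate model keeps no record of which contributor supplied which
# transition, which is the traceability point made above.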

The question to be answered is whether the result infringes on the copyrights of anyone who unintentionally and without permission helped create the training data set.

The Unanticipated Problem

The unexpected problem is that, like AIs, we aren’t born with an intrinsic knowledge of how to do much of anything. We learn by observing others, and our education comes from reading about events and people who were once alive or who were created fictionally to entertain or to drive home a particular point.

When it comes to a trade like stand-up comedy, we tend to learn by watching other comics. Comedy is a career path that can lend itself to copying peers. The difference is that humans don’t have the mental capacity or time to learn from more than a handful of intentional or unintentional mentors, while a computer can consume information from thousands of individuals in a moment.

So, if a computer learning from many comedians turns out to be illegal, wouldn’t it then follow that an actual human comedian learning from a far smaller number is infringing on the rights of their comedy peers as well? The only real difference between how AI currently learns and how people learn is the speed at which the learning happens and the amount of training data that is observed.

Should those suing OpenAI and Google be successful, the same case law could be used against them, resulting in what would likely be expensive penalties.

Since most work is generally learned by observing others and is potentially derivative, couldn’t anyone sue anyone else who was trained on data that originated from the plaintiff?

In other words, using the comedian premise, if these plaintiffs are successful, couldn’t other comedians sue them over a similar training methodology? Some comedians could then be barred from performing for infringement if their jokes appear to have come from other people who now also want to be compensated.

Wrapping Up

Generative AI is now learning autonomously and advancing at an incredible and almost unbelievable pace. It is trained on vast data stores that may contain critical information about you and your spouse. That training process will likely be challenged because these systems will begin to replace many of the people who unintentionally contributed their data to it.

But given that virtually all human knowledge, and the way it is conveyed, effectively comes from someone else, the concept of suing for a piece of the result seems ill-conceived and could adversely affect anyone who learns from others in the future.

Finally, the training set has no end of life, meaning the collected knowledge could live on for centuries after a contributor’s death, assuring a very limited form of digital immortality.

As a result, I doubt the plaintiffs in these cases will prevail, and should they prevail, the judgment could have far more damaging implications for how we are trained than they anticipated.

Tech Product of the Week

HP Dragonfly Notebook PC G4

My favorite laptop of all time remains the HP Folio laptop with Qualcomm technology because of its massive 21-hour battery life. HP followed it up with an Intel-based, business-focused Folio with only around six hours of battery life, which broke my heart. HP’s new Dragonfly G4 is also Intel-based but jumps up to around 13 hours of battery life, which should be far better.

At around $1,300 for a base configuration, the HP Dragonfly G4 provides decent but not overwhelming performance for a premium-class notebook. This is a business-class notebook, and it has Intel’s vPro solution inside to assure compliance with corporate standards.


Dragonfly Notebook PC G4 (Image Credit: HP)


This laptop has a very clean fit and finish, one of the best keyboards I’ve tried so far, a decent webcam, impressive speakers, and surprisingly good display color accuracy. The Dragonfly G4 has two unique features: “auto-camera-control” and “auto-keystone.”

The first lets you use multiple cameras simultaneously when streaming, so both your face and the object(s) the second camera sees appear in the same image. The other lets you capture items at an angle with that second camera while making them appear as if the camera were directly overhead.

Given how many people struggle in video conference meetings to show content, these two features are a godsend. I’m surprised other PC OEMs haven’t done anything similar.

At 2.2 pounds, HP’s Dragonfly G4 falls into the ultra-light category and shouldn’t make you feel like Quasimodo when carrying it in your backpack. It also has a selection of optional screens, WAN options, and a few optional Intel processors.

While I still prefer the feel and look of the HP Folio laptops, this HP Dragonfly G4 is not bad looking at all. Because of all of its improvements and increased battery life, it is my Product of the Week.

The opinions expressed in this article are those of the author and do not necessarily reflect the views of ECT News Network.

Rob Enderle

Rob Enderle has been an ECT News Network columnist since 2003. His areas of interest include AI, autonomous driving, drones, personal technology, emerging technology, regulation, litigation, M&E, and technology in politics. He has an MBA in human resources, marketing and computer science. He is also a certified management accountant. Enderle currently is president and principal analyst of the Enderle Group, a consultancy that serves the technology industry. He formerly served as a senior research fellow at Giga Information Group and Forrester. Email Rob.
