Jack W. Plunkett, CEO, Plunkett Research, Ltd.
What’s behind the massive licensing deals that are suddenly popping up between major publishers, such as News Corp., and the largest generative AI firms, such as OpenAI? (The News Corp.-OpenAI deal is said to be worth about $250 million to News Corp. for certain content rights over five years. Similar recent deals include those with the Associated Press, Condé Nast, Le Monde and Dotdash Meredith.)
Two Key Ways in Which Artificial Intelligence Uses Third-Party Content
Understanding these developments can help you understand the future of both Gen AI and AI-based search tools, and how these booming technologies may assist, impede or contour your own work if you are a researcher, writer, analyst or publisher.
First, I believe an understanding of two nuances in AI’s use of content is vital. I’ll keep this simple:
- As you know by now, Gen AI is based on large language models (LLMs). This means that the technology can “generate” summaries, articles, responses, blog posts, etc., based on the text (written by others) that has been ingested into the LLMs. (The larger the language model, the better, and the more well-crafted the question, the better.) You might reasonably say that a brilliantly designed Gen AI system can write original materials based on background text that it has studied beforehand, in a somewhat human manner. Eventually, when meticulously engineered, and when trained on an extremely wide variety and depth of content, Gen AI systems may not need to directly plagiarize content, but instead will write very original responses. This need for vast amounts of content and engineering is why the minimum investment needed to establish an LLM is considered to be $100 million plus, and I feel that underfunded startups are finding even that amount to be woefully inadequate.
- AI can also be utilized to display (not create) full-text answers to search queries. This is not “generative,” it is AI-assisted search. Despite all of the buzz we hear about Gen AI, this amped-up search is one of the most powerful capabilities of AI when dealing with the written word, and I believe it will disrupt and redefine the way we all use search over the very near future.
AI-Driven Search Effects on Small to Mid-Size Data and Content Firms
Authors and publishers may or may not have significant rights that would enable them to stop Gen AI platforms from ingesting their works for the purpose of training their LLMs. Intellectual property attorneys and the courts of law in which they operate are in for many, many years of sorting this out. I am reminded of the uproar ignited when Google Books began scanning huge numbers of volumes with the intent of enabling Google-based search of the books’ contents. Plunkett Research happily participated initially, submitting certain of our Plunkett’s Industry Almanacs in ebook format. However, before long we decided this was not a good business practice for us, as extensive and important segments of text were being displayed for free, negating the need for readers to access the books through normal commercial channels.
On the other hand, content owners may have existing rights to control the extent (beyond fair use) to which words and images from their publications can be directly quoted and displayed in search results. A desire by search companies to directly quote news, in-depth articles and up-to-the-minute images in their platforms is specifically driving the big checks that are being written to publishers. OpenAI is beta testing its SearchGPT tool precisely to power the next generation of search results.
Today, we remain in Wild West-like days of competition in AI platforms and related law. Authors and publishers may add “No AI Training Without a License” notices to their works, which we now do with Plunkett’s Industry Almanacs. This may be of little force and effect. Also, a website’s robots.txt file (a plain-text file placed at the root of the site, separate from its HTML pages) can hold similar restrictions aimed at specific crawlers.
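For readers who want to see what such a robots.txt restriction looks like in practice, here is a minimal sketch in Python. The robots.txt content, the example.com address and the "SomeOtherBot" name are hypothetical; GPTBot is the user-agent name OpenAI publishes for its crawler. Keep in mind that compliance with robots.txt is voluntary on the crawler's part, so this shows only what a well-behaved crawler would conclude.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block OpenAI's GPTBot crawler from the
# whole site while leaving all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and the /almanac/ path are placeholders, not a real site layout.
    page = "https://www.example.com/almanac/"
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, "allowed:", can_crawl(ROBOTS_TXT, agent, page))
```

A publisher could run a check like this against its own robots.txt to confirm that the rules say what was intended before relying on them.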
Platforms and Services Emerge to Manage Content Licensing
On the other hand, many website owners may want to encourage AI platform referrals (hopefully with links) to their content, in which case they can design their pages with layers of subheads (e.g., H2 and H3 segments in HTML) that help guide AI software to a rapid understanding of the category of the blocks of text that are displayed. Meanwhile, not surprisingly, at least a few web-based services have sprung up to act as brokers between publishers (large and small) and AI companies. Their services may include model licenses and assistance in building API connections to publishers’ data—behind paywalls. Such companies include Tollbit and ScalePost. Getty, owners of iStock and other digital image platforms, has taken things further by entering into multiple major contracts enabling Gen AI companies to train on the photos, videos and art that are contained in Getty platforms—thus enabling participating photographers and artists who created the images to be paid automatic (and modest) royalties for AI usage.
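To illustrate the earlier point about layered subheads, the sketch below shows how software might pull an outline out of a page's heading tags. It is only an assumption about how a crawler could read structure, not a description of how any particular AI platform works, and the sample page and the HeadingOutline and extract_outline names are invented for the example.

```python
from html.parser import HTMLParser

# Hypothetical page fragment with layered subheads.
SAMPLE_PAGE = """
<h1>Industry Overview</h1>
<p>Introductory text.</p>
<h2>Market Size</h2>
<p>Details on market size.</p>
<h3>Regional Breakdown</h3>
<p>Details by region.</p>
<h2>Key Trends</h2>
<p>Details on trends.</p>
"""


class HeadingOutline(HTMLParser):
    """Collects (level, text) pairs for h1, h2 and h3 headings in page order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None  # heading level currently open, if any
        self._buffer = []   # text pieces seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def extract_outline(html: str):
    """Return the heading outline of an HTML string as a list of (level, text)."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


if __name__ == "__main__":
    for level, text in extract_outline(SAMPLE_PAGE):
        print("  " * (level - 1) + text)
```

A page whose outline reads cleanly when extracted this way is one whose blocks of text are easy to categorize, for people and for software alike.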
Near-Term Trends in the Relationship Between Content Providers, Search Engines and AI Engines
OpenAI and its competitors are moving with blinding speed in launching disruptive AI tools. While the use of AI tools such as Perplexity and ChatGPT for business, creative and schoolwork purposes on desktops and laptops is already very common, smartphone makers are scrambling to determine the best way to serve the needs of billions of everyday consumers while on their phones.
The low-hanging fruit here includes:
- Better photo editing and mobile video making through AI
- Improved text creation and email writing
- Writing mobile emails with more clarity and fewer mistakes
More importantly from a commercialization perspective, vastly better mobile search has arrived. Apple recently launched a respectable number of AI tools for its latest iPhones that include the ability for a user to leave the Applesphere and go to ChatGPT for complex AI data searches. Mobile advertising will rapidly change as well, thanks to AI-driven analysis of users’ real-time searches. Protection of consumer privacy has become an enormous issue, particularly in the E.U. While major players in search advertising are being forced by government regulators to rely less on tracking of consumers through the planting of cookies, AI offers entirely new possibilities of understanding the interests of internet users and serving up relevant ads on-the-fly as well as relevant content. Search users may benefit from better results, while content creators can benefit from website traffic and advertising income.
For the latest Plunkett Research coverage of the Artificial Intelligence and Ecommerce industries, visit www.plunkettresearch.com.
Copyright © 2024, Plunkett Research, Ltd., All Rights Reserved