How Elon Musk and Reddit are leading a war on AI web scraping

The long-accepted practice of search engines scraping content from websites is being re-examined now that the data is being used to build valuable artificial intelligence tools

By Matthew Sparkes

5 May 2023

AI systems are built on large data sets (Image: BlackJack3D/Getty Images)

The rapid progress in artificial intelligence in recent months is partly due to training on vast data sets of text and images, scraped for free from the internet. Although automated web scraping by search engines has been accepted by website owners for decades, the economic shift being brought about by AI has triggered a rethink.

At a basic level, search engines offer an exchange to website owners: let us scrape to compile the information and serve useful results, and we will send traffic …
