© 2022 Foxiz News Network. Ruby Design Company. All Rights Reserved.

The Battle Between AI Web Scraping and Online Defense Mechanisms

DailyHackly
Last updated: July 14, 2025 9:45 am

Understanding the Reality of AI Data Scraping

Artificial intelligence isn’t magic. The applications that generate essays or hyper-realistic videos from simple prompts owe their capabilities to vast training datasets. That data comes from many online sources, and most of it was created by humans.

The internet is an enormous reservoir of information. Last year, the web was reported to hold 149 zettabytes of data. To put that into perspective, that is 149 million petabytes, 149 billion terabytes, or 149 trillion gigabytes: an astronomical amount. This diverse collection of text, images, video, and audio is highly sought after by AI companies looking to improve and expand their models.
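The unit conversions for the 149-zettabyte figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units, where each step from gigabyte to terabyte to petabyte is a factor of 1,000; the `UNITS` table and `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) storage units, expressed in bytes (assumption: SI, not binary, prefixes).
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert a whole-number amount between two SI storage units."""
    return amount * UNITS[from_unit] // UNITS[to_unit]


for unit in ("petabyte", "terabyte", "gigabyte"):
    print(f"149 zettabytes = {convert(149, 'zettabyte', unit):,} {unit}s")
```

Running it prints 149,000,000 petabytes, 149,000,000,000 terabytes, and 149,000,000,000,000 gigabytes.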

As a result, AI crawlers continuously scour the internet, collecting any accessible data to train their models. Recognizing a lucrative opportunity, some organizations, including Reddit, the Associated Press, and Vox Media, have struck agreements to license their data to AI companies. Often, though, AI companies scrape data without prior consent, which has prompted a backlash from organizations that have taken legal action against firms like OpenAI, Google, and Anthropic. (Notably, Ziff Davis, the parent company of DailyHackly, filed a lawsuit against OpenAI in April, alleging copyright infringement in its AI training operations.)

These lawsuits have not slowed the data collection. If anything, the need for data is growing more urgent: recent studies indicate that AI models may run out of new training data by 2028, which limits the window AI companies have to gather what they can from the web. Alternative sources such as formal partnerships and synthetic data may provide some relief, but the internet remains an invaluable asset for these companies.

If you are active online, your personal data has likely been harvested by these AI systems. That is disconcerting, but it is also the fuel driving the chatbots that have been so widely adopted over the past few years.

The Internet’s Resistance

Even so, resistance to these practices is growing, including efforts to shield smaller sites from AI data scraping.

In a remarkable display of ingenuity, a web developer has created a solution to prevent AI bots from indiscriminately harvesting data from their websites. The tool, known as Anubis, was launched earlier this year and has already been downloaded over 200,000 times.

According to 404 Media, Anubis was developed by Xe Iaso, who is based in Ottawa, Canada. The project grew out of her experience with an Amazon bot that was crawling her Git server. Rather than shutting the server down entirely, she experimented with various tactics and eventually found a way to block such bots with what she calls an “uncaptcha.”

Anubis works in a straightforward way: once it is enabled on a website, it checks that each new visitor is a real browser by having it complete a cryptographic computation in JavaScript. Most modern browsers finish this task easily, whereas bots are often not built to run this kind of computation at scale. This lets Anubis filter out bots while letting genuine users through with little friction.
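One common way to build this kind of browser check is a proof-of-work challenge, sketched below. This is an illustration under stated assumptions, not Anubis’s actual code: the article only says the check uses cryptographic computations in JavaScript, so the choice of SHA-256, the leading-zeros rule, the `DIFFICULTY` value, and all function names here are hypothetical. The solving step would run as JavaScript in the visitor’s browser; Python is used only to keep the sketch short.

```python
import hashlib
import secrets

DIFFICULTY = 4  # hypothetical setting: required number of leading zero hex digits


def issue_challenge() -> str:
    """Server side: generate a random challenge string for a new visitor."""
    return secrets.token_hex(16)


def digest(challenge: str, nonce: int) -> str:
    """Hash the challenge together with a candidate nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int = DIFFICULTY) -> int:
    """Client side: try nonces in order until the hash starts with enough zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while not digest(challenge, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(challenge: str, nonce: int, difficulty: int = DIFFICULTY) -> bool:
    """Server side: one hash is enough to check the visitor's work."""
    return digest(challenge, nonce).startswith("0" * difficulty)


challenge = issue_challenge()
nonce = solve(challenge)
print(f"challenge={challenge} nonce={nonce} valid={verify(challenge, nonce)}")
```

The design relies on asymmetry: the visitor has to compute many hashes to find a valid nonce, while the server checks the answer with a single hash. That cost is negligible for one person loading one page, but it adds up for a crawler requesting very large numbers of pages.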

Anubis is built for web administrators rather than the typical internet user. It is entirely free and open source, with further improvements planned. Iaso told 404 Media that although she cannot work on Anubis full time, she is considering updates, including a challenge that puts less strain on visitors’ CPUs and an alternative that does not rely on JavaScript, since some users disable it for privacy reasons.

For those interested in deploying Anubis on their servers, detailed guidance is available on Iaso’s GitHub page. Visitors can also test their own browsers to confirm that they pass the bot check.

Iaso is not alone in this effort: Cloudflare recently began blocking AI crawlers by default and letting its customers charge AI companies that want to collect data from their sites. As these mechanisms become more effective at stopping AI firms from harvesting data unimpeded, those companies may scale back their aggressive scraping or, at a minimum, offer website owners better compensation for their data.

With luck, more sites will start showing the Anubis verification screen first. If a link leads to a “Verifying you are not a bot” notice, the site is successfully guarding itself against these AI crawlers. For a while, AI seemed unstoppable, but there is now a way to push back against its unchecked expansion.

© 2023 Information and Insights
