The Decoupler: Who Defines the Web User in the Age of AI Gatekeepers?

By TopGPTHub · 13 min read

Those who manage the gateway are redefining who counts as a "user."

Many website SEO teams are still looking at the same old dashboards: organic search traffic, page views, time on site, and subscription conversions. However, this set of metrics is increasingly failing to explain one thing: content may still be heavily consumed, but traffic is not necessarily returning to the original site.

This is the real reason Cloudflare CEO Matthew Prince's statement is so noteworthy. At SXSW, he stated that at the current rate of AI growth, bot traffic could exceed human traffic by 2027. The real point isn't the year itself; it's that the trend he is pointing to keeps getting clearer: more and more web activity is no longer "humans going to websites to find answers," but "AI agents crawling websites to find answers on behalf of humans."

Furthermore, this isn't just a slogan from a stage speech. As early as September 2025, an official Cloudflare article had offered a more conservative estimate: machine traffic exceeding human traffic by the end of 2029. By March 2026, Matthew Prince's public statement had moved that timeline up to 2027. The more robust interpretation, then, is not to treat 2027 as a certainty, but to read it as a strong signal from Cloudflare's management: this transformation may be arriving much faster than originally anticipated.

Because of this, what deserves attention next is not just how the total volume of traffic changes, but how the core logic of website operation must be rewritten as content consumption and value return gradually decouple.

Key Interpretation:

What truly matters about Matthew Prince's statement regarding "machine traffic potentially exceeding human traffic by 2027" is not the year itself, but that AI agents have begun to push web traffic from "humans clicking into sites" toward "machines crawling sites for humans." What Cloudflare has introduced this year is not a single anti-scraping tool, but an entire product line ranging from blocking, identification, and usage tagging to licensing and pricing. More accurately, it is attempting to become the negotiation layer between AI and content websites. For publishers, content sites, and enterprise knowledge bases, the real risk is not just more bots, but the fact that "content is crawled, traffic never returns, and rights remain unclear"—the old search exchange model is fracturing.

This time, it’s not just more bots; it’s AI agents acting on behalf of humans.

In its 2025 annual review, Cloudflare categorized AI crawling into three types: training, search, and user action. What is truly shifting the power of the gateway is not the crawling used for model training, but "user action"—where a system crawls a site, supplements data, and organizes answers directly to respond to a user's prompt. Cloudflare pointed out that this type of traffic grew significantly throughout 2025, meaning agents are no longer just a background behavior but have started to become a part of the frontend user experience.

This change will lead to a fundamental shift in the role of a website. In the past, the logic of a website was to wait for readers to click in; now, websites increasingly resemble data endpoints being continuously invoked by various machines. Therefore, the real question is no longer "are bots increasing," but "who is exercising the right to browse on behalf of the user?"

The turning point is not total traffic volume, but the shift of gateway power.

The old open web had a long-standing tacit agreement: search engines could crawl content, but in exchange, they would send people back to the site. The website bore the cost of being crawled in exchange for exposure, ad revenue, subscription conversions, or at least brand memory. Today, this exchange model is loosening.

When Reuters reported on Cloudflare's launch of "Pay Per Crawl" last July, it cited data indicating that Google's crawl-to-refer ratio (the ratio between content crawled by a platform and actual traffic sent back) had deteriorated from 6:1 to 18:1 over the preceding six months. For OpenAI, it was as high as 1,500:1. The signal is direct: more is being crawled, but fewer people are being sent back. This is no longer just a matter of technical efficiency; it is a crack in the value exchange itself.

Cloudflare later added more comprehensive observations in its 2025 annual review. It noted that Anthropic at one point in 2025 reached a crawl-to-refer ratio as high as 500,000:1; OpenAI reached 3,700:1 during peak periods; while Perplexity stayed relatively much lower, mostly below 200:1 after September 2025. These numbers may not be suitable for direct "moral rankings," but they are enough to establish a more important fact: in the AI era, the stable exchange between crawling and referral that existed in traditional search no longer holds.

In other words, it’s not just traffic that is moving—it’s the gateway power. Users may not stop needing content, but they increasingly don't need to visit the source of that content personally.
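For site operators who want to know where they stand, the same ratio can be approximated from ordinary access logs: count requests from each platform's crawler user agent and compare them to visits whose referrer points back from that platform. The following is a minimal sketch of that calculation, assuming a combined-format access log; the user-agent strings, referrer domains, and file name are illustrative placeholders, not an official measurement method.

```python
import re
from collections import Counter

# Hypothetical mapping: crawler user-agent substring -> referrer domain of the same platform.
# Extend or correct this for your own logs; these strings are illustrative only.
PLATFORMS = {
    "GPTBot": "chatgpt.com",
    "ClaudeBot": "claude.ai",
    "PerplexityBot": "perplexity.ai",
    "Googlebot": "google.",
}

# Rough parser for the combined log format: ... "REQUEST" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$')

def crawl_to_refer(log_path: str) -> dict[str, float]:
    crawls, referrals = Counter(), Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_LINE.search(line.strip())
            if not match:
                continue
            referer, agent = match.group("referer"), match.group("agent")
            for name, ref_domain in PLATFORMS.items():
                if name in agent:
                    crawls[name] += 1        # a request made by this platform's crawler
                if ref_domain in referer:
                    referrals[name] += 1     # a visit that platform referred back to the site
    # Pages crawled per visit referred back; a higher number means a worse exchange.
    return {name: crawls[name] / max(referrals[name], 1) for name in PLATFORMS}

if __name__ == "__main__":
    for platform, ratio in crawl_to_refer("access.log").items():
        print(f"{platform}: roughly {ratio:.0f}:1")
```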

Publishers feel more than just "harder SEO"—the entire "Traffic Era" is fracturing.

This change is already directly reflected in media traffic. According to Chartbeat data, traditional search referrals for small publishers have dropped 60% over the past two years, medium publishers by 47%, and large publishers by 22%. The same data shows that page views from Google Search fell by 34% and Google Discover by 15%. Meanwhile, although referral page views from ChatGPT grew by over 200%, they still represent less than 1% of all publisher referral traffic.

These figures show us one thing: content is still being used, but the way it is used has changed, and the original mechanism relying on click-throughs has not compensated proportionally.

The "Journalism, Media and Technology Trends and Predictions 2026" published by the Reuters Institute for the Study of Journalism this January also noted that publishers expect traffic from search engines to drop by another 43% over the next three years. The Guardian, summarizing the same report, further mentioned that referral traffic to news sites from Google Search has decreased by 33% year-on-year, while traffic from ChatGPT remains very limited.

This is why more and more publishers are no longer just saying "SEO is getting harder," but are realizing a deeper problem: what is fracturing is not just ranking mechanisms, but the entire order of content distribution. In other words, the question is no longer how to rank higher, but whether value can effectively flow back to the original content provider after content has been read, summarized, and invoked.

Cloudflare is not just warning of risks; it is turning bot management into access governance.

If Cloudflare's CEO were only shouting a warning from the stage, it would carry limited weight. What is truly noteworthy is that over the past year and a half, almost all of the company's product lines have converged in the same direction, and the signals are consistent.

In March 2025, Cloudflare launched AI Labyrinth, using AI-generated fake pages to slow down and confuse bots that do not follow no-crawl directives. This is no longer simple blocking, but directing non-compliant crawling behavior into a high-cost zone, ensuring that even if they enter, they pay a price.

By July 2025, Cloudflare pushed the default logic a step further. It announced that AI companies can no longer treat website content as "crawlable by default"; they must first check if the site owner has explicitly granted permission ("AI companies will now be required to obtain explicit permission from a website before scraping"). For newly added domains, the system will first ask the webmaster whether to allow AI crawlers. The real change here is not just a toggle on an interface, but the fact that the default for content access is being rewritten: moving from the old "crawlable by default, opt-out" to a "permission-based model" requiring explicit consent.

Then in August 2025, Cloudflare officially renamed AI Audit to AI Crawl Control and added customizable HTTP 402 responses. This allows webmasters not just to block, but to state their position directly: if you want to crawl this content, negotiate a license first. At this stage, what Cloudflare wants to handle is no longer just bot defense, but the interface between crawling, licensing, and commercial negotiation.
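To make the mechanism concrete: HTTP 402 ("Payment Required") is a long-reserved status code that a site or its edge can return to a crawler instead of the content, together with a machine-readable hint about pricing or licensing terms. The sketch below is a minimal, self-contained illustration of that idea in plain Python; the header name, price, and terms URL are hypothetical placeholders, not Cloudflare's actual Pay Per Crawl interface.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# User-agent substrings treated as AI crawlers here; illustrative, not exhaustive.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")

class PayPerCrawlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        if any(bot in agent for bot in AI_CRAWLERS):
            # Refuse the content and state the terms, instead of silently blocking.
            body = b'{"error": "payment required", "terms": "https://example.com/license"}'
            self.send_response(402)
            # Hypothetical header name; a real pay-per-crawl scheme defines its own.
            self.send_header("X-Crawl-Price", "0.01 USD per request")
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        body = b"<html><body>Regular page for human readers.</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8402), PayPerCrawlHandler).serve_forever()
```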

In February 2026, Cloudflare launched Markdown for Agents. On the surface, this looks like converting HTML into Markdown for easier reading by agents; but more importantly, it reveals a deeper signal: Cloudflare has begun designing content transmission formats with agents as "first-class users." The official statement is clear—a more structured format like Markdown is more efficient for AI systems and saves tokens.
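The underlying pattern is ordinary content negotiation: the same URL can return HTML to a browser and a leaner Markdown rendering to a client that asks for it. A minimal sketch of that pattern follows, assuming the agent advertises a preference via an Accept header such as text/markdown; the actual mechanics of Cloudflare's Markdown for Agents may differ, and the parsing here is deliberately simplified.

```python
def negotiate_body(accept_header: str, html_body: str, markdown_body: str) -> tuple[str, str]:
    """Return (content_type, body) based on the client's Accept header.

    Assumes agents that prefer Markdown list 'text/markdown' ahead of 'text/html'.
    """
    accepted = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    for media_type in accepted:  # honour the client's listed order (simplified: ignores q-values)
        if media_type == "text/markdown":
            return "text/markdown", markdown_body   # leaner payload, fewer tokens for the agent
        if media_type in ("text/html", "*/*"):
            return "text/html", html_body
    return "text/html", html_body                   # default to the human-facing page

# Example: an agent asking for Markdown first gets the Markdown rendering.
content_type, body = negotiate_body(
    "text/markdown, text/html;q=0.5",
    html_body="<h1>Pricing</h1><p>Plans start at $10.</p>",
    markdown_body="# Pricing\n\nPlans start at $10.\n",
)
print(content_type)  # text/markdown
```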

Looking at these moves together, Cloudflare is clearly moving from "helping sites block bad bots" to "helping sites define which machines can enter, why they can enter, what format they should use, and whether they need to authorize or pay after entering." While this is not yet a fully validated industry standard, the direction is clear from the trajectory of product evolution.

This is not necessarily the end of the open web, but the exchange model is being recalculated.

We should not interpret this wave of change simply as "the end of the website era." That would be too hasty and too arbitrary.

First, bots are not new. Search engine crawlers have existed for years and were part of the growth of the open web. The real issue has never been "is the site being crawled," but "can value flow back after being crawled."

Second, a decline in search referrals does not mean all websites will collapse together. Data shows that while search referrals have dropped significantly, average weekly page views for global publishers only fell by 6% between 2024 and 2025. Large publishers are holding on not because they are unaffected, but because they hold other levers to sustain traffic, such as direct traffic, newsletters, apps, membership systems, and brand habit.

Finally, the impact varies across content types. The Guardian's summary of the Reuters Institute report notes that lifestyle, celebrity, and travel content, which is easily summarized and replaced, is under greater pressure. Breaking news and current affairs content, by contrast, still retains some protection. This means AI is not compressing all content equally; it is compressing standardized, replaceable content that lacks a strong brand and a direct audience relationship first.

So, a more accurate statement is not "the open web is dying," but that the old search exchange model is failing, while a new licensing and distribution model has yet to truly take shape.

The next wave of conflict will fall on the institutional level, not just the product level.

This also means the problem can no longer be handled with a robots.txt file or a single firewall rule. Cloudflare's "Content Signals Policy" makes it clear: these content usage signals are essentially expressions of preference and are not mandatory. If the other party ignores them, site operators still need to use WAF and bot management tools. This is an admission that the real issue now is not just technical control, but where the boundaries of rights are drawn and who holds the power of enforcement.
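In practice, such signals live alongside the familiar robots.txt directives. The snippet below sketches what a policy like this can look like, loosely following the syntax Cloudflare has published for its Content Signals Policy (allow search indexing, refuse AI answer input and model training); treat the exact directive names and values as something to verify against the current spec, and remember that, as the policy itself states, these signals express preferences that a crawler may ignore.

```
# robots.txt — a sketch of content signals layered on top of ordinary crawl rules.
# Directive names follow Cloudflare's published Content Signals Policy proposal;
# verify against the current spec before relying on them.

User-Agent: *
Content-Signal: search=yes, ai-input=no, ai-train=no
Allow: /

# A classic opt-out for a known training crawler still applies as a separate rule.
User-Agent: GPTBot
Disallow: /
```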

Regulatory direction is also pushing this issue toward the institutional level. Reuters reported on March 18 that Google is developing new search control mechanisms that would allow websites to opt out of generative AI features, responding to concerns from the UK Competition and Markets Authority (CMA). Google has also stated publicly that it is updating mechanisms to let sites specifically choose not to be included in generative AI features within search.

The truly sensitive point of contention is this: Can a website refuse to have its content used for AI summaries without being penalized in traditional search visibility? If not, it isn't a real choice; if so, the web might slowly grow a new order of negotiation. Therefore, what needs to be established next is not just new tools, but new boundary standards: which crawling belongs to indexing, which to real-time generation, and which to model training; and whether refusing one use will impact another form of exposure. These are no longer engineering details, but the core of the next round of web governance.

Treat your website as if it serves two types of customers.

For media, research firms, B2B SaaS companies, educational platforms, technical documentation sites, and content-driven e-commerce, the most dangerous miscalculation today is still understanding a website only as "pages for humans."

In fact, more and more websites are serving two customers simultaneously: humans and machines. If you only optimize for the human reading experience without managing machine crawling purposes, costs, and return conditions, you are likely supplying models and agents without clear authorization or returns. The problem is not just traffic loss, but that your content value has been incorporated into someone else’s answering system, distribution system, and training process, while your own exchange terms remain blank.

This does not mean all websites should block everything. What truly needs to be done is to re-categorize content and redefine exchange conditions. This can start with three decision-making questions:

1. Where do you most want your content to be used? In search indexing, real-time answers, or model training? The business significance of these three is different, and the licensing conditions shouldn't be the same. Search indexing might exchange for discovery; real-time answers for brand exposure and some referrals; while model training involves long-term value transfer and rights arrangements.

2. If a bot crawls heavily but refers little, how will you handle it? Will you block it, charge it, or accept it as a brand distribution cost? This is a management judgment, not just a technical setting (a sketch of what such a policy could look like follows this list). Without this judgment, teams can only passively endure crawling without distinguishing which crawls are worth the exchange and which are one-way extractions.

3. If search referrals drop another 20% to 30% in the next 12 months, can you accept it? Newsletters, members, apps, social media, APIs, courses, consulting, licensing income—which is ready to take over? The point of this question is not just alternative traffic sources, but checking whether your business model is still overly dependent on a single gateway.
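To make question 2 operational rather than rhetorical, it helps to write the judgment down as an explicit policy: for each crawler, decide in advance which crawl-to-refer ratio you tolerate and what happens beyond it. The sketch below is one hypothetical way to encode that decision; the crawler names, thresholds, and actions are placeholders for choices only your own team can make, not recommendations.

```python
from dataclasses import dataclass

@dataclass
class CrawlerPolicy:
    name: str
    max_ratio: float      # highest crawl-to-refer ratio accepted without reacting
    action_beyond: str    # what to do once the ratio exceeds the threshold

# Hypothetical policy table; every value here is illustrative.
POLICIES = [
    CrawlerPolicy("Googlebot",     max_ratio=50,  action_beyond="review"),
    CrawlerPolicy("GPTBot",        max_ratio=500, action_beyond="charge (402 / licensing)"),
    CrawlerPolicy("ClaudeBot",     max_ratio=500, action_beyond="block"),
    CrawlerPolicy("PerplexityBot", max_ratio=200, action_beyond="accept as distribution cost"),
]

def decide(observed_ratios: dict[str, float]) -> dict[str, str]:
    """Map each crawler's observed crawl-to-refer ratio to the action the policy prescribes."""
    decisions = {}
    for policy in POLICIES:
        ratio = observed_ratios.get(policy.name)
        if ratio is None:
            decisions[policy.name] = "no data yet"
        elif ratio <= policy.max_ratio:
            decisions[policy.name] = "allow"
        else:
            decisions[policy.name] = policy.action_beyond
    return decisions

# Example with made-up observed ratios (e.g. from the log analysis sketched earlier).
print(decide({"Googlebot": 18, "GPTBot": 1500, "PerplexityBot": 180}))
```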

Conclusion: Cloudflare’s warning points to a shift from a Click Economy to a Negotiation Economy.

This interpretation isn't saying "the world will be taken over by bots," but that the primary subject of web traffic is shifting from human browsing to machine-mediated browsing. When AI agents start comparing prices, researching data, and integrating answers for people, a website is no longer just a page waiting for a reader to click; it is a content endpoint continuously invoked by machines.

What Cloudflare is doing now—from AI Labyrinth and AI Crawl Control to 402 responses and Markdown for Agents—is a clear product evolution line. It reveals a business judgment: the next round of competition is not just in model capability, but in who can become the more trusted negotiation layer between AI and the open web.

For decision-makers, the useful thing isn't asking "will 2027 come true," but asking: if search referrals drop another 20-30%, will your organization continue to treat the website as a traffic gateway, or start treating it as a management interface for content rights, machine access, and long-term relationships?

What we should observe next is not just the percentage of bot traffic, but two things:

  1. Whether the crawl-to-refer ratio of various platforms continues to worsen.
  2. Whether websites can finally split crawling purposes into different rights layers such as search crawling, real-time answers, and model training.

Because what is truly being rewritten is never just traffic, but the entire exchange logic of the internet.

