
Cloudflare has accused Perplexity AI of circumventing firewalls and scraping websites by modifying its user agent.

Perplexity AI, a question-answering engine built on large language models, is under scrutiny for stealth crawling techniques that circumvent standard web defences. Initially, Perplexity’s crawlers operated transparently: they identified themselves with user agents such as PerplexityBot/1.0 and obeyed robots.txt directives and web application firewall (WAF) rules. In early August 2025, however, researchers found that when blocked, Perplexity changed its identity mid-crawl, switching to generic browser user agents and unannounced IP ranges to reach restricted content. Cloudflare analysts said the shift pointed to deliberate evasion rather than an inadvertent misconfiguration. The crawler impersonated Chrome on macOS, sending user agent strings designed to mask its true identity. This let Perplexity keep accessing sites whose operators had explicitly disallowed its crawlers.
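To illustrate why a changed user agent slips past name-based rules, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the example URL and the Chrome-on-macOS user agent string are assumptions made up for the illustration; they are not taken from Cloudflare's report. The point is only that robots.txt rules match on the name a crawler declares, so they depend on honest self-identification.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows the declared crawler everywhere
# and allows everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

# Declared crawler identity, as named in the article.
DECLARED_UA = "PerplexityBot/1.0"

# Illustrative generic browser string (Chrome on macOS); the exact value is assumed.
GENERIC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

URL = "https://example.com/article"  # placeholder URL

print(is_allowed(DECLARED_UA, URL))  # the declared crawler matches the Disallow rule
print(is_allowed(GENERIC_UA, URL))   # the generic string only matches the "*" rule
```

The same request is refused under the declared name and permitted under the browser-like one, which is why rules keyed on the user agent string alone cannot stop a crawler that misreports its identity.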

Perplexity’s actions undermine the norms on which crawler access control depends, namely that crawlers identify themselves and honour site owners’ directives, and they raise legal and policy questions about how AI training data is sourced. Content owners now struggle to distinguish legitimate human traffic from obfuscated AI crawlers, which complicates compliance with privacy regulations and copyright protections. Perplexity also has a fallback strategy: when direct crawling fails, it turns to alternative data sources. The resulting answers draw on secondary websites and often lack the specificity of the original content.

Perplexity’s persistence relies on dynamic user agent rotation and rapid ASN hopping, which let it evade signature-based firewall rules. Cloudflare researchers found that these stealth crawlers maintain session continuity by preserving cookies and referrer headers across identity changes, effectively masquerading as individual human users. Mitigation strategies must therefore focus on behavioural analysis that flags anomalous patterns, such as high request velocity and uniform inter-request timing.
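The two behavioural signals named above, request velocity and uniformity of inter-request timing, can be sketched as a simple per-client check. This is a rough illustration under stated assumptions, not Cloudflare's detection method: the function name `flag_client`, the thresholds and the sample timestamps are all hypothetical, and how requests are grouped into one client (for example by session cookie) is left to the caller.

```python
from statistics import mean, pstdev

def flag_client(timestamps, max_rate_per_s=5.0, min_gap_cv=0.1, min_requests=5):
    """Return the reasons a client's request timing looks automated.

    timestamps: request times in seconds for one client.
    max_rate_per_s, min_gap_cv, min_requests: hypothetical thresholds;
    real values would need tuning against actual traffic.
    """
    if len(timestamps) < min_requests:
        return []  # too few requests to judge

    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    duration = ts[-1] - ts[0]

    if duration == 0:
        # All requests arrived at the same instant: treat as a burst.
        return ["high_velocity"]

    reasons = []

    # Signal 1: average request rate over the observed window.
    if len(gaps) / duration > max_rate_per_s:
        reasons.append("high_velocity")

    # Signal 2: coefficient of variation of the gaps between requests.
    # Human browsing tends to be irregular; a near-zero value means
    # the requests are evenly spaced.
    if pstdev(gaps) / mean(gaps) < min_gap_cv:
        reasons.append("uniform_timing")

    return reasons

# Made-up sample traffic for three clients.
metronome = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]    # evenly spaced requests
burst = [0.0, 0.01, 0.05, 0.06, 0.2, 0.21]     # many requests in a fraction of a second
browsing = [0.0, 3.0, 4.5, 20.0, 21.0, 60.0]   # irregular, slow requests

for name, sample in [("metronome", metronome), ("burst", burst), ("browsing", browsing)]:
    print(name, flag_client(sample))
```

Because the check looks only at timing, it does not depend on the user agent or source network, which is what makes this kind of signal useful against a crawler that rotates both.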
