Cloudflare Experiences Outage on November 18, 2025

admin-cdn, 2 hours ago

On November 18, 2025, at 11:20 UTC, Cloudflare experienced a major outage affecting core network traffic. Internet users encountered error pages when attempting to access customer sites, signaling a failure within Cloudflare’s network.

Incident Overview

The outage was not caused by a cyber attack or other malicious activity. Instead, a change to database permissions caused the database to write duplicate entries into a “feature file” that the Bot Management system depends on, making the file far larger than expected. The oversized file exceeded a size limit in the software that routes traffic on Cloudflare’s machines, and that software began to fail.

Timeline of Events

11:05 UTC: A change to database access control was implemented.
11:28 UTC: The deployment reached customer environments and the first errors appeared.
11:32 – 13:05 UTC: The team investigated elevated error rates and traffic levels for Workers KV, initially suspecting it as the source of the problem.
13:05 UTC: A bypass for Workers KV and Cloudflare Access was implemented to reduce impact.
14:30 UTC: The main disruption was resolved, and core traffic began to recover.
17:06 UTC: All affected services were restored to normal operations.

Technical Details of the Outage

The failure stemmed from the permissions change in the ClickHouse database, which caused the query that builds the feature file to return duplicate feature rows. The duplicates pushed the feature file past the size limit enforced by the software that consumes it, and that software then returned errors throughout the network.
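The failure mode described above can be illustrated with a minimal sketch. It is not Cloudflare’s code: the limit of 200, the row count of 120, and the names `FeatureRow`, `load_features`, and `dedupe` are all hypothetical, chosen only to show how duplicated rows can push a file past a fixed limit and how a loader might detect it.

```python
from dataclasses import dataclass

MAX_FEATURES = 200  # hypothetical hard limit, not a figure from the article


@dataclass(frozen=True)
class FeatureRow:
    name: str
    type_name: str


class FeatureFileTooLarge(Exception):
    """Raised when a feature file has more rows than the consumer allows."""


def load_features(rows, limit=MAX_FEATURES):
    """Return the rows as a list, or raise if the file exceeds the limit."""
    if len(rows) > limit:
        raise FeatureFileTooLarge(f"{len(rows)} rows exceeds limit of {limit}")
    return list(rows)


def dedupe(rows):
    """Drop duplicate rows while preserving first-seen order."""
    return list(dict.fromkeys(rows))


# A query that starts returning every row twice doubles the file size.
base = [FeatureRow(f"feature_{i}", "Float64") for i in range(120)]
duplicated = base + base  # 240 rows, over the hypothetical limit
```

In this sketch `load_features(base)` succeeds, `load_features(duplicated)` raises `FeatureFileTooLarge`, and `dedupe(duplicated)` recovers the original 120 rows. What matters for availability is how the caller handles that error.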

During the outage, several services faced significant disruptions:

Core CDN and Security Services: Users experienced HTTP 5xx error responses.
Turnstile: Failed to load, causing log-in issues for users.
Workers KV: Encountered elevated HTTP 5xx error rates due to proxy failures.
Dashboard: Operated at limited capacity, which affected user logins.
Email Security: Reduced accuracy in spam detection during the incident.
Access: Authentication failures were widespread, impacting user logins.

Restoration Efforts

Cloudflare’s restoration plan consisted of stopping the propagation of newly generated feature files and replacing the faulty file with a known good version. The team then manually restarted affected components to stabilize services.
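The two restoration steps, halting propagation and pinning a known good file, can be sketched as follows. This is an illustration under stated assumptions, not a description of Cloudflare’s tooling: `FeatureFileStore`, `validate`, and the `MAX_ROWS` limit are hypothetical.

```python
MAX_ROWS = 200  # hypothetical limit, not a figure from the article


def validate(rows):
    """Raise ValueError if the candidate feature file is unusable."""
    if not rows:
        raise ValueError("feature file is empty")
    if len(rows) > MAX_ROWS:
        raise ValueError(f"{len(rows)} rows exceeds limit of {MAX_ROWS}")


class FeatureFileStore:
    """Keeps the active feature file and refuses candidates that fail validation."""

    def __init__(self, known_good):
        validate(known_good)
        self.active = known_good
        self.propagation_halted = False

    def offer(self, candidate):
        """Try to adopt a newly generated file; return True only if adopted."""
        if self.propagation_halted:
            return False
        try:
            validate(candidate)
        except ValueError:
            return False  # keep serving the last known good file
        self.active = candidate
        return True

    def halt_and_restore(self, known_good):
        """Stop accepting new files and pin a known good version."""
        validate(known_good)
        self.propagation_halted = True
        self.active = known_good
```

With a store like this, an oversized candidate is rejected and the previous file stays active, so a bad file degrades freshness rather than availability. Calling `halt_and_restore` mirrors the manual recovery step: no further files are accepted until an operator lifts the halt.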

Post-Incident Actions

In response to the incident, Cloudflare aims to improve system resilience and prevent similar outages in the future. Key considerations include:

Strengthening ingestion protocols for configuration files.
Implementing global fail-safes for critical features.
Reviewing error handling mechanisms across core systems.
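A global fail-safe combined with defensive error handling might look like the sketch below. The names `bot_score`, `handle`, and `BLOCK_THRESHOLD`, and the choice to fail open, are assumptions made for illustration and are not taken from Cloudflare’s systems.

```python
BLOCK_THRESHOLD = 30  # hypothetical: scores below this are treated as bots


def bot_score(request, scorer, module_enabled=True):
    """Return a bot score, or None when the module is disabled or fails."""
    if not module_enabled:  # global kill switch for the feature
        return None
    try:
        return scorer(request)
    except Exception:
        # Fail open: a scoring failure should not become an HTTP 5xx.
        return None


def handle(request, scorer, module_enabled=True):
    """Return an HTTP status code for the request."""
    score = bot_score(request, scorer, module_enabled)
    if score is not None and score < BLOCK_THRESHOLD:
        return 403  # confidently identified as a bot
    return 200  # no score available, or score above the threshold
```

Failing open trades protection for availability: while the module is broken or switched off, traffic is served without bot scoring. Whether that trade is acceptable depends on the feature, so this is one possible design rather than a recommendation.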

This was one of Cloudflare’s most significant outages since 2019, and it underlines the importance of resilient, dependable systems for global internet infrastructure. Cloudflare apologized to its customers and acknowledged the frustration the incident caused.
