Goal: Provide a highly-scalable infrastructure for caching and serving content from multiple sources.
Various approaches have been used to make a site more scalable and available:
- Proxy servers
- Organizations can pass web requests to caching proxies set up inside their network. Clients contact the proxy for a copy of the content. If the proxy does not have it, it requests the content from the server that hosts it and keeps a copy. After that, users who need the content can get it directly, and quickly, from the proxy server. This can help a small set of users but you’re out of luck if you are not in that organization.
- Clustering within a datacenter with a load balancer
- Multiple servers within a datacenter can be load-balanced. Incoming requests will be distributed among a set of replicated servers. However, they all fail if the datacenter loses power or internet connectivity.
- Machines can be connected with links to multiple networks served by multiple ISPs to guard against ISP failure. However, protocols that drive dynamic routing of IP packets (BGP) are often not quick enough to find new routes, resulting in service disruption.
- Mirroring at multiple sites
- The data can be served at multiple sites, with each machine’s content synchronized with the others. However, synchronization can be difficult.
All these solutions require additional capital costs. You’re paying for excess capacity and improved availability even if the traffic never comes and the faults never happen.
A content delivery network (CDN) is a service deployed across a large set of servers that caches content and distributes it to users. These servers are placed at various points at the edges of the Internet, at various ISPs, so they can be located close to the users who access that content.
By serving content that is replicated on a collection of servers, traffic from the main (master) server is reduced. Because some of the caching servers are likely to be closer to the requesting users, network latency is reduced. Because there are multiple servers, traffic is distributed among them. Because all of the servers are unlikely to be down at the same time, availability is increased. Hence, a CDN can provide highly increased performance, scalability, and availability for content.
We will look at some parts of a CDN that’s operated by Akamai. The company was born from research at MIT that focused on “inventing a better way to deliver Internet content.” A key issue was the flash crowd problem: what if your web site becomes really popular all of a sudden? Chances are, your servers and/or ISP will be saturated and a vast number of people will not be able to access your content. This became known as the Slashdot effect.
In 2021, Akamai ran on approximately 325,000 servers in around 1,400 networks distributed across approximately 130 countries. Akamai’s traffic reaches volumes of over 30 terabits per second. Akamai’s features expanded and evolved over time. We will not attempt to cover the full service. Instead, we will look at some core mechanisms as an example of how a CDN can be architected. These mechanisms are shared by CDNs from other companies as well.
Akamai tries to serve clients from the nearest available servers that are likely to have the requested content. According to the company’s statistics, 85% of the world’s Internet users are within a single network hop of an Akamai CDN server.
To access a web site, the user’s computer first looks up a domain name via DNS. A mapping system collects information about the state of the Akamai network and locates the caching server that can serve the content. Akamai deploys custom dynamic DNS servers. Customers who use Akamai’s services configure their DNS server with an alias (a CNAME record) that points to an Akamai domain name that has the customer’s domain encoded within it. Akamai’s dynamic DNS servers use the requestor’s address to find the nearest edge server that is likely to hold the cached content for the requested site.
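The alias-based lookup can be sketched roughly as follows. This is a toy model, not Akamai's implementation: every name, address, and helper here (`cdn-provider.example`, `CUSTOMER_ZONE`, `client_region`, and so on) is a hypothetical stand-in, and real resolution happens over the DNS protocol rather than through dictionaries.

```python
# All names and addresses below are invented for illustration.

# The customer's own DNS server holds only an alias (CNAME) into the CDN's domain.
CUSTOMER_ZONE = {
    "www.example.com": ("CNAME", "www.example.com.cdn-provider.example"),
}

CDN_SUFFIX = ".cdn-provider.example"

# Edge servers the CDN's dynamic DNS could return, keyed by a coarse client region.
EDGE_BY_REGION = {
    "eu": "192.0.2.10",
    "us": "198.51.100.20",
}


def client_region(client_ip: str) -> str:
    """Toy stand-in for locating the requestor from its address."""
    return "eu" if client_ip.startswith("203.0.113.") else "us"


def cdn_dynamic_dns(name: str, client_ip: str) -> str:
    """Answer a query for a CDN name with an edge server near the requestor."""
    if not name.endswith(CDN_SUFFIX):
        raise ValueError(f"{name} is not a CDN name")
    # The customer's domain is encoded in the name; a real mapping system would
    # use it to decide which edge servers are likely to hold that site's content.
    customer_domain = name[: -len(CDN_SUFFIX)]
    if customer_domain not in CUSTOMER_ZONE:
        raise ValueError(f"unknown customer {customer_domain}")
    return EDGE_BY_REGION[client_region(client_ip)]


def resolve(name: str, client_ip: str) -> str:
    """Follow the customer's alias into the CDN's dynamic DNS."""
    record_type, value = CUSTOMER_ZONE[name]
    if record_type == "CNAME":
        return cdn_dynamic_dns(value, client_ip)
    return value
```

Two clients asking for the same host name get different answers: `resolve("www.example.com", "203.0.113.5")` and `resolve("www.example.com", "198.51.100.7")` return different edge addresses because the answer depends on who is asking.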
When an Akamai DNS server gets a request to resolve a host name, it chooses the IP address to return based on:
- domain name being requested
- server health
- server load
- user location
- network status
- load balancing
Akamai monitors the performance of its edge servers and can perform load shedding on specific content servers; if servers become overloaded, the DNS server will not respond with their addresses.
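As a minimal sketch of how these factors might combine, the function below filters out unhealthy or overloaded servers (load shedding) and then prefers the closest, least-loaded ones. The `EdgeServer` fields, the 0.9 threshold, and the sort order are assumptions made for illustration; the actual weighting Akamai uses is not described here.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    ip: str
    healthy: bool
    load: float         # 0.0 (idle) to 1.0 (saturated); assumed scale
    distance_ms: float  # estimated latency from the requestor; assumed metric


LOAD_SHED_THRESHOLD = 0.9  # hypothetical cutoff for load shedding


def pick_edge_servers(servers, count=2):
    """Return up to `count` IP addresses a DNS server could hand back."""
    # Load shedding: never return servers that are down or overloaded.
    candidates = [
        s for s in servers if s.healthy and s.load < LOAD_SHED_THRESHOLD
    ]
    # Prefer nearby servers; break ties by picking the less loaded one.
    candidates.sort(key=lambda s: (s.distance_ms, s.load))
    return [s.ip for s in candidates[:count]]
```

Returning more than one address is a design choice in this sketch: it gives the client an alternative if the first server stops responding.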
Now that the client has an IP address of an edge content server, it sends a request for content. That edge server may already have the content and be able to serve it directly. Otherwise, the edge server needs to get the content. Edge servers are organized into regions; a set of servers in the same data center at an ISP know about each other and share a region. A server that needs content will first send a broadcast request to its peers within the same region. If they have the content, they can send it quickly. This is a low-overhead request with a short timeout. Akamai organizes its servers into a tiered hierarchy. If the edge server cannot find the content in its region, it will send a request to its parent server. If the parent does not have the content, it will also broadcast a request to a group of other parent servers. If the content is still not found, the parent will need to contact the origin server (the server at the company that hosts the content) via its transport system.
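The lookup order described above (local cache, regional peers, parent, the parent's peers, and finally the origin) can be sketched as a short recursive function. The `CacheNode` class, `fetch_from_origin`, and the in-memory dictionaries are hypothetical simplifications; real servers exchange network requests with timeouts rather than inspecting each other's storage.

```python
class CacheNode:
    """A caching server with regional peers and an optional parent tier."""

    def __init__(self, name, peers=None, parent=None):
        self.name = name
        self.store = {}  # url -> content cached on this server
        self.peers = peers if peers is not None else []
        self.parent = parent


def fetch_from_origin(url):
    """Stand-in for a request to the customer's origin server."""
    return f"<content of {url}>"


def lookup(node, url):
    """Return (content, source), where source names who supplied the content."""
    # 1. Serve directly from the local cache if possible.
    if url in node.store:
        return node.store[url], node.name
    # 2. Ask peers in the same region (modeled as a direct check of their caches).
    for peer in node.peers:
        if url in peer.store:
            node.store[url] = peer.store[url]
            return node.store[url], peer.name
    # 3. Go up a tier; the parent repeats the same steps among other parents.
    if node.parent is not None:
        content, source = lookup(node.parent, url)
    else:
        # 4. The top tier has no parent, so it contacts the origin server.
        content, source = fetch_from_origin(url), "origin"
    node.store[url] = content  # cache on the way back down
    return content, source
```

In this sketch, the first request for a URL travels all the way to the origin, and every server on the return path keeps a copy, so later requests from the same region are answered by a peer or the parent.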
To do this efficiently, Akamai manages an overlay network: the collection of its thousands of servers and statistics about their availability and connectivity. Akamai generates its own map of overall IP network topology based on BGP (Border Gateway Protocol) and traceroute data from various points on the network as well as probe messages between various content servers and between various Akamai content servers and origins.
Content servers report their load along with bandwidth and latency measurements to a monitoring application. The monitoring application publishes load reports to a local Akamai DNS server, which then determines which IP addresses to return when resolving names.
A CDN, serving as a caching overlay, provides several distinct benefits:
Caching: static content can be served from caches, thus reducing the load on origin servers.
Routing: by measuring latency, packet loss, and bandwidth throughout its collection of servers, the CDN can find the best route to an origin server, even if that requires forwarding the request through several of its servers instead of relying on IP routers to make the decision. These servers keep TCP connections open to avoid the overhead of setting up new connections for each request.
Security: Because all requests go to the CDN, which can handle a high volume of requests, the CDN absorbs Distributed Denial-of-Service (DDoS) attacks so they do not overwhelm the origin server. Moreover, any penetration attacks target the machines in the CDN rather than the origin servers. A company (or a group within a company) that runs the CDN service is singularly focused on the IT infrastructure of the service and is likely to have the expertise to ensure that systems are properly maintained, updated, and running the best security software. These aspects may be neglected by companies that need to focus on other services.
Analytics: CDNs monitor virtually every aspect of their operation, providing their customers with detailed reports on the quality of service, network latency, and information on where the clients are located.
Cost: CDNs cost money, which is a disadvantage. However, CDNs are an elastic service, which is one that can adapt to changing workloads by using more or fewer computing and storage resources. CDNs provide the ability to scale instantly to surges of demand practically anywhere in the world. They absorb all the traffic so you don’t have to scale up.
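To illustrate the routing benefit, here is a minimal sketch that picks a path through an overlay of servers from measured link latencies. It uses Dijkstra's shortest-path algorithm as a stand-in; the source does not say which algorithm a CDN actually uses, and a real system would also weigh packet loss and bandwidth. The node names and latency figures are invented.

```python
import heapq


def best_overlay_path(latency_ms, source, target):
    """Dijkstra's algorithm over measured one-way link latencies.

    latency_ms maps node -> {neighbor: latency in ms}.
    Returns (path, total latency), or (None, inf) if target is unreachable.
    """
    best = {source: 0.0}   # lowest known latency to each node
    previous = {}          # node -> predecessor on the best known path
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a better path was already found
        for neighbor, link in latency_ms.get(node, {}).items():
            new_cost = cost + link
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[target]
```

With a direct edge-to-origin link measured at 120 ms and a relay reachable in 30 ms that reaches the origin in 40 ms, the function chooses the two-hop path through the relay, which is the kind of detour plain IP routing would not take on its own.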
Video is the most bandwidth-intensive service on the Internet and has been a huge source of growth for CDNs. As an example, Netflix operates a global CDN called OpenConnect, which contains up to 80 percent of its entire media catalog. Stored (e.g., on-demand) video is, in some ways, just like any other content that can be distributed and cached. The difference is that, for use cases that require video streaming rather than downloading (i.e., video on demand services), the video stream may be reprocessed, or transcoded, to lower bitrates to support smaller devices or lower-bandwidth connections.
Live video cannot be cached, but it still benefits from the fanout that CDNs offer by having many servers at the edge: a video stream can be replicated onto multiple edge servers so it doesn’t have to be sourced from the origin. CDNs can also help with progressive downloads, where the user can start watching content while it is still being downloaded.
Today, HTTP Live Streaming, or HLS, is the most popular protocol for streaming video. It allows the use of standard HTTP to deliver content and uses a technique called Adaptive Bitrate Coding (ABR) to deliver video. ABR supports playback on various devices in different formats and bitrates. The CDN takes on the responsibility of taking the video stream and converting it into a sequence of chunks. Each chunk represents between 2 and 10 seconds of video. The CDN then encodes each chunk at various bitrates, which can affect the quality and resolution of that chunk of video. For content delivery, the HLS protocol uses feedback from the user’s client to select the optimal chunk. It revises this choice constantly throughout playback since network conditions can change.
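The per-chunk selection can be sketched as follows. This is a simplified model, not the HLS specification's algorithm: the bitrate ladder, the 0.8 safety factor, and the function names are assumptions chosen for illustration.

```python
# Hypothetical bitrate ladder: each chunk is assumed to exist at every one
# of these encodings (in kilobits per second).
RENDITIONS_KBPS = [400, 1200, 2500, 5000]


def choose_bitrate(measured_kbps, renditions=RENDITIONS_KBPS, safety=0.8):
    """Pick the highest rendition that fits within a fraction of throughput.

    The safety factor leaves headroom so a small dip in bandwidth does not
    immediately stall playback.
    """
    budget = measured_kbps * safety
    affordable = [r for r in sorted(renditions) if r <= budget]
    # If even the lowest rendition exceeds the budget, fall back to it anyway.
    return affordable[-1] if affordable else min(renditions)


def play(throughput_samples_kbps):
    """Re-evaluate the choice for every chunk as network conditions change."""
    return [choose_bitrate(sample) for sample in throughput_samples_kbps]
```

For example, as measured throughput falls from 6500 to 300 kbps over four chunks, `play([6500, 3000, 900, 300])` steps down through the ladder and yields `[5000, 1200, 400, 400]`.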