Seminal Paper of the Week — a foundational systems paper that quietly shapes how every distributed system you use is layered.

Authors: Jerome H. Saltzer, David P. Reed, David D. Clark (MIT)
Published: ACM Transactions on Computer Systems 2(4), November 1984.
Canonical link: End-to-End Arguments in System Design (MIT) · ACM DOI 10.1145/357401.357402

TL;DR

The end-to-end argument is a layering principle: a lower layer can be trusted to provide a function only if it can implement that function completely and correctly on its own; otherwise, implementing it there is justified only by a clear performance benefit over doing all the work at the endpoints. For most application-level guarantees (reliable delivery, duplicate suppression, encryption, message ordering, integrity) the lower layer can only ever approximate the guarantee, because it does not have enough application context. The endpoints have to do the work anyway, so doing it in the network is at best a performance optimisation and at worst a mis-allocated responsibility. This single observation, twelve pages long, is the reason TCP/IP looks the way it does, the reason the web works, the reason microservices use idempotency keys, and the reason “smart network, dumb endpoint” architectures keep losing.

The argument, in one example

The paper anchors the entire idea on a deceptively boring example: reliable file transfer. You want host A to send a file to host B, and you want B to end up with exactly the same bits A had.

Now suppose the network gives you a “reliable” link-layer protocol — every packet is acknowledged, retransmitted on loss, checksummed. Is the file transfer reliable?

No. Consider the failure points:

  • The file might be corrupted by a bug in A’s filesystem code before it’s even sent.
  • A bus error or memory corruption in A could flip bits between disk read and network send.
  • Same on B’s side: between network receive and disk write, anything could go wrong.
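  • The reliable link layer covers exactly one hop. If there are routers in between, each store-and-forward step is unprotected unless every hop runs the same protocol, and even then a router can corrupt data in its own memory while copying it from one link to the next.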

So even with a “reliable” lower layer, A still has to compute a checksum over the file as it sat on disk, B still has to verify that checksum after writing to its disk, and B still has to be able to ask A to retransmit if it doesn’t match. The lower layer’s reliability did not eliminate the end-to-end check — it could not, because it has no knowledge of the disks or the application semantics. The end-to-end check is necessary. The lower layer’s reliability is therefore at most a performance optimisation: it might make retries cheaper by catching corruption earlier, but it cannot replace the end-to-end check, and if it’s expensive (acks, retransmits, link-level state), it may not even be worth doing.

That’s the whole argument. Generalised: the function in question can only be correctly implemented with the knowledge and help of the application at the endpoints. Implementing it in the communication system below is therefore unnecessary as a correctness measure and only sometimes justified as a performance one.
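
The file-transfer example can be sketched in a few lines of Python. This is a minimal illustration, not the paper's protocol: `flaky_channel` is a hypothetical stand-in for everything between the two disks, the function names and the 50% corruption rate are assumptions made for the demo, and reading the file back right after writing it may hit the OS cache rather than the physical disk.

```python
import hashlib
import os
import random
import tempfile


def sha256_of_file(path):
    """Hash the bytes as they sit in the file, not as they sat in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flaky_channel(data, rng, corrupt_prob):
    """Hypothetical path between the disks; sometimes flips a single bit.

    Models damage that a per-hop checksum would not see, such as a
    memory error inside a router or in either host.
    """
    if data and rng.random() < corrupt_prob:
        damaged = bytearray(data)
        damaged[rng.randrange(len(damaged))] ^= 0x01
        return bytes(damaged)
    return data


def transfer(src, dst, rng, corrupt_prob=0.5, max_attempts=20):
    """Copy src to dst, retrying until the end-to-end checksums match.

    Returns the number of attempts that were needed.
    """
    expected = sha256_of_file(src)  # computed by A over its stored file
    for attempt in range(1, max_attempts + 1):
        with open(src, "rb") as f:
            payload = f.read()
        received = flaky_channel(payload, rng, corrupt_prob)
        with open(dst, "wb") as f:
            f.write(received)
        # B re-reads what it stored and compares against A's checksum.
        if sha256_of_file(dst) == expected:
            return attempt
    raise RuntimeError(f"no verified copy after {max_attempts} attempts")


def demo(seed=1, corrupt_prob=0.5):
    """Run one transfer in a temp directory; returns (attempts, files_equal)."""
    rng = random.Random(seed)
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "a.bin")
        dst = os.path.join(tmp, "b.bin")
        with open(src, "wb") as f:
            f.write(bytes(rng.randrange(256) for _ in range(4096)))
        attempts = transfer(src, dst, rng, corrupt_prob)
        with open(src, "rb") as fa, open(dst, "rb") as fb:
            return attempts, fa.read() == fb.read()
```

Calling `demo()` returns the number of attempts and whether the two files ended up identical. Making the channel more reliable only lowers the attempt count; the comparison inside `transfer` stays, which is the point of the example.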

Why this paper is the load-bearing wall of modern systems

The end-to-end argument is the philosophical justification for the entire shape of the Internet:

  • TCP runs in the endpoints, not in routers. Routers do best-effort forwarding; reliability, ordering, congestion control all live in the OS at the endpoints. This is the architectural choice that made the Internet scalable to billions of hosts, because routers stayed simple and stateless about flows.
  • The fate-sharing principle of TCP (Clark, “The Design Philosophy of the DARPA Internet Protocols”, 1988) is the end-to-end argument applied to connection state: state about a connection lives where the connection terminates, so the only thing that can lose a connection’s state is one of its endpoints — and if an endpoint is gone, the connection is meaningless anyway.
  • TLS is end-to-end encryption because per-hop encryption cannot give you what you actually want (confidentiality from intermediaries) — a result that maps perfectly onto the paper’s framing.
  • The web’s “dumb pipes, smart endpoints” position is the end-to-end argument as a business model. Network-level innovation is constrained; the action moves to the endpoints; this is why applications running over best-effort IP won out over X.25 and ATM, why CDNs are an endpoint-flavoured compromise, and why every attempt to make the network “smart” (active networks, ATM with QoS, deep packet inspection) has either lost or been confined to constrained environments.

Where the argument keeps showing up — even far from networking

The end-to-end argument is not actually about networks. It’s about where in a layered system a function has to live to be correct. Once you see it, you see it everywhere:

  • Idempotency keys in REST APIs. The server cannot deduplicate retries reliably from network metadata alone — the client has to attach an end-to-end identifier that survives proxy retries, load-balancer retries, and the client’s own retry loop. The deduplication must happen at the layer that defined “this is the same logical request”, which is the application.
  • Exactly-once message processing in Kafka / SQS / event streams. The transport can give at-least-once or at-most-once; “exactly once” semantics require an end-to-end deduplication step at the consumer, because only the consumer knows what “the same message, semantically” means.
  • Database transactions vs application-level invariants. A database transaction guarantees serialisability at best (many databases default to a weaker isolation level); it does not guarantee your business invariant. The check has to happen at the layer that knows the invariant, which is the application, even if the database also does some of the work.
  • Encryption at rest vs end-to-end encryption. Disk encryption protects against one specific threat (stolen disks). It does not protect against a compromised host process, because the OS decrypts on read. The application has to encrypt if the application is the one that needs the guarantee.
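  • Distributed consensus and the FLP impossibility. FLP shows that no deterministic protocol can guarantee agreement in a fully asynchronous system where even one process may crash, so no transport can hand the application consensus for free. You cannot push linearisability into the network; the consensus has to happen at a layer that knows what “agreement” means for the workload.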

In every case the structure is identical to the file-transfer example: the lower layer has a guarantee that looks like what you want, but doesn’t have the context to fully provide it, and the endpoint has to do the work anyway. So pushing the function down is either redundant or wrong.
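
The idempotency-key case is the easiest of these to sketch. The code below is an illustration under stated assumptions, not any real payment or proxy API: `PaymentService` and `send_via_retrying_proxy` are hypothetical names, and the proxy models the case where the request is processed but the reply is lost, so the intermediary cannot tell "never processed" from "processed, reply lost".

```python
import uuid


class PaymentService:
    """Hypothetical application endpoint; only it knows what a charge is."""

    def __init__(self):
        self.charges = []   # side effects that must not be duplicated
        self._seen = {}     # idempotency key -> stored response

    def charge(self, amount_cents, idempotency_key=None):
        if idempotency_key is not None and idempotency_key in self._seen:
            # Same logical request: replay the answer, no new side effect.
            return self._seen[idempotency_key]
        charge_id = len(self.charges) + 1
        self.charges.append((charge_id, amount_cents))
        response = {"charge_id": charge_id, "amount_cents": amount_cents}
        if idempotency_key is not None:
            self._seen[idempotency_key] = response
        return response


def send_via_retrying_proxy(service, amount_cents,
                            idempotency_key=None, lost_replies=1):
    """Hypothetical intermediary that resends whenever a reply goes missing.

    Every send reaches the service; only the replies are dropped.
    """
    for _ in range(lost_replies):
        service.charge(amount_cents, idempotency_key)  # reply lost in transit
    return service.charge(amount_cents, idempotency_key)


def demo():
    """Returns (charges made without a key, charges made with a key)."""
    blind = PaymentService()
    send_via_retrying_proxy(blind, 500)

    keyed = PaymentService()
    key = str(uuid.uuid4())  # generated once by the client, reused on retries
    send_via_retrying_proxy(keyed, 500, idempotency_key=key)

    return len(blind.charges), len(keyed.charges)
```

Without a key, one lost reply produces two charges; with a client-generated key, the retry is recognised as the same logical request and only one charge is recorded. A production version would also need an atomic check-and-store, key expiry, and a check that a reused key carries the same payload, none of which are shown here.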

What it does not say

A common misreading is that the paper argues against putting any function in the lower layer. It doesn’t. It argues against putting functions there as a correctness claim. Performance is explicitly allowed: link-level retransmits in WiFi exist because they make the end-to-end retry cheaper, not because they replace it; congestion control gets feedback from the network because end-to-end probing alone is too slow.
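
The performance case can be made concrete with a back-of-the-envelope model. The sketch below is not from the paper; it assumes each hop loses a packet independently with probability `loss`, that acknowledgements are free, and that a lost packet is always detected, and it counts the expected number of hop transmissions needed to get one packet across `hops` links.

```python
def expected_sends_end_to_end_only(hops, loss):
    """Expected hop transmissions when any loss forces a resend from the source."""
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss must be in [0, 1)")
    if loss == 0.0:
        return float(hops)
    survive = (1.0 - loss) ** hops             # chance one attempt crosses every hop
    sends_per_attempt = (1.0 - survive) / loss  # hops actually traversed per attempt
    attempts = 1.0 / survive                    # expected attempts until success
    return sends_per_attempt * attempts


def expected_sends_with_link_retries(hops, loss):
    """Expected hop transmissions when each hop retransmits locally."""
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss must be in [0, 1)")
    return hops / (1.0 - loss)
```

Under these assumptions, ten hops at 10% loss per hop cost about 18.7 hop transmissions with source-only retries and about 11.1 with per-hop retries. Both schemes deliver the packet, so the link-level retry buys efficiency rather than correctness, and the endpoint still has to verify what it received.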

The paper also doesn’t say lower layers should be “dumb.” It says they should be targeted: do the things you can do completely, or that provide enough performance benefit to justify the complexity. Don’t offer as a guarantee something you can only partially do.

Why this still matters, in 2026

We are in the middle of an industry-wide rediscovery of this principle, and it is being rediscovered the hard way.

  • Service meshes keep trying to subsume application concerns (auth, retries, idempotency, encryption) into a sidecar. The end-to-end argument predicts where that bet breaks: anywhere the application has semantic context the sidecar lacks. Retries done by the mesh without idempotency keys generated by the application produce duplicate side effects. We’ve all seen the bill.
  • “AI gateways” and inference proxies are trying to do prompt deduplication, output caching, safety filtering, billing, and auth, all in a layer that does not understand the conversation’s semantics. The paper would predict — and reality is confirming — that the application has to do most of those checks anyway, and the gateway is either a performance accelerator or a liability.
  • Federated and edge-AI architectures are end-to-end arguments dressed up: the user device has knowledge (raw data, user intent) that no intermediary has, so functions that require that knowledge — personalisation, privacy guarantees — must live on the device.

The paper is twelve pages, no math. It is one of the densest artifacts in computer science, and every engineer designing a layered system should be able to recite the file-transfer example from memory. The questions it asks (can this lower layer completely provide this guarantee? does the endpoint have to do it anyway?) are still the right ones to ask before adding a feature to your network, your platform, your service mesh, or your AI gateway.

Where to read it

End-to-End Arguments in System Design (MIT), or the ACM version under DOI 10.1145/357401.357402.

Part of the Weekly CS Paper Digest series. The seminal pick is a longer handwritten piece each week, rotating across systems areas; see the weekly-papers index for prior picks.