How DNS resolution actually works (and why propagation takes so long)

Tracing what happens when you visit a domain, where every cache lives, why "DNS propagation" can take 48 hours, the TTL-lowering playbook for clean migrations, and how to verify changes from outside your local cache.

By The Toolsy team

"DNS propagation" is the developer phrase for "I changed a DNS record and the change hasn't taken effect everywhere yet". It's one of those topics where most people have a vague mental model — something about caches — and that's enough to mostly work. Until it isn't, and then a deployment fails at 2 AM because example.com still resolves to the old IP for half your users.

This post walks through what happens when you type a hostname into a browser, where the caches are, why "propagation" can take 48 hours when it should take seconds, and what you can do about it.

What "resolving a domain" actually does

Let's trace what happens when you visit https://example.com for the first time. The browser needs an IP address; it doesn't have one. So it asks its resolver:

$ dig example.com +trace

; <<>> DiG 9.10.6 <<>> example.com +trace
; (1 server found)

.                  IN  NS  a.root-servers.net.
.                  IN  NS  b.root-servers.net.
... (11 more root servers)

com.               IN  NS  a.gtld-servers.net.
com.               IN  NS  b.gtld-servers.net.
... (11 more TLD servers)

example.com.       IN  NS  a.iana-servers.net.
example.com.       IN  NS  b.iana-servers.net.

example.com.       IN  A   93.184.215.14

The recursion goes:

  1. Ask a root server — "who handles .com?" Root servers don't know about example.com; they only know which servers handle the top-level domain com.
  2. Ask a .com server — "who handles example.com?" It returns the nameservers authoritative for example.com.
  3. Ask example.com's nameservers — "what's the A record for example.com?" They return the IP.

Three round-trips, each potentially across the planet. Doing this on every page load would make the web unusable. That's why every step is cached.
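
The three steps above can be sketched as a loop that follows referrals. This is a toy model rather than a real resolver: the SERVERS table is hypothetical hard-coded data standing in for network queries, using the server names from the trace, and real resolvers handle many cases (glue records, retries, multiple nameservers) that are left out here.

```python
# Toy model of iterative resolution. SERVERS is hypothetical, hard-coded data
# standing in for real network queries; no packets are sent.
SERVERS = {
    "a.root-servers.net.": {"referrals": {"com.": "a.gtld-servers.net."}},
    "a.gtld-servers.net.": {"referrals": {"example.com.": "a.iana-servers.net."}},
    "a.iana-servers.net.": {"answers": {"example.com.": "93.184.215.14"}},
}


def resolve(name, server="a.root-servers.net."):
    """Follow referrals from the root until some server answers for `name`.

    Returns (address, path), where path lists the servers asked, in order.
    """
    path = []
    while True:
        path.append(server)
        data = SERVERS[server]
        # An authoritative server answers directly.
        if name in data.get("answers", {}):
            return data["answers"][name], path
        # Otherwise pick the referral whose zone is a suffix of the name.
        for zone, next_server in data.get("referrals", {}).items():
            if name == zone or name.endswith("." + zone):
                server = next_server
                break
        else:
            raise LookupError(f"{server} has no answer or referral for {name}")


address, path = resolve("example.com.")
print(address)
for hop in path:
    print("asked", hop)
```

Each entry in path corresponds to one of the three round-trips; a cache hit at any level would let the resolver skip the hops above it.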

The cache hierarchy

Between you and example.com's authoritative server, there are at least four caches:

  1. Your browser cache. Chromium-based browsers cache DNS lookups in memory; clear the cache from chrome://net-internals/#dns or restart the browser.
  2. Your OS cache. Windows and macOS cache DNS lookups system-wide, as do Linux distributions that run a local caching service such as systemd-resolved.
  3. Your local resolver. Usually your router, which often forwards queries upstream, or the DNS server your network is configured to use directly (8.8.8.8, 1.1.1.1, your ISP).
  4. The recursive resolver's upstream chain. 8.8.8.8 isn't a single server; it's an anycast fleet spread across many locations, and different locations can hold different cache states.

Each cache obeys the record's TTL (Time-To-Live). The TTL is part of the DNS response — the authoritative server says "this answer is good for 3600 seconds; check back after that". Each cache holds the record for at most that long.

When the TTL expires, the cache discards the entry. The next request goes back through the recursion (or at least back to a higher cache). For a frequently requested name, almost every lookup in between is answered from cache; for a rarely requested one, the entry has usually expired by the time the next request comes in, and the recursion happens fresh.
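
As a rough illustration of the expiry behaviour described above, here is a minimal TTL cache in Python. The class name TTLCache and its methods are made up for this sketch, and the clock is injectable so that expiry can be simulated without waiting; real resolver caches are more involved (they may clamp TTLs, prefetch, or evict under memory pressure).

```python
import time


class TTLCache:
    """Minimal TTL cache; `clock` is injectable so expiry can be simulated."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expires_at)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        """Return (value, remaining_ttl), or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[name]  # expired: the next lookup must refetch
            return None
        return value, remaining


# Simulated clock, so the example runs instantly.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com.", "93.184.215.14", ttl=3600)

now[0] = 1768
print(cache.get("example.com."))  # still cached, 1832 seconds remaining

now[0] = 3600
print(cache.get("example.com."))  # expired, so this prints None
```

Note that the cache never learns that the authoritative value changed; it only learns that its own copy got too old.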

Inspecting the cache state

Useful commands:

# What does my system resolver return?
$ dig example.com
;; ANSWER SECTION:
example.com.    1832    IN    A    93.184.215.14
                 ^ TTL remaining in this cache

# Bypass system cache; ask a specific resolver directly
$ dig @8.8.8.8 example.com

# Bypass ALL caches; ask the authoritative nameserver
$ dig @a.iana-servers.net example.com

# What does another resolver see (might have different cache state)?
$ dig @1.1.1.1 example.com

The TTL in the answer section shows how long this specific cache will hold the record. A high number (close to the original TTL) means the cache fetched recently. A low number (counting down) means it's about to expire.
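
If you know the TTL the authoritative server publishes, the remaining TTL also tells you when the cache last fetched. A small sketch, assuming the five-column answer format shown above and a published TTL of 3600 (an assumption for illustration; check the authoritative server for the real value):

```python
def parse_answer_line(line):
    """Split a dig answer line into its five fields (name, TTL, class, type, data)."""
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return {"name": name, "ttl": int(ttl), "class": rclass, "type": rtype, "data": rdata}


def cache_age(original_ttl, remaining_ttl):
    """Seconds since the cache fetched the record, assuming it stored the
    authoritative TTL unmodified (some resolvers clamp TTLs)."""
    return original_ttl - remaining_ttl


record = parse_answer_line("example.com.    1832    IN    A    93.184.215.14")
print(record["ttl"])                   # 1832
print(cache_age(3600, record["ttl"]))  # 1768
```

Here the cache fetched the record 1768 seconds ago and will ask again in 1832 seconds.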

What "propagation" actually means

"DNS propagation" is misleading because DNS doesn't propagate — it expires. When you change a record:

  1. The authoritative nameserver updates immediately.
  2. Caches around the world keep serving the old value until their TTL runs out.
  3. As each cache expires, it asks the authoritative server again and gets the new value.

So "propagation time" is really "maximum age of the oldest cached copy". If your TTL was 3600 (1 hour) before the change, expect caches to take up to 1 hour to catch up. If your TTL was 86400 (24 hours), expect up to 24 hours.

This is why you reduce the TTL before changing a record, not at the same time. The new TTL only applies to fetches that happen after the change is published. If your TTL was 86400 and you change it to 60 along with changing the IP, caches that fetched just before the change don't know about the new TTL; they'll hold the old IP for up to ~24 hours.
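
The arithmetic can be written down as a small function. This is a sketch under the assumption that every cache honors TTLs exactly; the function name and parameters are invented for illustration.

```python
def worst_case_stale_seconds(old_ttl, new_ttl, lowered_seconds_before_change):
    """Upper bound on how long a cache can keep serving the old value after
    the record changes, assuming caches honor TTLs exactly.

    old_ttl: TTL in effect before it was lowered
    new_ttl: the lowered TTL
    lowered_seconds_before_change: lead time between lowering the TTL and
        changing the record (0 means both were changed together)
    """
    # A cache that fetched just before the TTL was lowered still holds the
    # old TTL; this is how much of it is left at the moment of the change.
    leftover_old = max(0, old_ttl - lowered_seconds_before_change)
    # A cache that fetched after the lowering, just before the change, holds
    # the old value for up to new_ttl. With no lead time no such cache exists.
    leftover_new = new_ttl if lowered_seconds_before_change > 0 else 0
    return max(leftover_old, leftover_new)


print(worst_case_stale_seconds(86400, 60, 0))      # 86400: changed together
print(worst_case_stale_seconds(86400, 60, 86400))  # 60: lowered a day ahead
```

Lowering the TTL only half a day ahead leaves a worst case of 43200 seconds, which is why the lead time needs to cover the full old TTL.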

The TTL change playbook

Standard procedure for a clean DNS migration:

  1. One week before the change: lower TTL to 60 seconds. Caches that already hold the record keep it until the old (high) TTL runs out; new fetches start using 60.
  2. Wait out the highest previous TTL. A week covers typical values such as 3600 or 86400 with plenty of margin. By the end, all caches have rotated and are using the 60-second TTL.
  3. Make the actual change. Caches expire within 60 seconds and pick up the new value.
  4. After confirming everything works: raise the TTL back to 3600 or 86400. Long TTLs are good in steady state — they reduce DNS load and improve performance.
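
To turn the playbook into concrete deadlines, something like the following could work. It is a sketch: the function name, the dictionary keys, and the example cutover date are all made up, and it assumes caches honor TTLs exactly.

```python
from datetime import datetime, timedelta


def migration_schedule(cutover, old_ttl, low_ttl=60):
    """Derive playbook deadlines from the planned cutover time.

    cutover: datetime of the record change (step 3)
    old_ttl: highest TTL in effect before lowering, in seconds
    low_ttl: temporary TTL used during the migration, in seconds
    """
    return {
        # Step 1 must happen at least one old-TTL period before the cutover.
        "lower_ttl_by": cutover - timedelta(seconds=old_ttl),
        "cutover": cutover,
        # Caches that honor TTLs have the new value by this time, so this is
        # the earliest point to consider retiring the old target.
        "old_value_expired_by": cutover + timedelta(seconds=low_ttl),
    }


# Arbitrary example date; old TTL of 24 hours.
plan = migration_schedule(datetime(2025, 6, 10, 9, 0), old_ttl=86400)
for step, when in plan.items():
    print(f"{step}: {when:%Y-%m-%d %H:%M:%S}")
```

The "lower_ttl_by" value is the latest safe moment for step 1; lowering earlier, as the playbook suggests, only adds margin.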

Sometimes you can't do this lead-up: a server just failed and you need to fail over right now. In that case, accept that some users will hit the old IP for hours and plan accordingly (keep the old server up if you can; serve a 503 with retry information).

The "but it works for me" problem

You change a record, refresh the site, it works. Your colleague refreshes, it still shows the old version. What gives?

Different caches. Your DNS resolver expired the entry; theirs didn't. Or you both have it expired but their browser cached the IP in its own internal cache. Or one of you went through a corporate proxy and the other didn't.

How to check from your machine:

# What does YOUR resolver currently return?
$ dig example.com

# What does Google's public resolver currently return?
$ dig @8.8.8.8 example.com

# What does Cloudflare's currently return?
$ dig @1.1.1.1 example.com

# What does the authority say (the truth)?
$ dig @a.iana-servers.net example.com

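If they all agree, the change is live on every resolver you checked; other resolvers, and other nodes behind the same public address, may still lag. If dig against the authority differs from dig against 8.8.8.8, the public resolver is still serving a cached copy. If your local dig differs from the authority but 8.8.8.8 matches it, your local resolver is the one holding the stale copy.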
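
The comparison is mechanical enough to script once you have the answers. A minimal sketch, assuming you have already collected each resolver's A records by hand (for example from the dig commands above); the function name, the dictionary layout, and the new IP 203.0.113.10 (a documentation-range address) are hypothetical.

```python
def stale_resolvers(answers, authority="authoritative"):
    """Return the names of resolvers whose answer differs from the authority's.

    answers: dict mapping a resolver label to the set of IPs it returned.
    """
    truth = answers[authority]
    return [who for who, ips in answers.items() if who != authority and ips != truth]


# Hypothetical snapshot taken shortly after changing the record.
answers = {
    "authoritative": {"203.0.113.10"},
    "8.8.8.8": {"203.0.113.10"},
    "1.1.1.1": {"93.184.215.14"},
    "local": {"93.184.215.14"},
}
print(stale_resolvers(answers))  # ['1.1.1.1', 'local']
```

An empty list only means the resolvers you sampled have caught up, not that every cache in the world has.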

The website dnschecker.org queries 30+ resolvers worldwide and shows the result from each. Useful for confirming a global change has actually rolled out.

Cache flushing — last resort, not first

You can manually flush each cache, but the effect is local — it only affects your machine. For a global rollout, you can't flush every resolver in the world; you wait for TTLs.

# macOS
$ sudo dscacheutil -flushcache
$ sudo killall -HUP mDNSResponder

# Linux (systemd-resolved)
$ sudo resolvectl flush-caches

# Windows
$ ipconfig /flushdns

# Browser (Chromium)
chrome://net-internals/#dns → "Clear host cache"

Use these when debugging — to confirm you're not seeing a local cache. Not as a deployment step.
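
If you keep a debugging script around, the commands above can live in a lookup table keyed by platform. This sketch only returns the command strings and executes nothing; the Linux entry assumes systemd-resolved, as in the list above, and the names FLUSH_COMMANDS and flush_commands are made up.

```python
import platform

# Commands copied from the list above; nothing is executed here.
FLUSH_COMMANDS = {
    "Darwin": ["sudo dscacheutil -flushcache", "sudo killall -HUP mDNSResponder"],
    "Linux": ["sudo resolvectl flush-caches"],  # assumes systemd-resolved
    "Windows": ["ipconfig /flushdns"],
}


def flush_commands(system=None):
    """Return the flush commands for `system` (defaults to this machine)."""
    system = system or platform.system()
    return FLUSH_COMMANDS.get(system, [])


for command in flush_commands():
    print(command)
```

Printing rather than running keeps the sudo step a deliberate, manual action.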

Negative caching: when "no record exists" itself gets cached

If you query for foo.example.com and the authoritative server says "no such record", that NXDOMAIN response also gets cached. The TTL for negative responses comes from the zone's SOA record (the lesser of the SOA's own TTL and its minimum field, per RFC 2308), not from the queried record's TTL (which doesn't exist).

This catches people who add a new subdomain after first querying it and getting NXDOMAIN. The negative cache holds for whatever the SOA minimum is — usually 5 minutes to an hour. During that time, the subdomain "doesn't exist" from the cache's perspective, even after you publish the real record.

Defense: when adding a new subdomain or record, don't pre-query it. Or be prepared to wait out the SOA minimum after publishing.
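
To estimate how long a pre-queried name stays invisible, here is a small sketch. It uses the RFC 2308 rule that the negative TTL is the lesser of the SOA record's own TTL and its minimum field; the function names and the example numbers are hypothetical, and some resolvers cap negative TTLs lower than this.

```python
def negative_cache_ttl(soa_ttl, soa_minimum):
    """Negative-caching TTL per RFC 2308: the lesser of the SOA record's own
    TTL and its MINIMUM field."""
    return min(soa_ttl, soa_minimum)


def visible_after(nxdomain_cached_at, published_at, soa_ttl, soa_minimum):
    """Earliest time (seconds, same clock as the inputs) at which a cache that
    stored NXDOMAIN at `nxdomain_cached_at` will see a record published at
    `published_at`, assuming the cache honors the negative TTL exactly."""
    negative_expiry = nxdomain_cached_at + negative_cache_ttl(soa_ttl, soa_minimum)
    return max(published_at, negative_expiry)


# Queried too early at t=0, record published at t=120,
# SOA TTL 3600 and SOA minimum 900 (illustrative values).
print(visible_after(0, 120, soa_ttl=3600, soa_minimum=900))  # 900
```

In this example the record exists from second 120, but the cache that asked too early keeps answering NXDOMAIN until second 900.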

Special cases: CNAME chains, CAA, DNSSEC

CNAME chains

If www.example.com is a CNAME to example.com, and example.com has an A record, the resolver has to do TWO lookups (or gets one combined response, depending on the server). Both records have independent TTLs. A change to the A record propagates as fast as the A record's TTL; the CNAME's TTL doesn't matter for that change. It only matters when you repoint the CNAME itself.
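
A sketch of the chain walk, to make the independent TTLs concrete. The RECORDS table and its TTL values are hypothetical, and the function models a lookup against static data rather than performing real queries.

```python
# Hypothetical records: name -> (type, value, ttl). TTLs are illustrative.
RECORDS = {
    "www.example.com.": ("CNAME", "example.com.", 3600),
    "example.com.": ("A", "93.184.215.14", 300),
}


def follow_chain(name, records, max_hops=8):
    """Follow CNAMEs until an A record; return (address, [(name, type, ttl), ...])."""
    hops = []
    for _ in range(max_hops):
        rtype, value, ttl = records[name]
        hops.append((name, rtype, ttl))
        if rtype == "A":
            return value, hops
        name = value  # CNAME: continue with the target name
    raise RuntimeError("CNAME chain too long or looping")


address, hops = follow_chain("www.example.com.", RECORDS)
print(address)
for hop in hops:
    print(hop)

# A change to the final A record is bounded by that record's TTL alone;
# a change to the CNAME's target is bounded by the CNAME's TTL.
a_change_bound = hops[-1][2]
cname_change_bound = hops[0][2]
```

With these illustrative values, changing the IP takes at most 300 seconds to reach www.example.com, while repointing the CNAME could take up to 3600.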

CAA records and certificate issuance

CAA records control which Certificate Authorities can issue TLS certs for your domain. They're checked by the CA at issuance time, not by browsers and not on every request. Changes to CAA don't have a propagation issue for end users; they take effect the next time you (or someone trying to spoof your domain) request a cert.

DNSSEC

DNSSEC adds cryptographic signatures to DNS responses. Validating resolvers verify the signature chain up to the root. If signatures don't validate (because you rotated keys without coordinating, or because of a registrar bug), validating resolvers return SERVFAIL for the affected zone, so it appears broken for everyone behind them. DNSSEC failures are particularly nasty because the response isn't "wrong answer" but "no answer", and every name in the zone fails at once.

If you've enabled DNSSEC, always verify the chain after any nameserver or zone change. Tools like DNSViz show the full signature chain visually.

Practical rules

  1. Lower TTLs before planned changes. A week ahead, set TTL to 60. After the change is verified, raise back to 3600 or 86400.
  2. Don't query records before publishing them. Avoids negative caching of NXDOMAIN.
  3. Test against the authoritative server. dig @a.iana-servers.net example.com tells you what the truth is, bypassing all caches.
  4. Use dnschecker.org or similar for global verification. Don't trust "it works from my laptop".
  5. Keep old infrastructure up during a migration. Until all caches have rotated to the new IP, you'll still see traffic at the old one. Plan to keep it running for at least the maximum TTL that was in effect.
  6. For critical changes (apex domains), test from outside your network. Your office likely has its own DNS cache. Use a phone on mobile data, or a remote server, to confirm changes are visible from outside.

The mental model

If you internalize one thing: DNS records don't propagate; caches expire. Every cache between you and the truth has its own clock running down. "Propagation time" is "the longest clock still running", which means the highest TTL that was in effect at the time of the change.

Plan changes around expected cache lifetimes. Lower TTLs before, verify after, keep the old answer working during the transition. Most DNS-related outages I've seen come from skipping one of these three steps.

You can use our DNS record lookup to inspect records from our server's perspective (a different resolver than yours), which is often useful when you can't tell whether the issue is your local cache or the actual record.
