FNV-1a vs DJB2: Picking a Hash Function for Hash Tables
When you write users["alice"] in your program, the language runtime needs to convert the string "alice" into a number: a bucket index in the underlying hash table. The function that does this conversion is the hash function, and the requirements are simple: it must be fast, deterministic, and distribute inputs reasonably evenly across all possible bucket indices.
Two functions show up in textbooks more than any other: FNV-1a and DJB2. Both are short, simple, fast, and used in real production code. Let's compare.
Quick comparison
- FNV-1a: better distribution, marginally slower, the modern preference
- DJB2: extremely simple, slightly faster, slightly worse distribution on adversarial inputs
- For production hash tables in modern languages: use whatever your stdlib uses (often a SipHash or xxHash variant)
- For learning, teaching, or embedded systems: FNV-1a is the better default
FNV-1a: Fowler-Noll-Vo
The FNV family was created by Glenn Fowler, Landon Curt Noll, and Phong Vo (hence the name) starting in 1991; FNV-1a is a later refinement of the original FNV-1. The 32-bit variant is:
```
function fnv1a_32(bytes):
    hash = 0x811C9DC5                        // FNV offset basis
    for each byte b in bytes:
        hash = hash XOR b
        hash = (hash × 0x01000193) mod 2^32  // FNV prime; keep the low 32 bits
    return hash
```
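In a language with unbounded integers, the mod 2^32 becomes an explicit mask after each multiply. A minimal, runnable Python sketch of the pseudocode above:

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = 0x811C9DC5                             # FNV offset basis
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF      # FNV prime; mask = mod 2^32
    return h

print(hex(fnv1a_32(b"a")))  # 0xe40c292c, the published FNV-1a test vector
```

The empty input hashes to the offset basis itself (0x811C9DC5), which is a handy sanity check for any implementation.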
Three lines of real work. Two magic constants (the offset basis and the prime). The prime is chosen for its bit-mixing properties, and the offset basis is itself derived by hashing a fixed string; neither is arbitrary.
Why it works
The XOR-then-multiply pattern is what gives FNV-1a its avalanche behavior. XORing with the input byte introduces variation; multiplying by a large prime smears that variation across all 32 bits of the hash. After a few bytes of input, the hash is well-mixed.
FNV-1 (without the "a") does the multiplication before the XOR — same operations, different order. The "1a" variant has slightly better avalanche properties and is what most implementations use today.
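The order difference is easy to see by swapping the two loop lines; the results diverge on any non-empty input. A sketch of the older FNV-1 ordering:

```python
def fnv1_32(data: bytes) -> int:
    """32-bit FNV-1: multiply by the prime first, THEN XOR in the byte."""
    h = 0x811C9DC5
    for b in data:
        h = (h * 0x01000193) & 0xFFFFFFFF
        h ^= b
    return h

# Same constants, same operations, different order: on the single byte "a",
# FNV-1 gives 0x050C5D7E while FNV-1a gives 0xE40C292C.
print(hex(fnv1_32(b"a")))
```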
DJB2: Bernstein's hash
DJB2 was published in comp.lang.c by Daniel J. Bernstein (the "djb" of DJB2) in 1991. It's even simpler:
```
function djb2(bytes):
    hash = 5381
    for each byte b in bytes:
        hash = (hash × 33 + b) mod 2^32
    return hash
```
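As with FNV-1a, a Python version just needs a mask in place of the mod. A minimal sketch:

```python
def djb2(data: bytes) -> int:
    """Bernstein's times-33 hash, truncated to 32 bits."""
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF  # equivalently ((h << 5) + h + b)
    return h

print(djb2(b"a"))  # 5381 * 33 + 97 = 177670
```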
One magic constant (5381), one multiplier (33). That's it. The multiplier can be computed as a left-shift-and-add: hash × 33 = (hash << 5) + hash, which is what makes DJB2 fast on processors without fast multipliers.
There's a variant called "djb2a" (or "djb2 XOR") that replaces the addition with XOR. Slightly better mixing on some inputs, same speed.
How they compare in practice
Speed
Both algorithms run at multiple GB/s on modern CPUs. DJB2 has a slight edge because its constant (33) can be implemented as a shift+add instead of a multiplication on architectures without a fast multiplier. But modern x86 and ARM cores both have fast, fully pipelined integer multipliers, so this advantage is largely historical.
Distribution
For random inputs, both hashes distribute essentially uniformly over the 32-bit output space. For structured inputs (real-world strings like usernames, file paths, English words), FNV-1a tends to have slightly better distribution because its avalanche behavior is more aggressive.
Empirically, with 1 million English words hashed into a 1-million-bucket hash table:
- FNV-1a: average load of ~1.0 keys per bucket, with occupancy close to what an ideal random function would produce
- DJB2: essentially the same, within noise
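A sketch of the kind of experiment behind numbers like these, using random 8-letter keys in place of a word list (which this example doesn't assume). At load factor 1.0, a perfectly random function leaves about 1/e ≈ 37% of buckets empty, so "occupied fraction near 0.63" is the benchmark both hashes should hit:

```python
import random
import string
from collections import Counter

def fnv1a_32(data: bytes) -> int:
    h = 0x811C9DC5
    for b in data:
        h = ((h ^ b) * 0x01000193) & 0xFFFFFFFF
    return h

def djb2(data: bytes) -> int:
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h

def bucket_stats(hash_fn, keys, n_buckets):
    """Return (occupied buckets, longest collision chain)."""
    loads = Counter(hash_fn(k) % n_buckets for k in keys)
    return len(loads), max(loads.values())

random.seed(42)
keys = {"".join(random.choices(string.ascii_lowercase, k=8)).encode()
        for _ in range(100_000)}

for fn in (fnv1a_32, djb2):
    occupied, worst = bucket_stats(fn, keys, len(keys))
    print(fn.__name__, occupied / len(keys), worst)
```

Both functions should report an occupied fraction close to 0.63 and a short worst-case chain, i.e. behavior indistinguishable from random for this kind of input.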
The difference only shows up on adversarial inputs — strings specifically constructed to collide. Neither algorithm is resistant to a determined attacker who wants to create a denial-of-service by overflowing a single bucket.
Hash flooding
"Hash flooding" is a DoS attack where the attacker submits many inputs that all hash to the same bucket. Lookup performance degrades from O(1) to O(n), and a server processing each request linearly scans an ever-growing collision list.
This attack was famously demonstrated against PHP, Python, and Java in 2011, leading to widespread adoption of randomized hash functions like SipHash. SipHash takes a secret key (chosen randomly at program startup) and mixes it into the hash computation, so an attacker can't precompute colliding inputs without knowing the key.
FNV-1a and DJB2 do not have this property. If you're hashing untrusted input that ends up in a public hash table, neither is safe. Use whatever your language's standard library uses (which is usually SipHash these days).
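One way to see the problem concretely: in DJB2, any two byte pairs (c1, c2) with the same value of 33*c1 + c2 contribute identically to the hash, and such blocks can be concatenated freely. A sketch that mass-produces full 32-bit collisions with no secret to guess:

```python
from itertools import product

def djb2(data: bytes) -> int:
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h

# 33*ord("a") + ord("z") == 33*ord("b") + ord("Y") == 3323, so the two
# blocks contribute identically and any concatenation of them collides.
blocks = (b"az", b"bY")
keys = [b"".join(combo) for combo in product(blocks, repeat=8)]

print(len(keys), len({djb2(k) for k in keys}))  # 256 keys, 1 unique hash
```

All 256 keys land in the same bucket of any table that derives its index from this hash, which is exactly the flooding scenario: the attacker needed only pencil and paper, because there is no key.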
When to use which
Use FNV-1a when:
- You need a hash function in a context where the standard library's hash isn't available (embedded systems, low-level protocols, in another language).
- You want good general-purpose distribution without performance tuning.
- You're hashing trusted input (internal IDs, application-controlled strings).
- Compatibility with existing FNV-1a data matters (some on-disk formats specify it).
Use DJB2 when:
- You're on extremely constrained hardware without a hardware multiplier.
- You're working with code that already uses DJB2 and consistency matters.
- You're studying hash functions and want the simplest possible example.
Don't use either when:
- The hash protects against adversaries. Use SipHash, xxHash3 with a secret key, or your language's randomized stdlib hash.
- You need to detect collisions in large data sets. 32 bits is too small. Use SHA-256 or xxHash64.
- You're hashing for security. Use SHA-256, BLAKE2, or BLAKE3.
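On the "32 bits is too small" point, the birthday bound makes it quantitative: collisions among random 32-bit hashes become likely far sooner than intuition suggests. A quick calculation using the standard approximation:

```python
import math

def collision_probability(n: int, bits: int = 32) -> float:
    """Birthday approximation: P(at least one collision) among n random hashes."""
    m = 2 ** bits
    return 1 - math.exp(-n * (n - 1) / (2 * m))

print(collision_probability(77_163))              # about 0.5: a coin flip at ~77k items
print(collision_probability(1_000_000))           # effectively certain at 1 million
print(collision_probability(1_000_000, bits=64))  # tiny with a 64-bit hash
```

This is why collision detection over large data sets wants at least 64 bits, and why content fingerprinting wants a cryptographic hash.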
What modern languages actually use
| Language / runtime | Hash function for strings |
|---|---|
| Python 3.4+ | SipHash-2-4 (SipHash-1-3 since 3.11) |
| Ruby 2.0+ | SipHash |
| Rust (default) | SipHash-1-3 (configurable) |
| Go (built-in maps) | memhash (aesenc-based on x86) |
| Java HashMap | String.hashCode (variant of DJB2 with multiplier 31) |
| JavaScript V8 | internal, varies by string length |
| PHP | times-33 (DJBX33A), with input limits as the flooding defense |
Java's String.hashCode() is the closest commonly-encountered relative of DJB2: the same one-line pattern, with multiplier 31 and an initial value of 0 instead of 5381. Since Java 8, HashMap also converts heavily-collided buckets into balanced trees, which caps the damage from hash flooding at O(log n) per lookup.
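For concreteness, here is Java's string-hash recipe transliterated to Python (a sketch of the algorithm, not Java itself): initial value 0, multiplier 31, with the result wrapped to a signed 32-bit integer the way Java's int arithmetic does:

```python
def java_string_hashcode(s: str) -> int:
    """Python transliteration of Java's String.hashCode()."""
    h = 0
    for c in s:
        h = (h * 31 + ord(c)) & 0xFFFFFFFF
    # Java ints are signed 32-bit, so reinterpret the top bit as a sign
    return h - 0x100000000 if h >= 0x80000000 else h

print(java_string_hashcode("abc"))  # 96354, matching "abc".hashCode() in Java
```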
The bigger picture
For learning purposes, FNV-1a and DJB2 are great. They fit on a napkin, they work, and they distribute well for normal inputs. For production code, prefer your language's built-in hash function for strings — it's been tuned for your specific runtime, often includes SIMD acceleration, and includes randomization to defend against hash flooding.
Where you'll legitimately reach for FNV-1a or DJB2 in 2026: embedded systems, low-level networking code, file format implementations that specify them, and educational contexts. That's a narrower niche than 20 years ago, but it's a real one.
See FNV-1a and DJB2 computed in real time with our FNV-1a generator and DJB2 generator. Or compare all 10 hash algorithms on the same input with the All Algorithms tool.