CRC-32 vs Adler-32 vs MD5 for File Checksums: When to Use Which
Not every checksum needs to defend against an attacker. Sometimes you just want to know whether a file was corrupted in transit, whether two memory pages are duplicates, or whether to invalidate a cache. For those non-adversarial integrity checks, MD5 and SHA-256 are overkill. The right tool is one of the fast 32-bit hashes: CRC-32, Adler-32, FNV-1a, or xxHash.
Each has a specific niche. Picking the wrong one isn't usually catastrophic, but it's an easy place to leave performance or reliability on the table.
Quick reference
- CRC-32: best general-purpose error detection. Used in Ethernet, ZIP, PNG, gzip. Hardware-accelerated on most CPUs.
- Adler-32: simpler and faster than software CRC-32, but weaker error detection, especially on short inputs. Used in zlib's RFC 1950 wrapper.
- FNV-1a: simple, fast, good distribution. Common in hash tables.
- xxHash: modern, extremely fast (10+ GB/s). Use for content-addressable storage where speed matters more than universal compatibility.
- MD5: only if you specifically need 128-bit output without security. Otherwise overkill for non-adversarial checksums.
CRC-32: the classic
CRC-32 (Cyclic Redundancy Check, 32-bit) was invented for digital communications. The variant most commonly called "CRC-32" today is the one specified in IEEE 802.3 (Ethernet), with polynomial 0x04C11DB7, usually written in software in its bit-reflected form 0xEDB88320. It's used in:
- Ethernet frames (every packet on your network)
- ZIP and gzip archives (one CRC per file)
- PNG image files
- SATA, USB, and many other hardware protocols
- Linux's cksum utility
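CRC-32 has a precisely known error detection profile: it catches every burst error of 32 bits or fewer, every single-bit and double-bit error in any realistically sized message, and all but a tiny fraction (about 1 in 2³²) of longer bursts. It does not catch every odd number of bit-flips: that guarantee requires a generator polynomial divisible by x + 1, which the IEEE 802.3 polynomial is not. These deterministic guarantees are why it's the default in low-level protocols where flipped bits are the failure mode you care about.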
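As a minimal sketch of what the algorithm computes, the bit-at-a-time loop below uses the reflected polynomial 0xEDB88320 and checks itself against Python's zlib.crc32. The function name crc32_bitwise is illustrative only; real code would use a table-driven or hardware-accelerated routine.

```python
import zlib

POLY_REFLECTED = 0xEDB88320  # bit-reflected form of the IEEE 802.3 polynomial


def crc32_bitwise(data: bytes) -> int:
    """Slow reference CRC-32: processes one bit per inner-loop step."""
    crc = 0xFFFFFFFF  # initial value: all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; if the bit shifted out was 1, subtract (XOR) the polynomial.
            if crc & 1:
                crc = (crc >> 1) ^ POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # final XOR


# 0xCBF43926 is the standard CRC-32 check value for the ASCII string "123456789".
assert crc32_bitwise(b"123456789") == 0xCBF43926
assert crc32_bitwise(b"hello world") == zlib.crc32(b"hello world")
```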
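Modern x86_64 CPUs provide a hardware CRC instruction (crc32q, part of SSE 4.2 since 2008), but note that it computes CRC-32C (the Castagnoli polynomial), not the IEEE 802.3 CRC-32. The IEEE variant is typically accelerated with carry-less multiplication (PCLMULQDQ) instead. ARM offers instructions for both polynomials, optional in ARMv8.0 and mandatory from ARMv8.1. On hardware with these extensions, CRC runs at 20-40 GB/s, faster than most disks can read.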
Adler-32: the fast one
Adler-32 was designed by Mark Adler (yes, same Adler) in 1995 for zlib. The algorithm is simpler than CRC-32: two running 16-bit sums, modulo 65521. The hash is the concatenation of the two sums.
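The two-sum description above can be sketched in a few lines. This is a straightforward, unoptimized version (the name adler32_simple is hypothetical), checked against zlib.adler32; real implementations defer the modulo to process many bytes per reduction.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def adler32_simple(data: bytes) -> int:
    """Reference Adler-32: a is the byte sum, b is the running sum of a."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    # b goes in the high 16 bits, a in the low 16 bits.
    return (b << 16) | a


assert adler32_simple(b"Wikipedia") == 0x11E60398
assert adler32_simple(b"hello world") == zlib.adler32(b"hello world")
```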
Adler-32 is faster than software CRC-32 on inputs larger than about 64 bytes, but slower than hardware-accelerated CRC-32. Its error detection is weaker: it can miss some specific patterns that CRC-32 catches reliably. Notably, for very short inputs (under ~256 bytes) Adler-32's effective output space is much less than 32 bits, because the byte sum stays far below the modulus 65521 and never wraps, leaving the upper bits of that half largely unused.
Where you'll encounter Adler-32:
- zlib's RFC 1950 wrapper format, found in PNG's IDAT chunks and in HTTP Content-Encoding: deflate. (HTTP gzip and .gz files use a different wrapper, RFC 1952, whose trailer is a CRC-32.)
- Rsync's rolling checksum (rsync's first-pass weak hash is a variant of Adler-32)
For new code, Adler-32 has been largely superseded. CRC-32 is faster on modern hardware and has stronger guarantees; xxHash is faster on large data without hardware support.
The benchmark
Approximate throughput on a 2024 server CPU (numbers from various published benchmarks; your mileage will vary):
| Algorithm | Throughput (single thread) | Hardware accelerated? |
|---|---|---|
| CRC-32C (SSE 4.2) | 25-40 GB/s | Yes (crc32 instruction) |
| CRC-32 (software) | 1-3 GB/s | No |
| Adler-32 | 3-5 GB/s | No |
| xxHash64 | 20-30 GB/s | No (just well-designed) |
| FNV-1a 32 | 1-2 GB/s | No |
| MD5 | 500-800 MB/s | Rare |
| SHA-256 (SHA-NI) | 1-2 GB/s | Yes (Intel SHA-NI) |
| SHA-256 (software) | 200-400 MB/s | No |
Collision rates
All 32-bit hashes have 2³² ≈ 4.3 billion possible output values. By the birthday paradox, you should expect the first collision among random inputs after roughly √(2³²) = 65,536 items. For 1 million items, you'll have over a hundred colliding pairs on average (about n²/2³³ ≈ 116). This is fine for error detection (you're checking a known expected value, not searching for collisions) but problematic if you're using the hash as a unique identifier for millions of objects.
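The birthday arithmetic is easy to check for your own item counts. The sketch below assumes an ideal hash with uniformly distributed outputs; both function names are illustrative.

```python
import math


def expected_colliding_pairs(n_items: int, bits: int) -> float:
    """Expected number of colliding pairs among n_items uniform hash values."""
    pairs = n_items * (n_items - 1) / 2
    return pairs / 2**bits


def items_for_half_collision_chance(bits: int) -> float:
    """Approximate item count at which a collision becomes 50% likely."""
    return math.sqrt(2 * math.log(2) * 2**bits)


print(f"{expected_colliding_pairs(1_000_000, 32):.1f} pairs for 1M items, 32 bits")
print(f"{items_for_half_collision_chance(32):,.0f} items for a 50% chance, 32 bits")
print(f"{expected_colliding_pairs(1_000_000_000, 64):.3f} pairs for 1B items, 64 bits")
```

The same functions work for 64-bit or 128-bit outputs by changing the bits argument.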
Empirical collision rates from published benchmarks over short random inputs:
| Input length | Adler-32 effective bits | CRC-32 effective bits |
|---|---|---|
| 1 byte | ~8 | ~8 |
| 4 bytes | ~18 | ~32 (full) |
| 16 bytes | ~22 | ~32 (full) |
| 256 bytes | ~28 | ~32 (full) |
CRC-32 hits its full 32-bit output space at very short input lengths and distributes evenly. Adler-32 takes much longer to "mix" — for small messages, its effective entropy is significantly less than 32 bits.
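One way to see why, as a rough sketch: for an n-byte input, neither of Adler-32's sums can exceed a bound that grows slowly with n, so many 32-bit values are unreachable. The function below (a hypothetical helper) computes only a loose upper bound on the usable bits from those ranges; measured figures will be lower because not every combination of the two sums is reachable.

```python
import math

MOD_ADLER = 65521


def adler32_bits_upper_bound(n: int) -> float:
    """Upper bound on log2(distinct Adler-32 values) over all n-byte inputs."""
    # a = 1 + sum of bytes: at most 255*n + 1 distinct values before wrapping.
    count_a = min(255 * n + 1, MOD_ADLER)
    # b = n + sum of bytes weighted n, n-1, ..., 1: bounded the same way.
    count_b = min(255 * n * (n + 1) // 2 + 1, MOD_ADLER)
    # The checksum cannot carry more information than the input itself (8*n bits).
    return min(8.0 * n, math.log2(count_a) + math.log2(count_b))


for n in (1, 4, 16, 256):
    print(f"{n:>3} bytes: at most {adler32_bits_upper_bound(n):.1f} bits")
```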
Which one to use for what
Detecting accidental corruption in network/storage data
CRC-32. Hardware-accelerated, mathematically guaranteed error detection, universal library support. Default choice.
Building a custom protocol from scratch
CRC-32 (or CRC-32C, the variant the x86 SSE 4.2 instruction computes) if your platform has hardware acceleration; xxHash64 otherwise. xxHash gives you 64-bit output (much less collision risk for large data sets) and stays fast in portable code without needing special CPU support.
Hashing keys for in-memory hash tables
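FNV-1a or xxHash. Both have good distribution properties, and FNV-1a in particular is extremely simple to implement. Note that many language standard libraries default to a keyed hash such as SipHash for HashMap / dict / HashSet to resist hash-flooding attacks, so FNV-1a and xxHash are choices for tables whose keys you trust.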
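To show how little code FNV-1a needs, here is a minimal 32-bit sketch using the published offset basis and prime. The function name and the bucket example are illustrative, and a Python loop like this is far slower than a native implementation.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR each byte in, then multiply by the prime (mod 2**32)."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


assert fnv1a_32(b"") == 0x811C9DC5
assert fnv1a_32(b"a") == 0xE40C292C

# Picking a bucket in a hypothetical 1024-slot table:
bucket = fnv1a_32(b"user:42") % 1024
```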
Content-addressable storage at scale
SHA-256. If you're storing millions of objects and using the hash as the address, 32-bit hashes have too many collisions. Even xxHash64's 64-bit output is borderline for billion-scale systems. Use SHA-256 (or BLAKE3 if speed is critical and you control both ends).
"I just want a fast checksum for development"
Whatever your platform makes easy. Node.js: crc32 package. Python: zlib.crc32(). Go: hash/crc32. Don't optimize this prematurely — almost any choice is fine if you're not at the scale where 100 MB/s vs 10 GB/s matters.
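For example, in Python a whole-file checksum needs nothing beyond the standard library. This sketch streams the file in chunks so memory use stays flat; crc32_file and its chunk_size default are illustrative choices, not a standard API.

```python
import zlib


def crc32_file(path: str, chunk_size: int = 1 << 20) -> int:
    """Return the CRC-32 of a file, reading it chunk_size bytes at a time."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # Passing the previous value continues the running checksum.
            crc = zlib.crc32(chunk, crc)
    return crc
```

Calling crc32_file on a path and formatting the result with format(value, "08x") gives the familiar eight-digit hex form.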
What you should NOT use these for
- Verifying file authenticity from untrusted sources. An attacker can craft a CRC-32 collision in microseconds. Use SHA-256.
- Password storage. Use bcrypt or Argon2id.
- Digital signatures. Use SHA-256 or stronger.
- Anything where someone benefits from a collision. 32 bits is too small. Period.
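The underlying reason CRC-32 offers no protection against an attacker is that it is linear (strictly, affine) over XOR: for equal-length messages, the checksum of a XOR b XOR c equals the XOR of the three checksums. The sketch below only demonstrates that property with zlib.crc32; it is not a collision-crafting tool, and xor_bytes is a helper written for this example.

```python
import os
import zlib


def xor_bytes(*blocks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


a, b, c = (os.urandom(64) for _ in range(3))

# The checksum of the combined message is predictable from the parts alone.
combined = zlib.crc32(xor_bytes(a, b, c))
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
assert combined == predicted
```

Because the effect of any change on the checksum can be computed algebraically, an attacker can solve for bytes that restore a target value. A cryptographic hash such as SHA-256 has no such structure.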