File Hashing & Checksums: Verify Downloads and Detect Changes
What Is a File Hash?
A hash function takes any input (a file, text, or data) and produces a fixed-length string of characters — the hash or checksum. Key properties:
- Deterministic — Same input always produces the same hash
- Fixed length — Output is always the same size regardless of input (SHA-256 = 64 hex characters)
- Avalanche effect — Changing one bit of input completely changes the hash
- One-way — You cannot reverse a hash to get the original data
- Collision resistant — It should be practically impossible to find two different inputs with the same hash
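The first three properties are easy to observe directly. The sketch below uses Python's standard `hashlib`; the helper names `sha256_hex` and `differing_bits` are illustrative and not part of any library.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

a = sha256_hex(b"hello")
b = sha256_hex(b"helln")  # 'o' (0x6f) and 'n' (0x6e) differ by a single bit

print(a == sha256_hex(b"hello"))            # deterministic: same input, same hash
print(len(a), len(sha256_hex(b"x" * 10**6)))  # fixed length for any input size
print(differing_bits(a, b), "of 256 output bits changed")
```

For a well-designed hash, the last number should land somewhere near half of the 256 output bits, even though the inputs differ by only one bit.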
Use our File Hash Checker to compute hashes for any file in your browser.
Hash Algorithms Compared
| Algorithm | Output Length | Speed | Security | Status |
|---|---|---|---|---|
| MD5 | 128 bit (32 hex) | Very fast | Broken (collisions found) | Legacy only |
| SHA-1 | 160 bit (40 hex) | Fast | Broken (2017 collision) | Deprecated |
| SHA-256 | 256 bit (64 hex) | Fast | Secure | Recommended |
| SHA-512 | 512 bit (128 hex) | Fast (64-bit CPUs) | Secure | Recommended |
| SHA-3 | Variable (224 to 512 bit) | Moderate | Secure (different design) | Alternative |
| BLAKE3 | 256 bit (64 hex, default) | Fastest | Secure | Modern choice |
For file verification, SHA-256 is the standard. MD5 is still used for quick integrity checks (not security) because it is fast and widely supported.
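The output lengths in the table can be checked with a short script. This is a minimal sketch using Python's standard `hashlib`; BLAKE3 is omitted because it is not in the standard library, and the names `ALGORITHMS` and `digest_lengths` are made up for this example.

```python
import hashlib

# Algorithms from the table that ship with Python's hashlib.
# BLAKE3 needs a third-party package, so it is left out here.
ALGORITHMS = ["md5", "sha1", "sha256", "sha512", "sha3_256"]

def digest_lengths(data: bytes) -> dict:
    """Map each algorithm name to (output bits, hex characters)."""
    result = {}
    for name in ALGORITHMS:
        h = hashlib.new(name, data)
        result[name] = (h.digest_size * 8, len(h.hexdigest()))
    return result

for name, (bits, hex_chars) in digest_lengths(b"example").items():
    print(f"{name:9} {bits:4} bits  {hex_chars:3} hex chars")
```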
Verifying Downloads
Software distributors publish checksums alongside downloads. To verify:
- Download the file and the checksum (usually a .sha256 or .md5 file)
- Compute the hash of your downloaded file
- Compare your computed hash with the published one
- If they match, the file is identical to what was published
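This catches corrupted downloads (network errors) and altered copies on third-party mirrors. It catches deliberate tampering (for example a man-in-the-middle attack) only when the published checksum comes from a channel the attacker cannot also modify, such as the project's own HTTPS site or a signed checksum file.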
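The verification steps above can be sketched in Python. This assumes the checksum file uses the common `<hex digest>  <filename>` layout and holds a SHA-256 hash; the function names `sha256_of_file`, `read_published_hash`, and `verify_download` are hypothetical.

```python
import hashlib
import hmac

def sha256_of_file(path: str) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def read_published_hash(checksum_path: str) -> str:
    """Return the first whitespace-separated token of the checksum file.

    Assumption: the file looks like "<hex digest>  <filename>".
    """
    with open(checksum_path, "r", encoding="utf-8") as f:
        return f.read().split()[0].lower()

def verify_download(file_path: str, checksum_path: str) -> bool:
    """True if the file's SHA-256 matches the published checksum."""
    expected = read_published_hash(checksum_path)
    actual = sha256_of_file(file_path)
    # compare_digest is not strictly needed for public checksums,
    # but it is a safe default for comparing digests.
    return hmac.compare_digest(actual, expected)
```

A call such as `verify_download("file.zip", "file.zip.sha256")` returns `True` only when the two hashes match exactly.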
Command Line Tools
# macOS / Linux
shasum -a 256 file.zip # SHA-256
md5sum file.zip # MD5 (Linux)
md5 file.zip # MD5 (macOS)
# Verify against published checksum (two spaces between hash and filename)
echo "abc123...  file.zip" | shasum -a 256 -c
# Windows (PowerShell)
Get-FileHash file.zip -Algorithm SHA256
# Multiple files
shasum -a 256 *.iso
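# Compare two files (are they identical?)
# Hash from stdin so the differing filenames are not part of the compared output
diff <(shasum -a 256 < file1.zip) <(shasum -a 256 < file2.zip) && echo "identical"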
Python
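import hashlib

def file_hash(path, algo='sha256'):
    """Return the hex digest of the file at path, read in 8 KiB chunks."""
    h = hashlib.new(algo)
    with open(path, 'rb') as f:
        # iter() with a sentinel stops when read() returns b'' (end of file)
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return h.hexdigest()

print(file_hash('download.zip'))         # SHA-256
print(file_hash('download.zip', 'md5'))  # MD5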
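On Python 3.11 and later, the standard library offers `hashlib.file_digest`, which does the chunked reading itself. The variant below is a sketch that uses it when present and falls back to the manual loop otherwise; the name `file_hash_modern` is made up for this example.

```python
import hashlib

def file_hash_modern(path: str, algo: str = "sha256") -> str:
    """Hash a file, preferring hashlib.file_digest when it is available."""
    with open(path, "rb") as f:
        if hasattr(hashlib, "file_digest"):  # Python 3.11+
            return hashlib.file_digest(f, algo).hexdigest()
        # Fallback for older Python versions: read in fixed-size chunks.
        h = hashlib.new(algo)
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
        return h.hexdigest()
```

Both paths produce the same digest, so callers do not need to know which one ran.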
Practical Uses Beyond Download Verification
| Use Case | How Hashing Helps |
|---|---|
| Password storage | Store a salted hash instead of the plaintext, using a deliberately slow password-hashing function (bcrypt/argon2) rather than a fast hash like SHA-256 |
| Deduplication | Identify identical files by comparing hashes instead of content |
| Git version control | Every commit, tree, and blob is identified by its SHA-1 hash (Git's default object format) |
| Blockchain | Each block contains the hash of the previous block, forming a chain |
| Digital signatures | Sign the hash of a document instead of the entire document |
| Content addressing | IPFS and Docker use content hashes as identifiers |
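Two rows of the table are simple enough to sketch with the standard library: deduplication and Git's blob identifiers. The function names `find_duplicates` and `git_blob_id` are illustrative; the second assumes Git's default SHA-1 object format.

```python
import hashlib
from collections import defaultdict

def find_duplicates(paths):
    """Group file paths by SHA-256 digest; return only groups of 2+ files."""
    groups = defaultdict(list)
    for path in paths:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        groups[h.hexdigest()].append(path)
    return [group for group in groups.values() if len(group) > 1]

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a header ("blob", a space, the size, a NUL byte)
    # followed by the raw file content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()
```

`git_blob_id` should agree with what `git hash-object` reports for the same bytes, which is why identical files in a repository are stored only once.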
Frequently Asked Questions
Is MD5 still safe to use?
MD5 is broken for security purposes (collision attacks are practical). It is still acceptable for detecting accidental corruption, such as verifying file transfers, but SHA-256 is preferred for all new applications.
What does "collision" mean in hashing?
A collision occurs when two different inputs produce the same hash output. For secure hash functions, finding collisions should be computationally infeasible. MD5 and SHA-1 have known collision attacks.
Why are checksums different lengths?
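Different algorithms produce different output sizes. MD5 outputs 128 bits (32 hex chars), SHA-256 outputs 256 bits (64 hex chars). For an algorithm without known weaknesses, a longer output gives more collision resistance: a generic birthday attack on an n-bit hash takes roughly 2^(n/2) attempts.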
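To see why output length matters, the sketch below truncates SHA-256 to its first 2 bytes (16 bits) and searches for two inputs that collide. This is only a toy demonstration with a deliberately tiny output; the names `truncated_hash` and `find_collision` are hypothetical.

```python
import hashlib

def truncated_hash(data: bytes, n_bytes: int) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int):
    """Hash "0", "1", "2", ... until two inputs share a truncated hash.

    Returns (first_input, second_input, number_of_inputs_tried).
    """
    seen = {}
    counter = 0
    while True:
        msg = str(counter).encode("ascii")
        key = truncated_hash(msg, n_bytes)
        if key in seen:
            return seen[key], msg, counter + 1
        seen[key] = msg
        counter += 1

first, second, tries = find_collision(2)
print(f"{first!r} and {second!r} collide after {tries} inputs")
```

With 16 bits, a collision typically turns up after a few hundred inputs. Each extra bit of output makes the same search about 1.4 times more expensive, which is why the full 256-bit output is out of reach for this kind of brute force.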
Can I recover a file from its hash?
No. Hash functions are one-way — you cannot reverse them to get the original data. The hash is a fixed-size fingerprint, not an encoding of the data.
Which hash algorithm should I use?
SHA-256 for most purposes. BLAKE3 if you need maximum speed. SHA-512 for extra security margin. Never use MD5 or SHA-1 for security-critical applications.