mbox vs Maildir vs Database: How Mail Actually Sits on Disk
Most email security writing skips the storage layer. People talk about TLS, about SPF, about brute-force protection on login — and then quietly assume that whatever happens to the message after delivery is some other team's problem. It usually isn't. The storage layer is where backups go wrong, where forensics happens, where encryption-at-rest decisions live, and where a successful attacker who got file-level access reads everything.
This is post 5 of the email security series. We're going to look at mbox, Maildir, mdbox, and database-backed storage, with a hex dump or directory listing for each. Then we'll talk about encryption at rest, and what an attacker with read access to your mail directory actually gets.
mbox: a single file, decades of locking pain
mbox is the original Unix mailbox format, going back to the 1970s. The whole inbox is a single text file: messages are concatenated, each one preceded by a separator line that starts with `From ` (the word "From" followed by a space).
Here's what an mbox file looks like in `xxd`:
00000000: 4672 6f6d 2073 656e 6465 7240 6578 7465 From sender@exte
00000010: 726e 616c 2e74 6573 7420 5475 6520 4170 rnal.test Tue Ap
00000020: 7220 3232 2031 343a 3132 3a32 3920 3230 r 22 14:12:29 20
00000030: 3236 0a52 6574 7572 6e2d 5061 7468 3a20 26.Return-Path:
00000040: 3c73 656e 6465 7240 6578 7465 726e 616c <sender@external
00000050: 2e74 6573 743e 0a52 6563 6569 7665 643a .test>.Received:
...
00000234: 5468 6973 2069 7320 6120 7465 7374 2e0a This is a test..
00000244: 0a46 726f 6d20 7361 6c6c 7940 7368 6f70 .From sally@shop
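00000254: 2e74 6573 7420 5475 6520 4170 7220 3232 .test Tue Apr 22
The line `From sender@external.test Tue Apr 22 14:12:29 2026` is the separator (the `From ` line). This raises an immediate question: what if the message body itself contains a line starting with `From `? The usual answer is quoting: the writer prepends `>`, so the line is stored as `>From `. Variants of the format differ on lines that are already quoted; in the mboxrd variant, `From ` becomes `>From ` and `>From ` becomes `>>From `, which keeps the quoting reversible.
The bigger problem is locking. Two processes writing to the same mbox at the same time will corrupt it. The history of mbox locking is the history of every Unix locking mechanism, applied haphazardly: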
- `flock()`: fast, advisory, but doesn't work over NFS
- `fcntl()` / `lockf()`: works over (some) NFS, but harder to use correctly
- Dotlock: create `mailbox.lock` as an exclusive marker file, with a stale-lock cleanup heuristic. Works everywhere, but the cleanup heuristic is itself a race condition
A modern Linux mail setup that uses mbox typically uses all three simultaneously to be safe (Dovecot calls this "fcntl + flock + dotlock"). Performance under load suffers proportionally.
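The dotlock idea is small enough to sketch. This is a minimal illustration in Python, assuming a local filesystem; the function names `acquire_dotlock` and `release_dotlock` are made up for this example, and stale-lock cleanup is deliberately left out because that is the part that is hard to get right.

```python
import os
import time


def acquire_dotlock(mailbox_path, retries=5, delay=0.1):
    """Try to create <mailbox_path>.lock exclusively; return the lock path or None."""
    lock_path = mailbox_path + ".lock"
    for _ in range(retries):
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can win the race to create the marker.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            time.sleep(delay)
            continue
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
        return lock_path
    return None


def release_dotlock(lock_path):
    """Remove the marker file so other writers can proceed."""
    os.unlink(lock_path)
```

A writer would call `acquire_dotlock("/var/spool/mail/alice")`, append its message, and then call `release_dotlock()`. If the writer crashes in between, the marker stays behind, which is exactly why real implementations need the stale-lock heuristic.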
mbox is mostly historical now. You'll find it in:
- Legacy Unix systems with /var/spool/mail/ delivery
- Mailing-list archives (where the single-file format is convenient for offline analysis)
- People who haven't migrated yet
If you're starting fresh in 2026, do not use mbox. Use Maildir.
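If you are one of the people who haven't migrated yet, Python's standard `mailbox` module can do the conversion. The sketch below is one way to approach it, not a tested migration tool: `mbox_to_maildir` is a name invented here, and it assumes the mbox fits the format Python's `mailbox.mbox` class expects.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a (possibly new) Maildir."""
    source = mailbox.mbox(mbox_path, create=False)
    # create=True builds tmp/, new/ and cur/ if the directory is missing.
    target = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    source.lock()  # take the mbox locks so a live delivery can't interleave
    try:
        for message in source:
            # MaildirMessage() carries over what it can of the mbox status flags.
            target.add(mailbox.MaildirMessage(message))
            copied += 1
    finally:
        source.close()  # close() also releases the lock
    return copied
```

Run it against a copy of the mbox first and compare the returned count with what your mail client reports before you delete anything.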
Maildir: one file per message
Maildir (designed by Dan Bernstein for qmail) replaced mbox's concatenated single file with a directory containing one file per message, plus a clever rename-based atomic delivery scheme.
A Maildir looks like:
~/Maildir/
├── tmp/ # in-progress deliveries
├── new/ # delivered, not yet seen by user
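└── cur/ # seen, with flags encoded in filename
Here's `tree` output for a small example:
Maildir/
├── tmp/
├── new/
│   └── 1714233149.M123P456.mail.example.com
└── cur/
    ├── 1714230020.M999P888.mail.example.com:2,S
    ├── 1714231233.M111P222.mail.example.com:2,RS
    └── 1714232889.M333P444.mail.example.com:2,ST
Three things are happening here.
The filename is the unique ID. The format is `<unix-time>.M<microseconds>P<pid>.<hostname>`, which makes name collisions between concurrent deliveries very unlikely.
Atomic delivery via rename. When `lmtp` delivers a message, it writes the file under `tmp/<unique-id>` and then calls `rename()` to move it to `new/<unique-id>`. Because `rename` is atomic on a local filesystem, readers never see a partially written message in `new/`.
Flags in the filename. The colon-2 suffix is the message flags. `:2,S` is seen, `:2,RS` is replied and seen (flag letters are stored in ASCII order), and `:2,T` is trashed. Changing a flag is just a rename, for example from `:2,` to `:2,S`.
The standard flag letters are: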
- `D`: Draft
- `F`: Flagged
- `P`: Passed (forwarded/redirected)
- `R`: Replied
- `S`: Seen
- `T`: Trashed
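Because the flags are just letters after `:2,`, reading and changing them is string handling plus a rename. Here is a small sketch in Python; `FLAG_NAMES`, `parse_maildir_flags`, and `with_flag` are names invented for this example, and the code only computes the new filename rather than renaming anything.

```python
FLAG_NAMES = {
    "D": "draft",
    "F": "flagged",
    "P": "passed",
    "R": "replied",
    "S": "seen",
    "T": "trashed",
}


def parse_maildir_flags(filename):
    """Return the set of flag names encoded after ':2,' in a Maildir filename."""
    _, separator, info = filename.partition(":2,")
    if not separator:
        return set()  # no info suffix, as for a fresh file in new/
    return {FLAG_NAMES[letter] for letter in info if letter in FLAG_NAMES}


def with_flag(filename, letter):
    """Return the filename with one more flag, keeping letters in ASCII order."""
    base, _, info = filename.partition(":2,")
    letters = sorted(set(info) | {letter})
    return base + ":2," + "".join(letters)
```

Marking a message as replied would then be an `os.rename()` from the old name to `with_flag(old_name, "R")` inside `cur/`.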
This design solves the locking problem at the cost of inode pressure on the filesystem. A user with 100,000 messages has 100,000 files in their Maildir, which made some old filesystems sad. ext4, XFS, and ZFS handle it fine in 2026.
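The reason no lock is needed is the write-to-`tmp/`, rename-to-`new/` sequence. The following is a rough Python sketch of that sequence, assuming the three subdirectories already exist on one local filesystem. `deliver_to_maildir` is a hypothetical helper, and a production delivery agent handles more edge cases (hostnames containing odd characters, several deliveries per microsecond) than this does.

```python
import os
import socket
import time


def deliver_to_maildir(maildir, raw_message):
    """Write raw_message (bytes) into maildir/new via tmp/ and return the final path."""
    now = time.time()
    # <unix-time>.M<microseconds>P<pid>.<hostname>
    unique = "%d.M%dP%d.%s" % (
        int(now), int((now % 1) * 1_000_000), os.getpid(), socket.gethostname()
    )
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # O_EXCL makes the open fail if the same name already exists in tmp/.
    fd = os.open(tmp_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(raw_message)
        handle.flush()
        os.fsync(handle.fileno())  # get the bytes on disk before publishing

    # rename() within one filesystem is atomic: readers of new/ see the
    # whole file or nothing.
    os.rename(tmp_path, new_path)
    return new_path
```

A reader that only ever lists `new/` and `cur/` never has to ask whether a file is finished, which is the whole trick.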
mdbox: the compromise
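Dovecot's `mdbox` format is the middle ground: several messages are packed into each storage file, with Dovecot's own index files alongside. A typical layout:
~/mdbox/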
├── mailboxes/
│ └── INBOX/
│ ├── dbox-Mails/
│ │ ├── m.1
│ │ ├── m.2
│ │ └── ...
│ ├── dovecot.index
│ ├── dovecot.index.cache
│ └── dovecot.index.log
Each `m.N` file is a bundle containing multiple messages.
The advantages:
- Massively fewer inodes (better for filesystems with millions of messages)
- Faster sequential reads (whole bundles fit in disk readahead)
- Better compression (bundles can be transparently compressed)
The cost:
- Corruption of one bundle file affects every message in it
- Backup is harder — restoring one message means understanding the bundle format
- It's Dovecot-specific — moving to another IMAP server requires conversion
mdbox is a sensible default for high-volume installations. For a small self-hosted setup, Maildir is simpler and the performance difference doesn't matter.
Database storage
Storing mail in a database (PostgreSQL, MySQL) is a thing some people do. Dovecot supports it through the `dict` interface. The reasons not to:
- Mail is large, sequential, write-once, read-rarely data, which is exactly the workload row stores are bad at
- Database engines weren't designed for hundreds of GB of mostly-unchanging blobs per user
- Backup of a multi-TB mail database is significantly harder than backup of a multi-TB filesystem
- Disaster recovery: if your database dies, every user's mail is offline. With Maildir, individual user mailboxes are independent files that can be restored one at a time
I've seen people put mail in databases. I've seen most of them migrate back. Use the filesystem.
Encryption at rest
Three options, in increasing order of security and complexity:
Full-disk encryption. LUKS on Linux. Encrypts the whole volume. Defends against an attacker who steals the physical drive, but not against any attacker who can `cat` files on the running system.
Filesystem-level encryption. ZFS native encryption, ext4 fscrypt, or eCryptfs. Per-user keys can be unlocked at login. Defends against an attacker with read access to the disk via another user's account, as long as the target user isn't logged in. Useful in shared-tenant scenarios.
Per-message encryption (Dovecot's `mail_crypt` plugin). Messages are encrypted before they are written, so file-level read access yields only encrypted blobs. `mail_crypt` comes with costs:
- Server-side search becomes impossible without decrypting (defeating Dovecot's index cache)
- Spam filtering at delivery time is complicated (most filters need the plaintext)
- Key management is now a thing you have to do — losing a user's private key means losing their mail
For most self-hosted setups, full-disk encryption + careful access control to the mail directory is enough. For threat models where "the server itself might be compromised" is realistic, `mail_crypt` is worth the extra complexity.
A quick `mail_crypt` configuration:
mail_plugins = $mail_plugins mail_crypt
plugin {
  # Elliptic curve used when generating keypairs
  mail_crypt_curve = secp521r1
  # Encrypt newly saved mail using the version 2 format
  mail_crypt_save_version = 2
  # Require the user's private key to be stored password-encrypted
  mail_crypt_require_encrypted_user_key = yes
}
Each user gets a keypair generated on first login (encrypted with the user's password, so the password is required to decrypt). Read the Dovecot docs before deploying: there are several modes (per-user vs per-folder vs global keys) with different tradeoffs.
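Whichever encryption option you pick, the "careful access control" half is cheap to check. Below is a minimal Python sketch that walks a mail directory and reports anything readable by group or others. `find_exposed_paths` is a name made up for this example, and whether group read is acceptable depends on how your delivery user and IMAP server are set up, so treat the output as a list of things to look at rather than a verdict.

```python
import os
import stat


def find_exposed_paths(mail_root):
    """Return (path, octal mode) for entries readable by group or others."""
    exposed_bits = stat.S_IRGRP | stat.S_IROTH
    candidates = [mail_root]
    for directory, subdirs, files in os.walk(mail_root):
        candidates.extend(os.path.join(directory, name) for name in subdirs + files)

    exposed = []
    for path in candidates:
        mode = os.lstat(path).st_mode
        if stat.S_ISLNK(mode):
            continue  # permission bits on a symlink itself are not meaningful
        if mode & exposed_bits:
            exposed.append((path, oct(stat.S_IMODE(mode))))
    return exposed
```

An empty list for something like `find_exposed_paths("/home/alice/Maildir")` means only the owner (and root) can read the files through normal permission checks.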
The forensic angle
Imagine an attacker who got file-level read access to your mail directory (compromised webmail server, stolen backup tape, lost-and-found laptop). What can they read?
| Storage | Attacker reads message contents? | Attacker reads metadata? |
|---------|----------------------------------|--------------------------|
| Maildir, no encryption | Yes, all of it | Yes (filenames have flags) |
| Maildir + LUKS, server off | No | No |
| Maildir + LUKS, server on | Yes (running system has the key) | Yes |
| Maildir + mail_crypt | No (encrypted blobs) | Yes (filenames, sizes) |
| mdbox + mail_crypt | No (encrypted) | Partial (index file leaks subjects) |
Two things stand out:
Full-disk encryption only protects you against offline attacks. The most common compromise isn't a stolen drive — it's a compromised account or service with read access on the running system. LUKS does nothing for that.
Index files leak metadata. Dovecot's index cache can store subject lines and other header fragments in plaintext, separately from the messages themselves. If you turn on `mail_crypt`, check what your index files still expose.
Backup considerations by format
- mbox: back up the file. Easy to copy. Hard to back up while the server is running (locking against the live server). Snapshot the filesystem (LVM, ZFS) before backup.
- Maildir: back up the directory. `rsync` is your friend. Each file is independent, so partial backups recover gracefully. Watch out for filesystems with bad small-file performance on the backup side.
- mdbox: back up the bundles and index. Consistency between bundle files and index files matters; snapshot the filesystem first if the server is running. Don't run incremental rsync against mdbox without thinking: bundles are rewritten on compaction and rsync may copy the entire file each time.
- Database: use the database's native backup tool (`pg_dump`, `mysqldump`). Plan for the size: multi-TB dumps take hours.
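For Maildir, the file-per-message layout also makes a restore test easy to script. The sketch below compares a restored Maildir against the live one by content hash. It is an illustration under a few assumptions: the helper names are invented, flags are ignored on purpose (they change as users read mail), and only the top-level `new/` and `cur/` are checked.

```python
import hashlib
import os


def maildir_digests(maildir):
    """Map each message's base name (flags stripped) to a SHA-256 of its content."""
    digests = {}
    for subdir in ("new", "cur"):
        folder = os.path.join(maildir, subdir)
        for name in os.listdir(folder):
            base = name.split(":", 1)[0]  # drop the ':2,<flags>' suffix
            with open(os.path.join(folder, name), "rb") as handle:
                digests[base] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def compare_restore(live, restored):
    """Return (missing, corrupted) base names, judged against the live Maildir."""
    live_digests = maildir_digests(live)
    restored_digests = maildir_digests(restored)
    missing = sorted(set(live_digests) - set(restored_digests))
    corrupted = sorted(
        base
        for base, digest in restored_digests.items()
        if base in live_digests and live_digests[base] != digest
    )
    return missing, corrupted
```

Messages delivered after the backup ran will show up as missing, which is expected. Anything in the corrupted list is not.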
Gotcha: Maildir over NFS
Maildir was designed to work safely over NFS, provided the NFS server preserves rename semantics correctly and the client uses the right options. Dovecot has a `mail_nfs_storage = yes` setting for this case.
If you're running Dovecot on a single host with local storage, ignore this. If you're running a cluster of Dovecot front-ends sharing a Maildir over NFS, read the docs three times before deploying.
What I'd Tell My Past Self
The storage layer is where most of the operational pain in self-hosted mail comes from. It's where backups break, where index corruption strikes during power loss, where a careless `chmod` exposes mail to the wrong account.
If you're starting fresh, my recommendations:
- Use Maildir (simple, robust) or mdbox (faster at scale)
- Use full-disk encryption, always
- Add `mail_crypt` only if your threat model genuinely includes server compromise
- Test your backups by actually restoring from them, not just by checking that they ran
Post 6 takes us into the CVE history. Now that you know what's at stake on disk and on the wire, the bug patterns will make a lot more sense.