MBOX File Format Basics

MBOX (short for mailbox) is a generic term for a family of file formats used to store e-mail messages. The most common of them is mbox, a single file that may hold anywhere from zero to many e-mail messages. Understanding the core concepts will help you perform mbox file forensics in greater depth.

  • An mbox file contains a linear sequence of e-mail messages.
  • Every new message starts with a "From " line that identifies the sender and the date and time (as recorded by the recipient's mail server).
  • Each message is terminated by an empty line.
  • After the "From " line, the message follows the RFC 5322 format.
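
The structure listed above can be illustrated with a minimal Python sketch. It assumes the simplest possible rule, namely that any line beginning with "From " starts a new message; the function names `split_mbox` and `from_line_fields` and the sample data are hypothetical and not taken from any particular tool.

```python
def split_mbox(data: bytes) -> list[bytes]:
    """Split raw mbox bytes into messages at lines that begin with b'From '."""
    messages = []
    current = []
    for line in data.splitlines(keepends=True):
        # A new "From " line closes the message collected so far.
        if line.startswith(b"From ") and current:
            messages.append(b"".join(current))
            current = []
        current.append(line)
    if current:
        messages.append(b"".join(current))
    return messages


def from_line_fields(message: bytes) -> tuple[bytes, bytes]:
    """Return (sender, timestamp) taken from the leading "From " line."""
    first_line = message.splitlines()[0]
    _, sender, timestamp = first_line.split(b" ", 2)
    return sender, timestamp


# Hypothetical two-message mailbox used only for illustration.
sample = (
    b"From alice@example.com Mon Jan  1 10:00:00 2024\n"
    b"Subject: first\n"
    b"\n"
    b"Hello\n"
    b"\n"
    b"From bob@example.com Tue Jan  2 11:30:00 2024\n"
    b"Subject: second\n"
    b"\n"
    b"World\n"
    b"\n"
)

parts = split_mbox(sample)
sender, timestamp = from_line_fields(parts[1])
```

A real examiner tool would also have to cope with "From " lines that appear inside message bodies, which is where the mbox variations differ.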

mbox file format
Fig 1. MBOX format example

Mailbox formats typically take one of two approaches to saving e-mail messages.

  • Single-file format: All the e-mail messages are stored in one file that makes up the mailbox. MMDF and mbox with its variations (mbx, mboxrd, mboxcl, and mboxcl2) hold messages this way.
  • Directory format: The mail client creates a separate file for each message within a folder/directory. MH, Maildir, and the NNTP (Network News Transfer Protocol) format use this approach to save messages.

MBOX Magic number(s):

MBOX Message Start Pattern: Every mbox database file has a distinct pattern that you can look for. If you open an mbox database in a hex viewer, you will see that the magic number of mbox is 46 72 6f 6d 20, which translates to "From " (the word From followed by a space character).

mbox forensics
Fig 2. MBOX starting pattern
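
As a rough sketch, the start pattern can be checked programmatically by comparing the first five bytes of a file with the magic number. The helper names `looks_like_mbox` and `file_looks_like_mbox` are hypothetical, and a match is only a hint: any file that happens to begin with "From " would pass.

```python
# 46 72 6f 6d 20 is the ASCII encoding of "From " (with the trailing space).
MBOX_MAGIC = bytes.fromhex("46726f6d20")


def looks_like_mbox(header: bytes) -> bool:
    """Return True if the given leading bytes start with the mbox magic number."""
    return header.startswith(MBOX_MAGIC)


def file_looks_like_mbox(path: str) -> bool:
    """Read only the first five bytes of a file and test them."""
    with open(path, "rb") as handle:
        return looks_like_mbox(handle.read(len(MBOX_MAGIC)))


is_mbox = looks_like_mbox(b"From alice@example.com Mon Jan  1 10:00:00 2024\n")
is_other = looks_like_mbox(b"from alice@example.com\n")  # lowercase does not match
```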

MBOX Ending Pattern: The message or file ending pattern in the .mbox format is typically the byte sequence 0d 0a 0d 0a 0d 0a (three consecutive CR LF pairs). This hex sequence typically marks the end of a message or the end of the file.

mbox message file
Fig 3. Ending pattern of MBOX message/File
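
A minimal sketch for locating this ending pattern in raw bytes is shown below. It assumes the file uses CR LF line endings, as in the figure; mbox files written on Unix systems commonly use a bare LF (0a) instead, in which case the pattern will not appear. The function name `find_end_patterns` and the sample data are hypothetical.

```python
# Three consecutive CR LF pairs: 0d 0a 0d 0a 0d 0a.
END_PATTERN = bytes.fromhex("0d0a0d0a0d0a")


def find_end_patterns(data: bytes) -> list[int]:
    """Return the byte offsets of every non-overlapping ending pattern."""
    offsets = []
    pos = data.find(END_PATTERN)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(END_PATTERN, pos + len(END_PATTERN))
    return offsets


# Hypothetical single-message mailbox with CR LF line endings.
sample = (
    b"From alice@example.com Mon Jan  1 10:00:00 2024\r\n"
    b"Subject: demo\r\n"
    b"\r\n"
    b"body text\r\n"
    b"\r\n"
    b"\r\n"
)

offsets = find_end_patterns(sample)
```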

MBOX Variations and Core Differences

There are four known variations of the .mbox file format, and they are mutually incompatible. The variations are:

  • mboxo
  • mboxrd
  • mboxcl
  • mboxcl2
mboxo message file

Mboxo locates the start of each message by searching for the "From " line that precedes it. If it encounters a line beginning with "From " in either the body or the header part of a message, it modifies the message by prepending a ">" sign to that line; only then is the message stored in the mbox database.
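
The quoting rule described above can be sketched in a few lines of Python. This is an illustrative approximation rather than the code of any mail client; the function name `mboxo_quote` is hypothetical, and the sketch assumes LF line endings.

```python
def mboxo_quote(body: str) -> str:
    """Prepend '>' to every line that begins with 'From ' (mboxo-style quoting)."""
    quoted_lines = []
    for line in body.split("\n"):
        if line.startswith("From "):
            line = ">" + line
        quoted_lines.append(line)
    return "\n".join(quoted_lines)


# A line that starts with "From " and a line that already starts with ">From ".
body = "Hi\nFrom here on it gets tricky\n>From an earlier reply\n"
stored = mboxo_quote(body)
```

After quoting, both lines start with ">From ", so a reader can no longer tell which ">" was added by the writer. This ambiguity is why mboxo quoting is not reversible.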

mboxrd (named after Rahul Dhesi) uses "reversible From quoting": it also prepends ">" to lines that already begin with ">From ", so it can differentiate a ">From " that was part of the original message from a ">From " produced by mboxo-style quoting.
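
A minimal sketch of reversible From quoting, assuming LF line endings and using regular expressions, might look like this. The names `mboxrd_quote` and `mboxrd_unquote` are hypothetical helpers written for illustration only.

```python
import re

# Matches "From " at the start of a line, preceded by zero or more ">".
_QUOTE = re.compile(r"^(>*From )", re.MULTILINE)
# Matches the same lines when at least one ">" is present.
_UNQUOTE = re.compile(r"^>(>*From )", re.MULTILINE)


def mboxrd_quote(body: str) -> str:
    """Add one '>' to every line matching ^>*From (writer side)."""
    return _QUOTE.sub(r">\1", body)


def mboxrd_unquote(body: str) -> str:
    """Remove one '>' from every line matching ^>+From (reader side)."""
    return _UNQUOTE.sub(r"\1", body)


original = "From a\n>From b\n>>From c\nfrom d\n"
quoted = mboxrd_quote(original)
restored = mboxrd_unquote(quoted)
```

Because every matching line gains exactly one ">" on writing and loses exactly one on reading, the round trip returns the original body unchanged.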

Maildir for network e-mail storage systems

If you are dealing with network messages, you might come across Maildir files. Maildir serves as an alternative to the mbox format for network e-mail storage.

MBOX Header Analysis
MBOX file