0xCC0FF

A companion blog to my GitHub account

Preserving IMAP keywords during e-mail migrations

Recently I had to migrate several e-mail accounts from one server to another. Some of the accounts were quite large; one had over 20,000 messages in it. I tried a few IMAP migration tools, but they all had some issue or another -- one would hang after a few thousand messages, while another one skipped messages here and there, seemingly at random.

Searching GitHub, I came across Imapcopy, a PHP-based IMAP migration script that didn't have a lot of extra bells and whistles. It appeared to just "do one thing and do it well" -- get all of the messages from one server to the other. In my testing, it worked well in almost every way; it was stable, fast, and didn't skip any messages. Unfortunately, however, it didn't preserve IMAP keywords.

IMAP keywords, also known as custom flags or tags, are user-defined labels that you can assign to messages. Messages can have more than one keyword associated with them, allowing for multi-dimensional sorting. For example, you could define "Personal", "Family", and "Friends" keywords, and assign them interchangeably to messages and sort accordingly. Additionally, many e-mail clients use some sort of visual indicator (e.g., color) to differentiate messages by keyword, allowing you to rapidly scan an inbox and see what types of messages it contains.

My primary e-mail client, Thunderbird, uses these keywords extensively, both for arbitrary user-defined tags and for core functionality. For example, forwarded messages contain the "$Forwarded" keyword, which Thunderbird uses to display a special icon next to the message. I use several custom keywords to organize messages in my inboxes, and I absolutely had to have these preserved with the messages.

So I looked at adding IMAP keyword preservation to Imapcopy. It had been awhile since I had worked with PHP, and I quickly discovered the built-in PHP IMAP e-mail functions were a bit lacking -- there was no function to directly fetch message keywords. However, you can obtain the raw IMAP message headers with the imap_headers() function. Here is a sample IMAP header as returned by imap_headers():

  A    30)28-Jun-2018 Postmaster           {NonJunk $Forwarded} Registration renewal (3245 chars)

The keywords always start at position 44 in the header, and are enclosed in braces {}. From that we can manually parse the header and obtain the keywords.

From there, it was a simple matter of adding the parsed keywords to the migrated message's header. I tested everything thoroughly (the last thing we want is a corrupted mailbox!) and everything migrated over correctly. A merged pull request later, and Imapcopy now does everything I want it to do, quickly and easily creating a perfect (as near as I can tell) duplicate of the source message and its attributes on the destination server with minimal fuss.