Message retention policies

Archiveopteryx can prevent mail from being deleted, and also force mail to be deleted. You can specify both kinds of policies by mailbox, by search, or a combination, and you can specify as many policies as you want, even several policies for a single mailbox.

Example

This complete example shows how to set up a reasonably complex policy.

We wish to archive all mail for 90-120 days. Mail in the /archive/… mailboxes has to be archived for five years. Unread nonspam in /users/… is kept for at least nine months.

First, we set up the general rules. All mail is retained for 90 days:

aox retain mail 90

And all mail (except what must still be retained) is deleted after 120:

aox delete mail 120

So far, Archiveopteryx allows users to delete mail after 90 days, and if a message hasn't been deleted yet after 120 days, aox vacuum deletes it.

Next, we add the specific rule for /archive/…:

aox retain mail 1827 /archive

(1827 days is five years, and we won't discuss February 29.) The most complex rule is that for /users/…:

aox retain mail 270 /users ( not flag seen ) and ( not flag junk )

(270 days is nine months, assuming all months have 30 days.) Some mail readers use a different flag name than junk, so this example may not apply unchanged everywhere. Where the flag name differs, this rule will retain even spam for nine months.
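The day counts used in these rules can be checked with a few lines of Python. This is only an arithmetic sketch; the helper name days_between and the sample dates are chosen for illustration and have nothing to do with aox itself. It shows that a five-year span covers 1826 or 1827 days depending on how many leap days fall inside it, so 1827 is the safe choice.

```python
from datetime import date


def days_between(start: date, end: date) -> int:
    """Number of days from start to end."""
    return (end - start).days


# A five-year span containing one February 29 (2012).
one_leap_day = days_between(date(2010, 1, 1), date(2015, 1, 1))

# A five-year span containing two (2012 and 2016).
two_leap_days = days_between(date(2012, 1, 1), date(2017, 1, 1))

# Nine months under the 30-days-per-month simplification.
nine_months = 9 * 30

print(one_leap_day, two_leap_days, nine_months)  # 1826 1827 270
```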

The syntax used for the search is the same as for aox show search. (The show search command is a handy way to experiment with search syntax.)

Here's a command to show the policies in effect:

aox show retention
Global:
  retain 90 days:
    Unconditional
  delete 120 days:
    Unconditional
/archive:
  retain 1827 days:
    Unconditional
/users:
  retain 270 days:
    All must be true:
      Not:
        Message has flag: "\\Seen"
      Not:
        Message has flag: "junk"

A retain policy always wins over delete policies, so the 120-day deletion policy does not always take effect at once. Unread nonspam in, for example, an inbox under /users is kept for at least 270 days, but will be deleted on the 271st day, since it is then older than 120 days and no longer has to be retained.
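The way these four policies interact can be sketched in Python. This is a model of the rules as described on this page, not Archiveopteryx's implementation (which works in SQL). The Message and Policy classes, the function names, the lowercase flag names, the prefix matching on mailbox names and the exact day boundaries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Message:
    mailbox: str                      # e.g. "/users/alice/inbox"
    age_days: int
    flags: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Policy:
    action: str                       # "retain" or "delete"
    days: int
    mailbox: Optional[str]            # None means the policy is global
    condition: Callable[[Message], bool]


def always(m: Message) -> bool:
    return True


def unread_nonspam(m: Message) -> bool:
    # ( not flag seen ) and ( not flag junk )
    return "seen" not in m.flags and "junk" not in m.flags


# The example policies from this page.
POLICIES: List[Policy] = [
    Policy("retain", 90, None, always),
    Policy("delete", 120, None, always),
    Policy("retain", 1827, "/archive", always),
    Policy("retain", 270, "/users", unread_nonspam),
]


def applies(p: Policy, m: Message) -> bool:
    # Assumption: a mailbox policy also covers mailboxes below it.
    in_scope = (
        p.mailbox is None
        or m.mailbox == p.mailbox
        or m.mailbox.startswith(p.mailbox + "/")
    )
    return in_scope and p.condition(m)


def must_retain(m: Message, policies: List[Policy]) -> bool:
    # Assumption: a message is retained through day N of an N-day policy.
    return any(
        p.action == "retain" and applies(p, m) and m.age_days <= p.days
        for p in policies
    )


def vacuum_deletes(m: Message, policies: List[Policy]) -> bool:
    # A delete policy only takes effect if no retain policy objects.
    wants_delete = any(
        p.action == "delete" and applies(p, m) and m.age_days > p.days
        for p in policies
    )
    return wants_delete and not must_retain(m, policies)


unread = Message("/users/alice/inbox", 200)
print(must_retain(unread, POLICIES), vacuum_deletes(unread, POLICIES))
# True False
```

With these assumptions, the same unread message at age 271 is no longer retained and is deleted, while a message that has been read is deleted once it is older than 120 days.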

If mail is deleted when it shouldn't be, you can pull it back from the grave for a while using aox undelete. Note that aox vacuum will probably delete the mail again the next time it is run (unless you change the retention policy).

Spam

Spam is a major problem for message retention. Our general advice is to block it before it enters the database.

If spam is permitted to enter the database, then it will be subject to the same retention policies as valuable mail. You can either weaken those policies so that users can delete messages if they first mark them as spam, or accept that your archive will be cluttered with spam.

The example above tries to find a middle ground: If a mail reader marks a message using the junk flag, or if the user reads it, then the user can delete it after 90 days, and aox vacuum deletes it after 120. If it remains unread and is not marked using that flag, then it's retained for 270 days and deleted after that. If any spam is stored in an /archive/… mailbox, then it's retained for five years, full stop.

Even this middle ground is far from satisfying. Having to archive spam pleases no one. Having a retention policy that says "…but you can delete anything if you declare it to be spam" isn't too impressive, either. Therefore, we think that if you wish to use rule-based retention, then you'll be well served by a spam filter that rejects spam instead of filing it into special folders.

How it works

The aox retain mail command writes a row into the retention_policies table. When a user tries to delete mail, Archiveopteryx consults that table and keeps any messages that must still be retained. The rest of the user's request is carried out.

To an IMAP user, it will seem as if some other user cleared the Deleted flag on the messages at the last instant. To a POP user, it will seem as if the client leaves (some) mail on the server.
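The effect of a user's delete request can be sketched as a simple partition. This is an illustrative model in Python, not the server's code; the function name handle_delete_request, the must_retain callback and the sample message ages are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def handle_delete_request(
    requested: Iterable[int],
    must_retain: Callable[[int], bool],
) -> Tuple[List[int], List[int]]:
    """Split a delete request into (deleted, kept) message ids.

    Messages that a retain policy still covers are kept; the rest of
    the request is carried out.
    """
    deleted: List[int] = []
    kept: List[int] = []
    for msg in requested:
        if must_retain(msg):
            kept.append(msg)
        else:
            deleted.append(msg)
    return deleted, kept


# Hypothetical message ages in days, with a single 90-day retain policy.
ages = {1: 10, 2: 95, 3: 400}
deleted, kept = handle_delete_request([1, 2, 3], lambda uid: ages[uid] <= 90)
print(deleted, kept)  # [2, 3] [1]
```

Message 1 is the one that, from the client's point of view, mysteriously stays in the mailbox.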

When you (or crontab) run aox vacuum, aox looks at both retain and delete policies, and deletes the appropriate messages. (All the policies are agglutinated into a single SQL query, so if your retention policies are complex, this can result in the mother of all SQL queries.)

The system administrator and DBA can override the policies in several ways.

aox delete user refuses to delete a user who still has mail. However, if you use the -f switch it will delete the user's mail, even mail which should be retained. aox undelete can recover the mail for a few days.

It is possible to delete from the mailbox_messages table using direct SQL queries, if you have permission. The Archiveopteryx server does not have permission to do that.

The system administrator can change the retention policies, delete mail, and then change the policies back.

Ordinary users cannot override the policies in any way.

Future development

At some point we want to allow deleting duplicated messages. For example, if a message is present in four mailboxes, we want to allow deleting it in three, but not in the last. This is mostly a cosmetic feature, since the message is stored only once, even when it is visible in several mailboxes.

We also want to extend aox undelete so that when you undelete automatically deleted mail, it explains which retention policy is to blame, and whether the message will be deleted again.

At present it isn't possible to delete retention policies. That seems like an omission.

Perhaps we should make aox show retention show the exact aox retain mail and aox delete mail invocations needed to create the current policies.

In case of questions, please write to info@aox.org.

About this page

Last modified: 2010-11-19
Location: aox.org/retention/