The bodyparts table

bodyparts contains all the bodyparts currently or recently used in messages.

Archiveopteryx shares bodyparts among messages, so if a large attachment is sent to many people, it adds only one row to bodyparts. However, there may be many (small) rows in part_numbers.

The text and/or data columns contain the actual data in the bodypart.

text contains the bodypart text, if the bodypart contains text. This is stored as UTF-8. There is no quoted-printable or other encoding.

data contains the bodypart data exactly as received, if the bodypart is not a text part. This is not searchable.

Either text or data can be null, but not both. If data is null, then the bodypart contains text. If text is null, then the bodypart contains data.

In some cases, both text and data are present. That happens when the bodypart is data, but Archiveopteryx was able to extract some text for searching. In this case, searches will use the extracted text, but message retrieval uses the data.
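The rules above for text and data can be summarised in a few lines of Python. This is only an illustrative sketch, not Archiveopteryx code: the function names retrieval_payload and searchable_text are hypothetical, and None stands in for SQL null.

```python
from typing import Optional, Union


def retrieval_payload(text: Optional[str], data: Optional[bytes]) -> Union[str, bytes]:
    """Return what message retrieval would serve for one bodyparts row."""
    if text is None and data is None:
        raise ValueError("text and data cannot both be null")
    # If data is present, the bodypart is a data part, even when some
    # text was also extracted for searching.
    return data if data is not None else text


def searchable_text(text: Optional[str], data: Optional[bytes]) -> Optional[str]:
    """Return the text a search would look at, or None if there is none."""
    if text is None and data is None:
        raise ValueError("text and data cannot both be null")
    # Searches only ever use the text column; raw data is not searchable.
    return text
```

For a row where both columns are set, retrieval_payload returns the data and searchable_text returns the extracted text.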

The bytes column is an optimisation. It is not strictly necessary, but its presence makes some common queries faster. It contains the size of either text or data, whichever is used.

The hash column contains an MD5 hash of either text or data. It's used to tie a newly arrived message to an existing bodypart.

The hash column is indexed to speed up lookups, but the index is not unique, so Archiveopteryx can store two different bodyparts even if their MD5 hashes are the same. (In April 2009, an Archiveopteryx user experienced a random, unprovoked MD5 collision.)
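The following sketch shows one way a hash lookup can coexist with a non-unique index. It is not taken from Archiveopteryx. It rests on several assumptions: an in-memory SQLite database stands in for PostgreSQL, the index name bodyparts_hash_idx and the function store_text_bodypart are hypothetical, the hash is stored as a hex digest, bytes is taken to be the UTF-8 length, and only text parts are handled.

```python
import hashlib
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL table (an assumption,
# made so that the sketch runs anywhere).
db = sqlite3.connect(":memory:")
db.execute(
    "create table bodyparts ("
    " id integer primary key, bytes integer not null,"
    " hash text not null, text text, data blob)"
)
# The index is deliberately not unique (the index name is hypothetical).
db.execute("create index bodyparts_hash_idx on bodyparts (hash)")


def store_text_bodypart(text: str) -> int:
    """Return the id of a row holding this text, inserting one if needed."""
    encoded = text.encode("utf-8")
    digest = hashlib.md5(encoded).hexdigest()  # hex format is an assumption
    candidates = db.execute(
        "select id, text from bodyparts where hash = ?", (digest,)
    ).fetchall()
    for row_id, existing in candidates:
        # Compare the content as well: equal hashes do not prove that
        # two bodyparts are equal.
        if existing == text:
            return row_id
    cursor = db.execute(
        "insert into bodyparts (bytes, hash, text) values (?, ?, ?)",
        (len(encoded), digest, text),  # bytes as UTF-8 length is an assumption
    )
    return cursor.lastrowid
```

Storing the same text twice returns the same id and leaves a single row, while a different text with a colliding hash would simply get a row of its own.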

create table bodyparts ( -- Grant: select, insert
    id      integer default nextval('bodypart_ids') primary key,
    bytes   integer not null,
    hash    text not null,
    text    text,
    data    bytea
);

The bodyparts table was introduced in version 0.93.

Last modified: 2010-11-19