Class EmailParser.

The EmailParser class provides parsing helpers for RFC 822-like grammars. Strictly speaking, it is closer to a lexer than to a parser, though it is not quite a lexer either.

EmailParser provides a cursor and member functions that read many RFC 2822 productions at the cursor. Generally, each member function returns the production it read, or an empty string.
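The cursor convention can be illustrated with a small stand-alone sketch. MiniCursor, takeWhile(), isAsciiLetter() and isAsciiDigit() are hypothetical names invented for this example; they are not part of EmailParser, and the sketch only assumes that a failed read leaves the cursor where it was.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the cursor idea; this is not the real EmailParser.
struct MiniCursor {
    std::string input;
    std::size_t pos = 0;

    explicit MiniCursor(std::string s) : input(std::move(s)) {}

    // Reads the longest run of accepted characters at the cursor and steps
    // past it. Returns an empty string, leaving the cursor unmoved, if the
    // character at the cursor is not accepted.
    template <typename Pred>
    std::string takeWhile(Pred accept) {
        std::size_t start = pos;
        while (pos < input.size() && accept(input[pos]))
            ++pos;
        return input.substr(start, pos - start);
    }
};

inline bool isAsciiLetter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

inline bool isAsciiDigit(char c) {
    return c >= '0' && c <= '9';
}
```

With the input "abc123", a first takeWhile(isAsciiLetter) yields "abc" and a second one yields an empty string without moving the cursor.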

EmailParser::EmailParser( const EString & s )

Creates a new RFC 822 parser object to parse s.

Reimplements AbnfParser::AbnfParser().

EString EmailParser::atom()

Returns a single atom, stepping past white space and comments before and after it.

int EmailParser::cfws()

Returns the number of CFWS (comment and folding white space) characters at the cursor. The cursor is not moved.

EString EmailParser::comment()

Moves pos() past all comments and surrounding white space, and returns the contents of the last comment.

Returns a null string if there was no comment.
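A minimal sketch of this kind of comment skipping follows. skipComments() is a hypothetical free function, not the class's implementation; it assumes that comments nest, that a backslash quotes the next character, and that an unterminated comment stops the scan. std::string cannot distinguish a null string from an empty one, so the sketch returns an empty string when no comment is found.

```cpp
#include <cstddef>
#include <string>

// Steps pos past white space and comments. Returns the text between the
// outermost parentheses of the last comment, or "" if there was none.
std::string skipComments(const std::string& s, std::size_t& pos) {
    std::string last;
    while (pos < s.size()) {
        char c = s[pos];
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            ++pos;
        } else if (c == '(') {
            std::size_t depth = 0;
            std::size_t j = pos;
            for (; j < s.size(); ++j) {
                if (s[j] == '\\')
                    ++j;                        // quoted-pair: skip next char
                else if (s[j] == '(')
                    ++depth;                    // comments may nest
                else if (s[j] == ')' && --depth == 0)
                    break;                      // outermost comment closed
            }
            if (j >= s.size())
                break;                          // unterminated: stop here
            last = s.substr(pos + 1, j - pos - 1);
            pos = j + 1;
        } else {
            break;                              // something other than CFWS
        }
    }
    return last;
}
```

For the input " (first) (second (nested)) atom", the sketch returns "second (nested)" and leaves pos at the "a" of "atom".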

static UString EmailParser::de2047( const EString & s )

Performs RFC 2047 decoding of s without checking which characters the encoded-text in s contains.

Depending on circumstances, the encoded-text may contain different sets of characters. Moreover, not every RFC 2047 encoder obeys the rules. This function checks nothing; it just decodes.
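To show what "just decodes" can mean in practice, here is a lenient sketch that decodes a single encoded-word of the form =?charset?encoding?encoded-text?=. DecodedWord and decodeEncodedWord() are hypothetical names for this example and are not the class's API. The sketch stops at raw bytes plus the declared charset; conversion to Unicode, which de2047() performs, is deliberately left out.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// The declared charset and the decoded raw bytes of one encoded-word.
struct DecodedWord {
    std::string charset;
    std::string bytes;
};

static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// "Q" encoding: '_' means space, "=XX" is a hex byte, all else is literal.
static std::string decodeQ(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '_') {
            out += ' ';
        } else if (text[i] == '=' && i + 2 < text.size() &&
                   hexValue(text[i + 1]) >= 0 && hexValue(text[i + 2]) >= 0) {
            out += static_cast<char>(hexValue(text[i + 1]) * 16 +
                                     hexValue(text[i + 2]));
            i += 2;
        } else {
            out += text[i];             // lenient: copy unexpected input
        }
    }
    return out;
}

static int base64Value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// "B" encoding: base64. Padding and unexpected characters are skipped.
static std::string decodeB(const std::string& text) {
    std::string out;
    unsigned int buffer = 0;
    int bits = 0;
    for (char c : text) {
        int v = base64Value(c);
        if (v < 0) continue;
        buffer = (buffer << 6) | static_cast<unsigned int>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out += static_cast<char>((buffer >> bits) & 0xFF);
        }
    }
    return out;
}

// Decodes one encoded-word, or returns nullopt if the framing is wrong.
std::optional<DecodedWord> decodeEncodedWord(const std::string& w) {
    if (w.size() < 8 || w.compare(0, 2, "=?") != 0 ||
        w.compare(w.size() - 2, 2, "?=") != 0)
        return std::nullopt;
    std::size_t q1 = w.find('?', 2);    // end of the charset
    if (q1 == std::string::npos || q1 + 2 >= w.size() || w[q1 + 2] != '?')
        return std::nullopt;
    std::size_t textStart = q1 + 3;
    std::size_t textEnd = w.size() - 2;
    if (textStart > textEnd)
        return std::nullopt;
    std::string text = w.substr(textStart, textEnd - textStart);
    int enc = std::toupper(static_cast<unsigned char>(w[q1 + 1]));
    if (enc == 'Q') return DecodedWord{w.substr(2, q1 - 2), decodeQ(text)};
    if (enc == 'B') return DecodedWord{w.substr(2, q1 - 2), decodeB(text)};
    return std::nullopt;
}
```

For example, decodeEncodedWord("=?utf-8?B?SGVsbG8=?=") yields the charset "utf-8" and the bytes "Hello".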

EString EmailParser::domain()

Returns a single domain and steps past it.

This isn't correct at the moment, but I think it will eventually be...

Note that our definition of domain differs from the RFC 822 one. We accept only three forms: something that may be a DNS A record, something that may be an IPv4 address in '[]', and something that may be an IPv6 address, again in '[]'. Examples: 'lupinella.troll.no', '[213.203.59.59]' and '[IPv6:::ffff:213.203.59.59]'.
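The three accepted forms can be told apart with a loose check along the following lines. DomainForm and classifyDomain() are hypothetical names for this example; the character tests are assumptions chosen to match the "something that may be" wording, not the class's actual rules, and they do not validate the addresses.

```cpp
#include <cctype>
#include <string>

enum class DomainForm { DnsName, Ipv4Literal, Ipv6Literal, Unknown };

// True if s is non-empty and every character passes ok().
static bool allOf(const std::string& s, bool (*ok)(unsigned char)) {
    if (s.empty())
        return false;
    for (unsigned char c : s)
        if (!ok(c))
            return false;
    return true;
}

// Loosely classifies d as one of the three forms described above.
DomainForm classifyDomain(const std::string& d) {
    if (d.size() >= 2 && d.front() == '[' && d.back() == ']') {
        std::string inner = d.substr(1, d.size() - 2);
        if (inner.compare(0, 5, "IPv6:") == 0) {
            // Hex digits, colons, and dots (for an embedded IPv4 part).
            bool ok = allOf(inner.substr(5), [](unsigned char c) {
                return std::isxdigit(c) != 0 || c == ':' || c == '.';
            });
            return ok ? DomainForm::Ipv6Literal : DomainForm::Unknown;
        }
        bool ok = allOf(inner, [](unsigned char c) {
            return std::isdigit(c) != 0 || c == '.';
        });
        return ok ? DomainForm::Ipv4Literal : DomainForm::Unknown;
    }
    // Letters, digits, hyphens, and dots may form a DNS name.
    bool ok = allOf(d, [](unsigned char c) {
        return std::isalnum(c) != 0 || c == '-' || c == '.';
    });
    return ok ? DomainForm::DnsName : DomainForm::Unknown;
}
```

The three examples given above classify as DnsName, Ipv4Literal and Ipv6Literal respectively.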

EString EmailParser::dotAtom()

Returns a dot-atom, stepping past all relevant whitespace and comments.

UString EmailParser::encodedWord( EncodedText type )

Steps past a MIME encoded-word (as defined in RFC 2047) and returns its decoded Unicode representation, or an empty string if the cursor does not point to a valid encoded-word. The caller is responsible for checking that the encoded-word is separated from neighbouring tokens by whitespace.

The set of characters permitted in the encoded-text depends on type, which may be Text (the default), Comment, or Phrase.

UString EmailParser::encodedWords( EncodedText t )

Steps past a sequence of adjacent encoded-words with whitespace in between and returns the decoded representation. t is passed through to encodedWord().

Leading and trailing whitespace is trimmed; internal whitespace is kept as is.

bool EmailParser::isAtext( char c ) const

Returns true if c belongs to the RFC 2822 'atext' production, and false in all other circumstances.
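For reference, RFC 2822 defines atext as the ASCII letters and digits plus the characters !#$%&'*+-/=?^_`{|}~. A stand-alone sketch of that test follows; isAtextSketch() is a hypothetical name, and the sketch does not model any effect that setMime() may have on the real member function.

```cpp
#include <cstring>

// True if c is in the RFC 2822 'atext' set: ALPHA, DIGIT, or one of
// the listed punctuation characters.
bool isAtextSketch(char c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9'))
        return true;
    // strchr() would match the terminating NUL, so exclude it first.
    return c != '\0' && std::strchr("!#$%&'*+-/=?^_`{|}~", c) != nullptr;
}
```

Note that '.' and '@' are not atext, which is why an atom stops at either of them.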

bool EmailParser::isMime() const

Returns true if this parser has been instructed to parse MIME strings by calling setMime(), and false otherwise.

EString EmailParser::lastComment() const

Returns the last comment seen so far by this parser, or a null string if none has been seen yet.

EString EmailParser::mimeToken()

Returns a single MIME token (as defined in RFC 2045 section 5), which is an atom minus [/?=] plus [.].
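RFC 2045 defines a token as one or more US-ASCII characters other than space, control characters and the tspecials ()<>@,;:\"/[]?=. The sketch below reads such a token from a string; isMimeTokenChar() and readMimeToken() are hypothetical names, and the sketch skips no white space or comments, which the real member function may do.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// True if c may appear in an RFC 2045 token: printable US-ASCII that is
// neither a space nor one of the tspecials.
bool isMimeTokenChar(char c) {
    unsigned char u = static_cast<unsigned char>(c);
    if (u <= 32 || u >= 127)
        return false;                   // controls, space, DEL, non-ASCII
    return std::strchr("()<>@,;:\\\"/[]?=", c) == nullptr;
}

// Reads the longest token starting at pos and steps pos past it.
// Returns an empty string, leaving pos unmoved, if no token starts there.
std::string readMimeToken(const std::string& s, std::size_t& pos) {
    if (pos >= s.size())
        return std::string();
    std::size_t start = pos;
    while (pos < s.size() && isMimeTokenChar(s[pos]))
        ++pos;
    return s.substr(start, pos - start);
}
```

Reading from "text/plain" at position 0 yields "text" and stops at the '/', which is a tspecial. The '.' is accepted and '/', '?' and '=' are rejected, matching the "atom minus [/?=] plus [.]" description.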

EString EmailParser::mimeValue()

Returns a single MIME value (as defined in RFC 2045 section 5), which is an atom minus [/?=] plus [.] (i.e., a MIME token) or a quoted string.

uint EmailParser::number()

Skips past whitespace, parses a decimal number and returns that number.

Reimplements AbnfParser::number().

UString EmailParser::phrase()

Steps past an RFC 822 phrase (a series of word/encoded-words) at the cursor and returns its Unicode representation, which may be an empty string.

void EmailParser::setMime( bool m )

Sets this EmailParser object to parse MIME strings if m is true, and RFC 2822 strings if m is false. The only difference is the definition of specials.

EString EmailParser::string()

Steps past an atom or a quoted-text, and returns that text.

UString EmailParser::text()

Steps past the longest "*text" (a series of text/encoded-words) at the cursor and returns its Unicode representation, which may be an empty string.

bool EmailParser::valid()

Returns true if this parser has not yet encountered any errors during parsing, and false otherwise.

UString EmailParser::whitespace()

Moves pos() to the first non-whitespace character after the current point. If pos() already points to a non-whitespace character, it is not moved.
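The cursor movement described here amounts to the following sketch. skipWhitespace() is a hypothetical free function that returns the new position rather than a string, and it assumes that white space means space, tab, CR and LF.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first non-whitespace character at or after pos,
// or s.size() if only white space remains.
std::size_t skipWhitespace(const std::string& s, std::size_t pos) {
    while (pos < s.size() && (s[pos] == ' ' || s[pos] == '\t' ||
                              s[pos] == '\r' || s[pos] == '\n'))
        ++pos;
    return pos;
}
```

If the character at pos is already non-whitespace, the returned position equals pos.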