Class EmailParser.

Inherits AbnfParser

The EmailParser class provides parsing help for RFC 822-like grammars. Properly speaking, it is more like a lexer than a parser, but it is not quite a lexer either.

EmailParser provides a cursor, and member functions to read many RFC 2822 productions at the cursor. Generally, each member returns the production read or an empty string.

EmailParser::EmailParser( const EString & s )

Creates a new RFC 822 parser object to parse s.

Reimplements AbnfParser::AbnfParser().

EString EmailParser::atom()

Returns a single atom, stepping past white space and comments before and after it.

int EmailParser::cfws()

Returns the number of CFWS characters at the cursor, but does nothing else.

EString EmailParser::comment()

Moves pos() past all comments and surrounding white space, and returns the contents of the last comment.

Returns a null string if there was no comment.
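The comment syntax that this function steps past allows nesting and backslash-escaped characters (quoted-pairs). The following is a minimal, standalone sketch of that skipping logic, not the EmailParser implementation: skipComment() is a hypothetical helper that works on a std::string and an index rather than on the parser's cursor, and it does not collect the comment's contents.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of EmailParser. If s[pos] starts an
// RFC 2822 comment, returns the index just past its closing ')'.
// Returns pos unchanged if there is no comment at pos or if the
// comment is never closed.
std::size_t skipComment(const std::string &s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != '(')
        return pos;
    std::size_t i = pos;
    int depth = 0;
    while (i < s.size()) {
        char c = s[i];
        if (c == '\\' && i + 1 < s.size()) {
            i += 2; // quoted-pair: skip the backslash and the escaped character
            continue;
        }
        if (c == '(')
            ++depth; // comments may nest
        else if (c == ')' && --depth == 0)
            return i + 1; // outermost comment closed
        ++i;
    }
    return pos; // unterminated comment
}
```

For the input "(a (b) c) x", calling skipComment() at index 0 yields 9, the index of the space that follows the outer comment.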

static UString EmailParser::de2047( const EString & s )

Performs RFC 2047 decoding of s, ignoring which characters the encoded-text in s contains.

Depending on circumstances, the encoded-text may contain different sets of characters. Moreover, not every 2047 encoder obeys the rules. This function checks nothing; it just decodes.
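To illustrate what "just decodes" can look like, here is a rough sketch of a permissive decoder for the "Q" encoding of RFC 2047. It is an illustration under assumptions, not the de2047() implementation: decodeQ() and hexValue() are hypothetical names, the input is only the encoded-text part of an encoded-word, and the result is a string of raw bytes with no charset conversion to unicode.

```cpp
#include <cstddef>
#include <string>

// Value of a hexadecimal digit, or -1 if c is not one.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper, not part of EmailParser. Decodes Q-encoded text:
// '_' becomes a space and "=XX" becomes the byte with hex value XX.
// Anything else, including a malformed '=' escape, is copied verbatim.
std::string decodeQ(const std::string &text) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char c = text[i];
        if (c == '_') {
            out += ' ';
        } else if (c == '=' && i + 2 < text.size() &&
                   hexValue(text[i + 1]) >= 0 && hexValue(text[i + 2]) >= 0) {
            out += static_cast<char>(hexValue(text[i + 1]) * 16 +
                                     hexValue(text[i + 2]));
            i += 2; // consume the two hex digits
        } else {
            out += c; // permissive: no validation of the character set
        }
    }
    return out;
}
```

The sketch never rejects its input, which mirrors the permissive behaviour described above; a stricter decoder would instead report characters that the relevant context does not allow.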

EString EmailParser::domain()

Returns a single domain and steps past it.

This isn't correct at the moment, but I think it will eventually be...

Note that our definition of domain differs from the RFC 822 one. We only accept three forms: something that may be a DNS A record, something that may be an IPv4 address in '[]', and something that may be an IPv6 address, again in '[]'. Examples: 'lupinella.troll.no', '[213.203.59.59]' and '[IPv6:::ffff:213.203.59.59]'.
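The three accepted forms can be told apart by their outer shape alone. The sketch below shows one way to do that; it is an assumption-laden illustration, not the domain() implementation. DomainForm and classifyDomain() are hypothetical names, the "IPv6:" prefix test is taken from the example above, and the function checks only the rough character set, not whether the name or address is actually valid.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical classification of the three forms described above.
enum class DomainForm { DnsName, Ipv4Literal, Ipv6Literal, Unknown };

// Hypothetical helper, not part of EmailParser. Classifies d by shape only.
DomainForm classifyDomain(const std::string &d) {
    auto dnsChar = [](unsigned char c) {
        return std::isalnum(c) || c == '-' || c == '.';
    };
    auto v4Char = [](unsigned char c) { return std::isdigit(c) || c == '.'; };

    if (d.empty())
        return DomainForm::Unknown;
    if (d.front() != '[') // no brackets: may be a DNS name
        return std::all_of(d.begin(), d.end(), dnsChar) ? DomainForm::DnsName
                                                        : DomainForm::Unknown;
    if (d.size() < 3 || d.back() != ']')
        return DomainForm::Unknown; // unbalanced or empty brackets
    const std::string inner = d.substr(1, d.size() - 2);
    if (inner.compare(0, 5, "IPv6:") == 0)
        return DomainForm::Ipv6Literal; // address part is not inspected here
    return std::all_of(inner.begin(), inner.end(), v4Char)
               ? DomainForm::Ipv4Literal
               : DomainForm::Unknown;
}
```

With the examples from the text, 'lupinella.troll.no' is classified as DnsName, '[213.203.59.59]' as Ipv4Literal and '[IPv6:::ffff:213.203.59.59]' as Ipv6Literal.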

EString EmailParser::dotAtom()

Returns a dot-atom, stepping past all relevant whitespace and comments.

UString EmailParser::encodedWord( EncodedText type )

Steps past a MIME encoded-word (as defined in RFC 2047) and returns its decoded unicode representation, or an empty string if the cursor does not point to a valid encoded-word. The caller is responsible for checking that the encoded-word is separated from neighbouring tokens by whitespace.

The characters permitted in the encoded-text are adjusted based on type, which may be Text (by default), Comment, or Phrase.

UString EmailParser::encodedWords( EncodedText t )

Steps past a sequence of adjacent encoded-words with whitespace in between and returns the decoded representation. t is passed through to encodedWord().

Leading and trailing whitespace is trimmed; internal whitespace is kept as is.

bool EmailParser::isAtext( char c ) const

Returns true if c belongs to the RFC 2822 'atext' production, and false in all other circumstances.
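For reference, RFC 2822 defines atext as the ASCII letters, the digits, and the characters ! # $ % & ' * + - / = ? ^ _ ` { | } ~. A standalone sketch of such a test might look like the following; isAtextSketch() is a hypothetical free function written for illustration, and it does not account for the MIME mode selected by setMime().

```cpp
#include <cstring>

// Hypothetical helper, not part of EmailParser. Returns true if c is in
// the RFC 2822 'atext' set: ALPHA, DIGIT, or one of the listed symbols.
bool isAtextSketch(char c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9'))
        return true;
    // strchr() would also match the terminating NUL, so exclude it first.
    return c != '\0' && std::strchr("!#$%&'*+-/=?^_`{|}~", c) != nullptr;
}
```

Note that '.', '@' and white space are not atext, which is why a dot-atom needs separate handling of the dots.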

bool EmailParser::isMime() const

Returns true if this parser has been instructed to parse MIME strings by calling setMime(), and false otherwise.

EString EmailParser::lastComment() const

Returns the last comment seen so far by this parser, or a null string if none has been seen yet.

EString EmailParser::mimeToken()

Returns a single MIME token (as defined in RFC 2045 section 5), which is an atom minus [/?=] plus [.].
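RFC 2045 defines a token as one or more ASCII characters other than space, control characters and the tspecials ( ) < > @ , ; : \ " / [ ] ? =. The sketch below reads one such token from a string. It is only an illustration under assumptions: isMimeTokenChar() and readMimeToken() are hypothetical names, the cursor is a plain index passed by reference, and no white space or comments are skipped.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper, not part of EmailParser. Returns true if c may
// appear in an RFC 2045 token.
bool isMimeTokenChar(char c) {
    unsigned char u = static_cast<unsigned char>(c);
    if (u <= 32 || u >= 127)
        return false; // control characters, space, DEL and non-ASCII
    return std::strchr("()<>@,;:\\\"/[]?=", c) == nullptr; // tspecials
}

// Hypothetical helper. Reads the longest token starting at pos, advances
// pos past it, and returns it. Returns an empty string if there is none.
std::string readMimeToken(const std::string &s, std::size_t &pos) {
    const std::size_t start = pos;
    while (pos < s.size() && isMimeTokenChar(s[pos]))
        ++pos;
    return s.substr(start, pos - start);
}
```

Applied to "text/plain; charset=us-ascii" at index 0, readMimeToken() returns "text" and leaves the index at the '/'; a second call starting just after the '/' returns "plain".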

EString EmailParser::mimeValue()

Returns a single MIME value (as defined in RFC 2045 section 5), which is an atom minus [/?=] plus [.] (i.e., a MIME token) or a quoted string.

uint EmailParser::number()

Skips past whitespace, parses a decimal number and returns that number.

Reimplements AbnfParser::number().

UString EmailParser::phrase()

Steps past an RFC 822 phrase (a series of word/encoded-words) at the cursor and returns its unicode representation, which may be an empty string.

void EmailParser::setMime( bool m )

Sets this EmailParser object to parse MIME strings if m is true, and RFC 2822 strings if m is false. The only difference is the definition of specials.

EString EmailParser::string()

Steps past an atom or a quoted-text, and returns that text.

UString EmailParser::text()

Steps past the longest "*text" (a series of text/encoded-words) at the cursor and returns its unicode representation, which may be an empty string.

bool EmailParser::valid()

Returns true if this parser has not yet encountered any errors during parsing, and false otherwise.

UString EmailParser::whitespace()

Moves pos() to the first non-whitespace character after the current point. If pos() already points to a non-whitespace character, it is not moved.

This web page is based on source code belonging to The Archiveopteryx Developers. All rights reserved.