Inherits Garbage
An email-oriented 8-bit string class.
The string data are counted, so null bytes are allowed, and most operations are very fast.
The data structure uses a simplified variant of reference counting, where only "one" and "many" are possible. The detach() function ensures that the count is "one" afterwards. Many functions leave the count on "many", even ones such as mid().
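The one/many scheme can be illustrated with a small stand-alone class. This is only a sketch under stated assumptions: TinyString, setByte(), byteAt() and isShared() are hypothetical names, and shared data is leaked rather than collected, on the assumption that a garbage collector would reclaim it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// TinyString is a hypothetical illustration, not the EString source.
class TinyString {
    struct Data {
        char* bytes;
        std::size_t len;
        bool many; // false is "one" owner, true is "many"
    };
    Data* d;

public:
    TinyString(const char* s, std::size_t n)
        : d(new Data{new char[n + 1], n, false}) {
        std::memcpy(d->bytes, s, n);
    }
    // Copying shares the buffer and flips the state to "many".
    TinyString(const TinyString& other) : d(other.d) { d->many = true; }
    TinyString& operator=(const TinyString&) = delete;
    ~TinyString() {
        // Only a sole owner may free. Shared data is leaked in this sketch;
        // the assumption is that a collector would reclaim it.
        if (!d->many) {
            delete[] d->bytes;
            delete d;
        }
    }
    // After detach(), this object is the "one" owner of its bytes.
    void detach() {
        if (!d->many)
            return;
        Data* own = new Data{new char[d->len + 1], d->len, false};
        std::memcpy(own->bytes, d->bytes, d->len);
        d = own;
    }
    void setByte(std::size_t i, char c) {
        assert(i < d->len);
        detach(); // every modifier detaches first
        d->bytes[i] = c;
    }
    char byteAt(std::size_t i) const { return d->bytes[i]; }
    bool isShared() const { return d->many; }
};
```

Note that in this sketch the original object stays on "many" even after its only copy has detached, since nothing is counted.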
The usual string functions are implemented, along with a variety of email-specific operations such as eQP(), deQP(), needsQP(), e64(). boring() returns true if the string can be used unquoted in e.g. MIME, quoted() quotes it. upper() and lower() have a third sibling, headerCased(). simplified() and trimmed() remove white space in ways email often needs.
Creates a copy of s.
Creates an EString from the NUL-terminated string s. The NUL is not copied.
Creates an EString from the first n bytes of s, which may contain NULs.
Creates an empty EString.
Reimplements Garbage::Garbage().
Returns a copy of this string where most/all content has been replaced with the letter 'x' or the digit '4', but if the message was an RFC 822 message, it keeps the same parse tree.
Specifically, most ASCII words are changed to xxxx, while most/all syntax elements are kept.
This function is very, very slow. That's okay, since it's only used for sending bug reports to us, and we all know that's not a common case.
This version of append() appends the single character c.
Appends other to this string.
This version of append() appends the null-terminated string s, or does nothing if s is null.
This version of append() appends num raw bytes from memory base. If base is null, this function does nothing.
Converts n to a number in the base system and appends the result to this string. If n is 0, "0" is appended.
Uses lower-case for digits above 9.
Returns true if this string is really boring, and false if it's empty or contains at least one character that may warrant quoting in some context. So far, RFC 822 atoms, RFC 2822 atoms, IMAP atoms and MIME tokens are considered.
This function considers the intersection of those character classes to be the totally boring subset. If b is not its default value, it may include other characters.
Returns the capacity of the string variable, that is, how long the string can be before it has to allocate memory.
Returns -1 if this string is lexicographically before other, 0 if they are the same, and 1 if this string is lexicographically after other.
The comparison is case sensitive - just a byte comparison.
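A byte comparison with a -1/0/1 result could look roughly like the following. It is a sketch on std::string rather than EString, and compareBytes() is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>

// compareBytes() is a hypothetical stand-in, not EString::compare() itself.
int compareBytes(const std::string& a, const std::string& b) {
    const std::size_t common = std::min(a.size(), b.size());
    // memcmp compares unsigned bytes, so case and embedded NULs both count.
    const int r = std::memcmp(a.data(), b.data(), common);
    if (r != 0)
        return r < 0 ? -1 : 1;
    if (a.size() == b.size())
        return 0;
    // A proper prefix sorts before the longer string.
    return a.size() < b.size() ? -1 : 1;
}

// Usage: upper case sorts before lower case in ASCII.
void compareBytesExample() {
    assert(compareBytes("Abc", "abc") == -1);
    assert(compareBytes("abc", "abc") == 0);
    assert(compareBytes("abcd", "abc") == 1);
}
```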
Returns true if this string contains at least one instance of s.
Returns true if this string contains at least one instance of c.
Returns true if this string contains at least one instance of s, and the characters before and after the occurrence aren't letters.
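The word test can be sketched as below. containsWordSketch() is a hypothetical name working on std::string, and the sketch assumes that the start and end of the string count as non-letters and that the word is not empty.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the word test described above.
bool containsWordSketch(const std::string& text, const std::string& word) {
    assert(!word.empty()); // assumption of this sketch
    std::size_t pos = text.find(word);
    while (pos != std::string::npos) {
        const std::size_t end = pos + word.size();
        // Assumption: the string boundaries count as non-letters.
        const bool letterBefore = pos > 0 &&
            std::isalpha(static_cast<unsigned char>(text[pos - 1]));
        const bool letterAfter = end < text.size() &&
            std::isalpha(static_cast<unsigned char>(text[end]));
        if (!letterBefore && !letterAfter)
            return true;
        // Keep looking: a later occurrence may stand alone.
        pos = text.find(word, pos + 1);
    }
    return false;
}
```

With this sketch, "from" is found as a word in "mail from alice" but not in "mailfrom alice".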
Returns a copy of this string where every linefeed is CRLF, and where the last two characters are CRLF.
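One possible reading of this rule is sketched below on std::string. crlfSketch() is a hypothetical name, and the sketch assumes that a lone CR is copied unchanged and that an empty input yields a bare CRLF; the real function may treat those cases differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of the CRLF normalisation described above.
std::string crlfSketch(const std::string& in) {
    std::string out;
    out.reserve(in.size() + 2);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\r' && i + 1 < in.size() && in[i + 1] == '\n') {
            out += "\r\n"; // already CRLF, keep as is
            ++i;
        } else if (in[i] == '\n') {
            out += "\r\n"; // bare LF becomes CRLF
        } else {
            out += in[i];  // assumption: a lone CR passes through
        }
    }
    // Make sure the result ends with CRLF.
    if (out.size() < 2 || out.compare(out.size() - 2, 2, "\r\n") != 0)
        out += "\r\n";
    return out;
}

void crlfSketchExample() {
    assert(crlfSketch("a\nb") == "a\r\nb\r\n");
    assert(crlfSketch("a\r\nb\r\n") == "a\r\nb\r\n");
}
```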
Returns the zero-terminated byte representation of the string. Note that even though the return value is zero-terminated, it can also contain null bytes in the middle.
Even though this function modifies memory, it doesn't detach(), since it doesn't modify the string. However, in most cases its call to reserve() causes a detach().
This const version of cstr() is the same as the non-const version above. The only differences are that it can be called on a const object and that it may cause some memory allocation elsewhere.
Returns a pointer to the string's byte representation, which is NOT necessarily zero-terminated.
Decodes this string using the base-64 algorithm and returns the result.
Decodes this string according to the quoted-printable algorithm, and returns the result. Errors are overlooked, to cope with all the mail-munging brokenware in the great big world.
If underscore is true, underscores in the input are translated into spaces (as specified in RFC 2047).
Returns a version of this EString with every %xx escape replaced with the corresponding character (as used to encode URIs). Invalid escape sequences are left unchanged, so this function cannot be used for input from potentially malevolent sources.
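The rule that invalid escapes are left unchanged can be sketched as follows. percentDecodeSketch() and hexValue() are hypothetical names, and std::string stands in for EString.

```cpp
#include <cassert>
#include <string>

// Returns the value of a hex digit, or -1 if c is not one.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical sketch of percent-decoding as described above.
std::string percentDecodeSketch(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            const int hi = hexValue(in[i + 1]);
            const int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        out += in[i]; // ordinary byte, or an invalid escape kept verbatim
    }
    return out;
}

void percentDecodeExample() {
    assert(percentDecodeSketch("a%20b") == "a b");
    assert(percentDecodeSketch("100%") == "100%");
    assert(percentDecodeSketch("%zz") == "%zz");
}
```

Because "%zz" and a trailing "%" survive decoding untouched, a caller cannot tell from the output whether the input was well formed, which is the reason for the warning about malevolent sources.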
An implementation of uudecode, sufficient to handle some occurrences of "content-transfer-encoding: x-uuencode" seen. Possibly not correct according to POSIX 1003.2b, who knows.
Returns a version of this EString decoded according to e.
Ensures that the string is modifiable. All EString functions call this prior to modifying the string.
Encodes this string using the base-64 algorithm and returns the result in lines of at most lineLength characters. If lineLength is not supplied, e64() returns a single line devoid of whitespace.
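The effect of the line length can be illustrated with a small base-64 encoder on std::string. base64Sketch() is a hypothetical name; the sketch assumes that every line, including the last, ends with CRLF when a line length is given, which the text above does not specify.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: base-64 with optional line wrapping.
std::string base64Sketch(const std::string& in, std::size_t lineLength = 0) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string flat;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t left = in.size() - i;
        // Pack up to three bytes into a 24-bit group.
        unsigned long v = static_cast<unsigned char>(in[i]) << 16;
        if (left > 1) v |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (left > 2) v |= static_cast<unsigned char>(in[i + 2]);
        flat += table[(v >> 18) & 63];
        flat += table[(v >> 12) & 63];
        flat += left > 1 ? table[(v >> 6) & 63] : '=';
        flat += left > 2 ? table[v & 63] : '=';
    }
    if (lineLength == 0)
        return flat; // single line, no whitespace
    std::string out;
    for (std::size_t i = 0; i < flat.size(); i += lineLength) {
        out.append(flat, i, lineLength);
        out += "\r\n"; // assumption: CRLF after every line
    }
    return out;
}

void base64SketchExample() {
    assert(base64Sketch("Man") == "TWFu");
    assert(base64Sketch("Ma") == "TWE=");
    assert(base64Sketch("ManMan", 4) == "TWFu\r\nTWFu\r\n");
}
```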
Encodes this string using the quoted-printable algorithm and returns the encoded version. In the encoded version, all line feeds are CRLF, and soft line feeds are positioned so that the q-p looks as good as it can.
Note that this function is slightly incompatible with RFC 2646: It encodes trailing spaces, as suggested in RFC 2045, but RFC 2646 suggests that if trailing spaces are the only reason to q-p, then the message should not be encoded.
If underscore is present and true, this function uses the variant of q-p specified by RFC 2047, where a space is encoded as an underscore and a few more characters need to be encoded.
If from is present and true, this function also makes sure that no output line starts with "From " or looks like a MIME boundary.
Returns a version of this EString with absolutely nothing changed. (This function is eventually intended to percent-escape URIs, the opposite of deURI().)
Returns a version of this EString encoded according to e. If e is Base64, then n specifies the maximum line length. The default is 0, i.e. no limit.
This function does not support Uuencode. If e is Uuencode, it returns the input string.
Returns true if this string ends with suffix, and false if it does not.
Returns true if this string ends with suffix, and false if it does not.
Returns the position of the first occurrence of c on or after i in this string, or -1 if there is none.
Returns the position of the first occurrence of s on or after i in this string, or -1 if there is none.
Returns a string representing the number n in the base system, which is 10 (decimal) by default and must be in the range 2-36.
For 0, "0" is returned.
For bases 11-36, lower-case letters are used for the digits beyond 9.
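The conversion can be sketched as below. numberToBase() is a hypothetical name returning a std::string, and the sketch only handles non-negative numbers, since the text above does not say how negative ones are rendered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch of number-to-string conversion in bases 2-36.
std::string numberToBase(std::uint64_t n, unsigned base = 10) {
    assert(base >= 2 && base <= 36);
    static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
    if (n == 0)
        return "0";
    std::string out;
    // Digits come out least significant first, so reverse at the end.
    while (n > 0) {
        out += digits[n % base];
        n /= base;
    }
    std::reverse(out.begin(), out.end());
    return out;
}
```

For example, numberToBase(255, 16) gives "ff" and numberToBase(10, 2) gives "1010".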
Returns a copy of this string where all letters have been changed to conform to typical mail header practice: Letters following digits and other letters are lower-cased. Other letters are upper-cased (notably including the very first character).
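The casing rule as stated can be sketched like this. headerCaseSketch() is a hypothetical name on std::string, and it implements only the rule in the text; the real function may handle further special cases.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sketch of the header-casing rule described above.
std::string headerCaseSketch(const std::string& in) {
    std::string out = in;
    bool afterAlnum = false; // was the previous byte a letter or digit?
    for (char& c : out) {
        const unsigned char u = static_cast<unsigned char>(c);
        if (std::isalpha(u))
            c = static_cast<char>(afterAlnum ? std::tolower(u)
                                             : std::toupper(u));
        afterAlnum = std::isalnum(u) != 0;
    }
    return out;
}

void headerCaseExample() {
    assert(headerCaseSketch("content-type") == "Content-Type");
    assert(headerCaseSketch("MESSAGE-ID") == "Message-Id");
}
```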
Returns the lowercase-hexadecimal representation of the string.
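A minimal sketch, assuming two lower-case hex digits per byte with no separators (the text above does not spell out the layout). hexSketch() is a hypothetical name on std::string.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of a lower-case hex dump of the bytes.
std::string hexSketch(const std::string& in) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char b : in) {
        out += digits[b >> 4];   // high nibble
        out += digits[b & 0x0f]; // low nibble
    }
    return out;
}

void hexSketchExample() {
    assert(hexSketch("AB") == "4142");
    assert(hexSketch("") == "");
}
```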
Returns n as a human-readable string, optionally suffixed by K, M, G or T.
The number is rounded more or less correctly.
Returns true if the string is quoted with c (default '"') as quote character and q (default '\') as escape character. c and q may be the same.
Returns the length of the string. The length does not include any terminator or padding.
Returns a copy of this string where all upper-case letters (A-Z - this is ASCII only) have been changed to lower case.
Returns a string containing the data starting at position start of this string, extending for num bytes. num may be left out, in which case the rest of the string is returned.
If start is too large, an empty string is returned.
This function returns true if the string would need to be encoded using quoted-printable. It is a greatly simplified copy of eQP(), with the changes made necessary by RFC 2646.
Returns the number encoded by this string, and sets *ok to true if that number is valid, or to false if the number is invalid. By default the number is encoded in base 10; if base is specified, that base is used. base must be at least 2 and at most 36.
If the number is invalid (e.g. negative), number() returns 0.
If ok is a null pointer, it is not modified.
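The ok-pointer convention can be sketched as follows. parseNumberSketch() is a hypothetical name on std::string; the sketch assumes a 32-bit unsigned result, treats an empty string and overflow as invalid, and accepts upper-case digits, none of which is stated above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical sketch of parsing with an optional validity flag.
std::uint32_t parseNumberSketch(const std::string& s, bool* ok = nullptr,
                                unsigned base = 10) {
    assert(base >= 2 && base <= 36);
    bool valid = !s.empty(); // assumption: empty input is invalid
    std::uint64_t value = 0;
    for (std::size_t i = 0; valid && i < s.size(); ++i) {
        const char c = s[i];
        unsigned digit = 36; // sentinel: not a digit in any base
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (c >= 'a' && c <= 'z') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'Z') digit = c - 'A' + 10;
        if (digit >= base) {
            valid = false; // covers '-', spaces and out-of-range digits
            break;
        }
        value = value * base + digit;
        if (value > UINT32_MAX)
            valid = false; // assumption: overflow is invalid
    }
    if (ok)
        *ok = valid; // a null ok is never dereferenced
    return valid ? static_cast<std::uint32_t>(value) : 0;
}
```

A caller that does not care about validity can pass nothing for ok and simply gets 0 for bad input.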
Copies other to this string and returns a reference to this string.
Copies s to this string and returns a reference to this string. If s is a null pointer, the result is an empty string.
Deletes p. (This function exists only so that gcc -O3 doesn't decide that EString objects don't need destruction.)
Prepends other to this string.
This function is a debugging aid. It prints the contents of the string within single quotes followed by a trailing newline to stderr.
Returns a version of this string quoted with c, and where any occurrences of c or q are escaped with q.
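The quoting rule can be sketched on std::string as below. quoteSketch() is a hypothetical name, and the defaults for c and q are taken from the description of the quote check elsewhere on this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: wrap in c and escape c and q with q.
std::string quoteSketch(const std::string& in, char c = '"', char q = '\\') {
    std::string out(1, c);
    for (char ch : in) {
        if (ch == c || ch == q)
            out += q; // escape both the quote and the escape character
        out += ch;
    }
    out += c;
    return out;
}

void quoteSketchExample() {
    assert(quoteSketch("a\"b") == "\"a\\\"b\"");
    assert(quoteSketch("it's", '\'', '\'') == "'it''s'");
}
```

The second call shows c and q being the same character, in which case an embedded quote is simply doubled.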
Replaces all occurrences of a in this string with b. Rather slow and allocates much memory. Could be optimised if it ever shows up on the performance graphs.
a must not be empty.
Replaced sections are not considered when looking for the next match.
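The rule that replaced sections are not rescanned can be sketched as follows. replaceAllSketch() is a hypothetical name; unlike the function documented here, it returns a new std::string instead of modifying in place.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of non-overlapping, left-to-right replacement.
std::string replaceAllSketch(const std::string& in, const std::string& a,
                             const std::string& b) {
    assert(!a.empty()); // a must not be empty
    std::string out;
    std::size_t from = 0;
    for (;;) {
        const std::size_t hit = in.find(a, from);
        if (hit == std::string::npos)
            break;
        out.append(in, from, hit - from); // text before the match
        out += b;
        // Resume after the match in the input, so the inserted b is
        // never searched again.
        from = hit + a.size();
    }
    out.append(in, from, std::string::npos); // remaining tail
    return out;
}
```

Replacing "a" with "aa" in "aaa" therefore terminates and yields "aaaaaa".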
Ensures that there are at least num bytes available in this string. This implicitly causes the string to become modifiable and have a nonzero number of available bytes.
After calling reserve(), capacity() is at least as large as num, while length() has not changed.
Equivalent to reserve(). reserve( num ) calls this function to do the heavy lifting. This function is not inline, while reserve() is, and calls to this function should be interesting wrt. memory allocation statistics.
No one except reserve() should call reserve2().
Returns section n of this string, where the sections are the parts of the string separated by s. If s is the empty string or n is 0, section() returns this entire string. If this string has fewer than n sections (i.e. section n is after the end of the string), section() returns an empty string.
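A sketch of the lookup, assuming sections are numbered from 1 (which the special case for n = 0 suggests but the text does not state). sectionSketch() is a hypothetical name on std::string.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: return the n-th part between separators.
std::string sectionSketch(const std::string& in, const std::string& sep,
                          unsigned n) {
    if (sep.empty() || n == 0)
        return in;
    std::size_t begin = 0;
    while (begin <= in.size()) {
        std::size_t end = in.find(sep, begin);
        if (end == std::string::npos)
            end = in.size(); // the last section runs to the end
        if (n == 1)
            return in.substr(begin, end - begin);
        --n;
        begin = end + sep.size();
    }
    return std::string(); // section n lies after the end of the string
}

void sectionSketchExample() {
    assert(sectionSketch("a:b:c", ":", 2) == "b");
    assert(sectionSketch("a:b:c", ":", 4) == "");
    assert(sectionSketch("a:b:c", ":", 0) == "a:b:c");
}
```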
Ensures that the string's length is l. If l is 0, the string will be empty after the function is called. If l is longer than the string used to be, the new part is uninitialised.
Returns a copy of this string where each run of whitespace is compressed to a single ASCII 32, and where leading and trailing whitespace is removed altogether.
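The whitespace compression can be sketched as below. simplifySketch() is a hypothetical name on std::string, and the sketch assumes that whitespace means space, tab, CR and LF.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of whitespace simplification.
std::string simplifySketch(const std::string& in) {
    std::string out;
    bool pendingSpace = false;
    for (char c : in) {
        // Assumption: these four bytes are the whitespace set.
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            // Remember the run, but only once real content has been seen.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace)
                out += ' '; // one ASCII 32 per run
            pendingSpace = false;
            out += c;
        }
    }
    // A pending space at the end is dropped, trimming trailing whitespace.
    return out;
}

void simplifySketchExample() {
    assert(simplifySketch("  a \t b\r\n") == "a b");
    assert(simplifySketch("   ") == "");
}
```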
Returns true if this string starts with prefix, and false if it does not.
Returns true if this string starts with prefix, and false if it does not.
Returns a copy of this EString with at most one trailing LF or CRLF removed. If there's more than one LF or CRLF, the remainder are left.
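The "at most one" rule can be sketched on std::string as follows; stripOneLineEnding() is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: drop a single trailing CRLF or LF, nothing more.
std::string stripOneLineEnding(const std::string& in) {
    const std::size_t n = in.size();
    if (n >= 2 && in[n - 2] == '\r' && in[n - 1] == '\n')
        return in.substr(0, n - 2);
    if (n >= 1 && in[n - 1] == '\n')
        return in.substr(0, n - 1);
    return in;
}

void stripOneLineEndingExample() {
    assert(stripOneLineEnding("a\r\n") == "a");
    assert(stripOneLineEnding("a\r\n\r\n") == "a\r\n");
    assert(stripOneLineEnding("a\n\n") == "a\n");
}
```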
Returns a copy of this string where leading and trailing whitespace have been removed.
Ensures that the string's length is either l or length(), whichever is smaller. If l is 0 (the default), the string will be empty after the function is called.
Returns the unquoted representation of the string if it isQuoted(), and the string itself otherwise.
c at the start and end are removed; any occurrence of c within the string is left alone; an occurrence of q followed by c is converted into just c.
Returns a copy of this string where all lower-case letters (a-z - this is ASCII only) have been changed to upper case.
Returns a copy of this string wrapped so that each line contains at most linelength characters. The first line is prefixed by firstPrefix, subsequent lines by otherPrefix. If spaceAtEOL is true, all lines except the last end with a space.
The prefixes are counted towards line length, but the optional trailing space is not.
Only space (ASCII 32) is a line-break opportunity. If there are multiple spaces where a line is broken, all the spaces are replaced by a single CRLF. Linefeeds added use CRLF.
Destroys the string.
Because EString is used so much, and can eat up such vast amounts of memory so quickly, this destructor does something: If the string is the sole owner of its data, it frees them.
As of April 2005, the return values of data() or cstr() are NO LONGER valid after a string has gone out of scope or otherwise been lost.
This web page is based on source code belonging to The Archiveopteryx Developers. All rights reserved.