Inherits Garbage
The UString class provides a normalized Unicode string.
Unicode is the common character encoding for all strings except those limited to US-ASCII, but such strings are sparingly manipulated.
Most of the functionality of UString is concerned with conversion to/from other encodings, such as ISO-8859-15, KOI8-U, etc. Other functionality is intentionally kept to a minimum, to lighten the testing burden.
Two functions merit particular mention: ascii() and the equality operator. ascii() returns something that's useful for logging, but which often cannot be converted back to Unicode.
There is a fast equality operator which tests against printable ASCII, returning false for every unprintable or non-ASCII character. Very useful for comparing a UString to e.g. "seen" or ".", but nothing more.
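The behaviour of the fast equality operator can be sketched as follows. This is only an illustration, not the Archiveopteryx implementation: std::u32string stands in for UString, the function name equalsPrintableAscii is hypothetical, and "printable ASCII" is assumed to mean the range 0x20 to 0x7E.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in: std::u32string plays the role of UString.
// Returns true only if every code point in u is printable ASCII
// (assumed here to be 0x20..0x7E) and equals the matching byte of s.
bool equalsPrintableAscii(const std::u32string &u, const char *s)
{
    if (!s)
        return false;
    std::size_t i = 0;
    for (; i < u.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        if (c == 0)                       // s is shorter than u
            return false;
        if (u[i] < 0x20 || u[i] > 0x7E)   // unprintable or non-ASCII
            return false;
        if (u[i] != c)
            return false;
    }
    return s[i] == '\0';                  // s must not be longer than u
}
```

With this sketch, comparing U"seen" to "seen" succeeds, while any string containing a tab or an accented letter compares unequal to everything, which is why such an operator suits short protocol keywords and little else.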
Constructs an exact copy of other.
Constructs an empty Unicode string.
Reimplements Garbage::Garbage().
Appends other to the end of this string.
Appends the ASCII character sequence s to the end of this string.
Appends Unicode code point cp to the end of this string.
Returns a copy of this string in 7-bit ASCII. Any characters that aren't printable ASCII are changed into '?'. (Is '?' the right choice?)
This looks like AsciiCodec::fromUnicode(), but is semantically different. This function is for logging and debugging and may leave out a different set of characters than does AsciiCodec::fromUnicode().
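A minimal sketch of this kind of lossy, logging-oriented conversion is shown below. It is not the real ascii(): std::u32string stands in for UString, toLoggableAscii is a hypothetical name, and printable ASCII is assumed to be 0x20 to 0x7E.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for UString::ascii(), using std::u32string.
// Every code point outside the assumed printable range 0x20..0x7E
// is replaced by '?', so the result is safe to write to a log.
std::string toLoggableAscii(const std::u32string &u)
{
    std::string out;
    out.reserve(u.size());
    for (char32_t cp : u)
        out += (cp >= 0x20 && cp <= 0x7E) ? static_cast<char>(cp) : '?';
    return out;
}
```

Because a literal '?' and a replaced character produce the same output byte, the result cannot in general be converted back to the original string.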
Returns -1 if this string is lexicographically before other, 0 if they are the same, and 1 if this string is lexicographically after other.
The comparison is case sensitive: it simply compares code points, so it does not sort the way humans expect.
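A code point comparison of this kind might look like the sketch below. std::u32string stands in for UString and compareCodePoints is a hypothetical name. That a string sorts before any longer string it is a prefix of is an assumption, not something the text above states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for UString::compare(), using std::u32string.
// Compares code point by code point; returns -1, 0 or 1.
// Assumption: a proper prefix sorts before the longer string.
int compareCodePoints(const std::u32string &a, const std::u32string &b)
{
    std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] < b[i])
            return -1;
        if (a[i] > b[i])
            return 1;
    }
    if (a.size() < b.size())
        return -1;
    if (a.size() > b.size())
        return 1;
    return 0;
}
```

Under this ordering "Zebra" sorts before "apple", because 'Z' (U+005A) has a lower code point than 'a' (U+0061).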
Returns true if this string contains at least one instance of s.
Returns true if this string contains at least one instance of c.
Returns true if this string contains at least one instance of s.
Returns true if this string ends with suffix, and false if it does not.
Returns true if this string ends with suffix, and false if it does not. suffix must be an ASCII or 8859-1 string.
Returns the position of the first occurrence of c on or after i in this string, or -1 if there is none.
Returns the position of the first occurrence of s on or after i in this string, or -1 if there is none.
Returns true if this string contains only tab, CR, LF and printable ASCII characters, and false if it contains one or more other characters.
Returns true if c is a digit, and false if not.
Returns true if c is a letter, and false if not.
Returns true if c is a unicode space character, and false if not.
Returns a string containing the data starting at position start of this string, extending for num characters. num may be left out, in which case the rest of the string is returned.
If start is too large, an empty string is returned.
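The substring rules can be illustrated with a small sketch. std::u32string stands in for UString, midOf is a hypothetical name, and positions and lengths are assumed to count code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for UString::mid(), using std::u32string.
// start and num are assumed to count code points.
std::u32string midOf(const std::u32string &s, std::size_t start,
                     std::size_t num = std::u32string::npos)
{
    if (start >= s.size())          // start too large: empty result
        return std::u32string();
    return s.substr(start, num);    // substr clamps num to what is left
}
```

For example, midOf(U"hello", 1, 3) yields "ell", midOf(U"hello", 2) yields "llo", and midOf(U"hello", 10) yields an empty string.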
Returns the number encoded by this string, and sets *ok to true if that number is valid, or to false if the number is invalid. By default the number is encoded in base 10; if base is specified, that base is used. base must be at least 2 and at most 36.
If the number is invalid (e.g. negative), the return value is undefined.
If ok is a null pointer, it is not modified.
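The parsing contract described above can be sketched as follows. This is an illustration under assumptions, not the real number(): std::u32string stands in for UString, parseNumber is a hypothetical name, digits above 9 are assumed to be the ASCII letters in either case, and an empty string or an overflowing value is assumed to be invalid. The sketch returns 0 on failure, whereas the documented return value is undefined in that case.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical stand-in for UString::number(), using std::u32string.
// Sets *ok (if non-null) to whether the whole string is a valid
// unsigned number in the given base (2..36).
unsigned int parseNumber(const std::u32string &s, bool *ok = nullptr,
                         unsigned int base = 10)
{
    bool valid = !s.empty() && base >= 2 && base <= 36;
    unsigned int value = 0;
    for (std::size_t i = 0; valid && i < s.size(); ++i) {
        char32_t c = s[i];
        unsigned int digit = base;            // out of range by default
        if (c >= U'0' && c <= U'9')
            digit = c - U'0';
        else if (c >= U'a' && c <= U'z')
            digit = c - U'a' + 10;
        else if (c >= U'A' && c <= U'Z')
            digit = c - U'A' + 10;
        if (digit >= base)
            valid = false;                    // not a digit in this base
        else if (value > (UINT_MAX - digit) / base)
            valid = false;                    // would overflow
        else
            value = value * base + digit;
    }
    if (ok)                                   // a null ok is left alone
        *ok = valid;
    return valid ? value : 0;                 // 0 on failure is this sketch's choice
}
```

A caller would typically pass the address of a bool and check it before trusting the result; a string such as "-1" is rejected because '-' is not a digit.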
Appends other to this string and returns a reference to this string.
Makes this string into an exact copy of other and returns a reference to this string.
Deletes p. (This function exists only so that gcc -O3 doesn't decide that UString objects don't need destruction.)
Ensures that at least num characters are available for this string. Users of UString should generally not need to call this; it is called by append() etc. as needed.
Equivalent to reserve(). reserve( num ) calls this function to do the heavy lifting. This function is not inline, while reserve() is, and calls to this function should be interesting wrt. memory allocation statistics.
No one except reserve() should call reserve2().
Returns a copy of this string where each run of whitespace is compressed to a single space character, and where leading and trailing whitespace is removed altogether. Most spaces are mapped to U+0020, but the Ogham space dominates and ZWNBSP recedes.
Unicode space characters are as listed in http://en.wikipedia.org/wiki/Space_character
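The whitespace compression can be sketched as below. This is a simplified illustration, not the real simplified(): std::u32string stands in for UString, the names isSpaceCp and simplify are hypothetical, the set of space characters is an assumed subset of the list referenced above, and every run is mapped to U+0020 without any special treatment of the Ogham space or ZWNBSP.

```cpp
#include <cassert>
#include <string>

// Assumed (partial) set of Unicode space characters for this sketch.
bool isSpaceCp(char32_t c)
{
    return (c >= 0x0009 && c <= 0x000D) || c == 0x0020 || c == 0x00A0 ||
           c == 0x1680 || (c >= 0x2000 && c <= 0x200A) ||
           c == 0x2028 || c == 0x2029 || c == 0x202F ||
           c == 0x205F || c == 0x3000;
}

// Hypothetical stand-in for UString::simplified(), using std::u32string.
// Collapses each run of whitespace to one U+0020 and drops leading
// and trailing whitespace.
std::u32string simplify(const std::u32string &s)
{
    std::u32string out;
    bool pendingSpace = false;
    for (char32_t c : s) {
        if (isSpaceCp(c)) {
            pendingSpace = !out.empty();   // ignore leading whitespace
        } else {
            if (pendingSpace)
                out += U' ';               // one space per run
            pendingSpace = false;
            out += c;
        }
    }
    return out;                            // a trailing run is never emitted
}
```

The space is only emitted when a later non-space character arrives, which is what removes trailing whitespace without a second pass.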
Returns true if this string starts with prefix, and false if it does not.
Returns true if this string starts with prefix, and false if it does not. prefix must be an ASCII or 8859-1 string.
Returns a titlecased version of this string. Usable for case-insensitive comparison, not much else.
Returns a copy of this string without leading or trailing whitespace.
Truncates this string to l characters. If the string is shorter, truncate() does nothing. If l is 0 (the default), the string will be empty after this function is called.
Returns a UTF-8-encoded version of this UString. The string is null-terminated for easy debugging, but remember that it may also contain embedded nulls.
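For readers unfamiliar with the encoding step, here is a minimal sketch of converting code points to UTF-8. It is not the Archiveopteryx code: std::u32string stands in for UString, toUtf8 is a hypothetical name, and the input is assumed to contain only valid Unicode scalar values (no surrogates, nothing above U+10FFFF).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for UString::utf8(), using std::u32string.
// Assumes every element is a valid Unicode scalar value.
std::string toUtf8(const std::u32string &u)
{
    std::string out;
    for (char32_t cp : u) {
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The embedded-null caveat applies here too: a U+0000 in the input becomes a zero byte in the output, so the length must come from the string object rather than from scanning for a terminator.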
Destroys the string. Doesn't free anything.
This web page is based on source code belonging to The Archiveopteryx Developers. All rights reserved.