The C++ String class can be used to represent character strings and perform operations upon them. Many operations are defined for normal operators but may have different semantics. For example, the + operator is used to concatenate (add) strings together.
Substrings may be selected from strings to create other strings. Also, substrings have lvalue semantics, meaning a string may be assigned to a substring expression.
Other constructors are provided to initialize strings from character pointers, single characters or other strings. These constructors also take care of typecasting:
String x;          // Construct an empty string
String abc("abc"); // Convert a char * to a string
String p('p');     // A string with one character in it
Another string (or string expression), a char* or a single character can be assigned to a string:
String x, y;
x = "abc";
y = x;
x = 'p';
All relational operators (==, !=, <, <=, > and >=) are available to compare strings alphabetically. Note that the comparison is done in the local character set, which will be ASCII in most cases. This also means that case is significant: in ASCII, "Z" sorts before "a".
Strings can be added together, i.e. concatenated, with the + operator. Of course, the += operator can be used as well. Example:
String x, y, z;
x = "abc";
y = "def";
z = x + y; // z = "abcdef"
y += x;    // y = "defabc"
The length of a string can be determined with the ~ operator. Do not confuse this with the bitwise inversion that ~ performs on integers.
String x = "Hello";
int l = ~x; // l = 5
The test for an empty string is done with the ! operator. Note that there is no test for a non-empty string.
if (!x) cout << "String x is empty\n";
Access to individual characters is provided like in ordinary C strings. The [] operator selects an individual character for both read and write access:
String x = "Hello";
char c = x[1]; // c = 'e'
x[2] = 'L';    // x = "HeLlo"
A substring is a part of a string denoted by its start index and its length. The start index is counted from zero. To extract a substring from a string use:
String x = "abcdefghijkl";
String y = x(3,5); // y = "defgh"
A substring expression can also be used as an lvalue. This means that a string can be assigned to a substring. The length of the substring does not need to be equal to the length of the string that is assigned to it.
String x = "abcdefghijkl";
String y = "12345678";
x(3,5) = y; // x = "abc12345678ijkl"
Note that assigning a string to a zero-length substring simply inserts that string into the other string. Conversely, assigning an empty string to a substring removes those characters from the original string.
String x = "abcdefghijkl";
String y = "12345678";
x(3,0) = y;   // x = "abc12345678defghijkl"
x(11,5) = ""; // x = "abc12345678ijkl"
This property can be used to truncate a string to its first n characters by:
x(n, ~x-n) = "";
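The same replace, insert, remove and truncate behaviour can be tried out with the standard library. The sketch below uses std::string::replace as a stand-in for assignment to a substring expression. The helper name assign_substring is hypothetical and is not part of the String class.

```cpp
#include <string>

// Stand-in for "s(start, len) = value" on a std::string.
// assign_substring is a hypothetical helper, not the String class API.
std::string assign_substring(std::string s, std::size_t start,
                             std::size_t len, const std::string& value)
{
    // replace() removes len characters at start and puts value in their
    // place; the two lengths do not need to match.
    s.replace(start, len, value);
    return s;
}
```

With len set to 0 the call inserts, with an empty value it removes, and with start = n and len = size - n it truncates the string to its first n characters.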
Strings are converted to and from numerical types (integer or floating point) by constructors and conversion operators. A numerical value is converted to a string with the number constructor:
String n(12);   // n = "12"
String x(2.32); // x = "2.32"
These functions use a default format to convert the number to a string. Specifically, an integer is converted with the "%d" format from printf and a floating point is converted with the "%g" format.
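The effect of these two default formats can be checked with snprintf directly. The following is only a sketch of the formatting step. The helper names format_int and format_double are hypothetical, and how the String constructors call printf internally is not shown in this manual.

```cpp
#include <cstdio>
#include <string>

// Format an integer with "%d", the default named above.
std::string format_int(int value)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "%d", value);
    return buf;
}

// Format a floating point value with "%g", the default named above.
std::string format_double(double value)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, "%g", value);
    return buf;
}
```

Note that "%g" keeps six significant digits and switches to exponent notation for large values, so 3.14159265 becomes "3.14159" and 1000000.0 becomes "1e+06".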
A string is converted to a number with the type-casting operators: operator long() and operator double(). The conversion to a long integer recognizes C syntax for numerical literals. Numbers starting with '0' are regarded as octal and numbers starting with '0x' are regarded as hexadecimal.
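The prefix rules described here are the same ones that the C function strtol applies when it is called with base 0. The sketch below uses that function on a std::string to show the rules. Whether operator long() is actually implemented with strtol is an assumption, and to_long is a hypothetical helper name.

```cpp
#include <cstdlib>
#include <string>

// Convert text to a long using C literal syntax. Base 0 lets strtol
// choose the base: a "0x" prefix means hexadecimal, a leading "0" means
// octal, anything else is decimal.
long to_long(const std::string& text)
{
    return std::strtol(text.c_str(), nullptr, 0);
}
```

A consequence worth remembering is that "010" converts to 8, not 10.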
Special member functions allow conversion of strings to and from numbers in a specific format. The functions dec(), oct() and hex() convert the string using decimal, octal and hexadecimal, respectively.
Shifting a string moves all characters in the string a number of places to the left or to the right. Characters on the other end of the string simply fall off. Unlike shifting a binary number, shifting a string does not pad the string. Rather, shifting a string n places to the left or to the right renders the string n characters shorter. Example:
"abcdefgh" << 3 = "defgh"
"abcdefgh" >> 3 = "abcde"
The shift operators << and >> can be combined with assignment: <<= and >>=. Do not confuse these shift operators with stream I/O operators.
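The shift semantics can be sketched on std::string with substr. The helper names shift_left and shift_right are hypothetical. The behaviour for a shift count larger than the string length (an empty result) is an assumption, since the text does not cover that case.

```cpp
#include <string>

// Drop the first n characters, like "s << n" in the text above.
std::string shift_left(const std::string& s, std::size_t n)
{
    // Assumption: shifting by the full length or more leaves nothing.
    return n >= s.size() ? std::string() : s.substr(n);
}

// Drop the last n characters, like "s >> n" in the text above.
std::string shift_right(const std::string& s, std::size_t n)
{
    return n >= s.size() ? std::string() : s.substr(0, s.size() - n);
}
```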
String numbers("2,4,7");
numbers.split(","); // "2" "4" "7"
Splitting an empty string will result in a SuperString with an empty string as its only element. The separator can be a multiple character string, for example:
String somexml("A&amp;B&amp;C");
somexml.split("&amp;"); // "A" "B" "C"
Consecutive separators are not grouped together and are deemed to delimit empty strings, for example:
String numbers("3.14   42"); // Note 3 spaces
numbers.split(" "); // "3.14" "" "" "42"
This also happens with separators at the start and the end of the string.
String example("aaa...bbb");
example.tokenize("."); // "aaa" "bbb"
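The difference between the two operations is that split keeps the empty strings between consecutive separators, while tokenize groups consecutive separators together. The sketch below reproduces both behaviours on std::string and std::vector. These free functions are stand-ins and not the library's methods. Treating the tokenize argument as a set of delimiter characters is an assumption.

```cpp
#include <string>
#include <vector>

// Split on every occurrence of sep; empty fields are kept.
std::vector<std::string> split(const std::string& s, const std::string& sep)
{
    std::vector<std::string> parts;
    std::size_t begin = 0;
    std::size_t hit = 0;
    while (!sep.empty() && (hit = s.find(sep, begin)) != std::string::npos) {
        parts.push_back(s.substr(begin, hit - begin));
        begin = hit + sep.size();
    }
    parts.push_back(s.substr(begin)); // the last (possibly empty) field
    return parts;
}

// Break s into tokens; runs of delimiter characters count as one break.
// Assumption: delims is a set of single characters.
std::vector<std::string> tokenize(const std::string& s, const std::string& delims)
{
    std::vector<std::string> tokens;
    std::size_t begin = s.find_first_not_of(delims);
    while (begin != std::string::npos) {
        std::size_t end = s.find_first_of(delims, begin);
        tokens.push_back(s.substr(begin, end - begin));
        begin = s.find_first_not_of(delims, end);
    }
    return tokens;
}
```

In this sketch, splitting an empty string yields one empty element, whereas tokenizing it yields no elements at all.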
A string can be written to a stream like any other type with the << operator. The left hand side of the expression must be a stream. To read a string from a stream, use the >> operator, with a stream on the left hand side:
String x; cin >> x; cout << x;
Note that reading a string from an istream behaves like reading a character array from an istream. I.e., characters are read from the stream until the first whitespace character.
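The whitespace rule is the one that the standard streams apply to std::string as well, so it can be demonstrated without the String class. In the sketch below, first_word is a hypothetical helper that reads from an in-memory stream instead of cin.

```cpp
#include <sstream>
#include <string>

// Read one whitespace-delimited word from the given text.
std::string first_word(const std::string& input)
{
    std::istringstream in(input);
    std::string word;
    in >> word; // skips leading whitespace, stops at the next whitespace
    return word;
}
```

Reading "Hello world" this way yields only "Hello"; the rest of the line stays in the stream for the next read.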
Regular expressions are handled by objects of class regex. Constructors allow the creation of regex objects from String objects, literal strings or other regex objects.
String pattern("[a-z]+");
regex word(pattern);
regex number("[0-9]+");
regex nr(number);
Regular expressions are primarily used to find patterns in text strings. The == operator performs the pattern matching. A relational expression returns true if the String matches the pattern, i.e. the pattern matches some part of the text in the String object.
regex nr("[0-9]+");
String x("abcdef");
String y("abc123def");
x == nr; // false
nr == y; // true
A regular expression can be used in a substring expression. The substring selected from the target string is the first part that matches the regular expression.
regex nr("[0-9]+");
String y("abc123def456ghi");
String x = y(nr); // x = "123"
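Both operations, matching some part of the text and selecting the first matching part, correspond to a regular expression search rather than a full match. The sketch below shows the same two operations with std::regex from the standard library, which is a different class from the regex class of this library and may use a different pattern syntax. The helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// True if the pattern matches some part of the text (a search, not a
// full match), like the == operator described above.
bool matches_somewhere(const std::string& text, const std::string& pattern)
{
    return std::regex_search(text, std::regex(pattern));
}

// The first part of the text that matches the pattern, or an empty
// string if nothing matches.
std::string first_match(const std::string& text, const std::string& pattern)
{
    std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}
```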
A SuperString is a sequence of Strings, just like a String is a sequence of characters. A SuperString is in fact a string of Strings. Many operations that can be applied to String objects are also available for SuperString objects, for example relational operators, concatenating SuperString objects, access to individual elements and 'substring' expressions. Split and join operations convert Strings into SuperStrings and vice versa. In that sense, split and join are complementary operations. The split() method of a String creates a SuperString. Conversely, the join() method of a SuperString creates a String.
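The join direction can be sketched on a std::vector of std::string. This is a minimal stand-in for illustration; the free function join and its signature are assumptions and not the SuperString method.

```cpp
#include <string>
#include <vector>

// Concatenate all parts, putting sep between each pair of neighbours.
std::string join(const std::vector<std::string>& parts, const std::string& sep)
{
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0)
            result += sep; // separator only between elements
        result += parts[i];
    }
    return result;
}
```

Splitting the joined result on the same separator gives back the original elements only if none of them contains the separator, which is why the two operations are complementary rather than strict inverses.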
The default constructor creates an empty SuperString, one that does not contain any Strings. A SuperString constructed from a String has one element. Other constructors can create a SuperString from STL containers (list and vector) of Strings.
Another SuperString, String or SuperString expression can be assigned to a SuperString:
SuperString x, y;
String z("abc");
x = z; // Create a SuperString with one element
y = x;
The length of a SuperString can be determined with the ~ operator. This is the number of Strings in the SuperString. Do not confuse this with the bitwise inversion that ~ performs on integers.
SuperString xx(5);
int l = ~xx; // l = 5
The test for an empty SuperString is done with the bool operator. This operator returns true if the SuperString object contains one or more Strings. It does not mean, however, that the String objects inside the SuperString are not empty.
if (!x) cout << "String x is empty\n";
Access to individual String elements is provided like in ordinary arrays. The [] operator selects an individual String for both read and write access:
SuperString xx(3);
String s = "Hello";
xx[2] = s;
Access out of range will throw a std::out_of_range exception.
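Callers that cannot guarantee a valid index can catch the exception. The sketch below shows the pattern with std::vector::at, which throws the same exception type. It is a stand-in for a SuperString, and element_or is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Return the element at index, or fallback when the index is out of range.
std::string element_or(const std::vector<std::string>& strings,
                       std::size_t index, const std::string& fallback)
{
    try {
        return strings.at(index); // at() throws std::out_of_range
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```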
SuperString objects can be added together, i.e. concatenated, with the + operator. Of course, the += operator can be used as well. Example:
SuperString z;
SuperString x("abc");
SuperString y("def");
z = x + y; // z = [ "abc", "def" ]
y += z;    // y = [ "def", "abc", "def" ]
Split on a regex? How to split on an empty string?
Split and join for CSV formats. Quoting and escaping of ","?
The + operator can add Strings and SuperStrings to append or prepend the one to the other. The result is always a SuperString. Add characters and strings? -> TODO
Stream I/O is not possible. The split and join functions must be used instead.
The substring expression creates a selection of Strings of the SuperString, just like the substring expression of a String creates a selection of characters.
A substring expression with a regex argument selects all Strings that match the expression. This implements a sort of grep function. It means the substring may not be a consecutive subset of the SuperString.
Assigning to a substring replaces all Strings in the substring.
Convert to and from a std::vector or std::list.
The date class encapsulates the concept of a date, represented by days, months and years. It holds a day and a month, both between 0 and 255, and a year ranging from -32768 to +32767. Although full-range dates can be useful in calculations, a real date should have a day between 1 and 31 inclusive, and a month between 1 and 12. Only such dates can be used as proper calendar dates. A different kind of date may be used as a relative date to perform calculations. For example, to add 2 years, 3 months and 6 days to a date, a date like (6, 3, 2) can be created. All calculations with dates are based on the Julian calendar.
Parsing text into a date object is a challenging task, partly because there are many ways to write dates but also because some ways to write a date are ambiguous. For example, the date 8-2-12 could mean August 2nd 2012, February 12th 2008 or February 8th 2012. These are some ways to write February 8th 2012 that are supported by the parser:
The relational operators are defined with obvious meanings:
leap()
returns 0.
hour UTC
The hour class encapsulates a time in hours, minutes and seconds. Most operators are defined to provide calculations just like integer numbers.
Objects of class hour can be constructed and assigned from integer numbers, String objects or other hour objects.
11:45:30 + 1:20:40 = 13:06:10
11:05:30 - 1:20:40 = 9:44:50
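The carries and borrows in these two examples fall out naturally when a time is held as a single count of seconds. The sketch below illustrates that idea only. The Clock type and its helpers are hypothetical, and the internal representation of the hour class is not described in this manual. Negative results are not handled.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the hour class: a time as a count of seconds.
struct Clock {
    long seconds;
};

Clock make_clock(long h, long m, long s) { return Clock{h * 3600 + m * 60 + s}; }

// Adding or subtracting seconds handles minute and hour carries for free.
Clock operator+(Clock a, Clock b) { return Clock{a.seconds + b.seconds}; }
Clock operator-(Clock a, Clock b) { return Clock{a.seconds - b.seconds}; }

// Render as h:mm:ss; assumes a non-negative number of seconds.
std::string to_text(Clock c)
{
    char buf[32];
    std::snprintf(buf, sizeof buf, "%ld:%02ld:%02ld",
                  c.seconds / 3600, (c.seconds / 60) % 60, c.seconds % 60);
    return buf;
}
```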
A UTC object can be constructed from its parts or by parsing from a String.
String()
time_t() - Convert to the UNIX time_t.
The Integer class provides multiple precision integer arithmetic facilities. Some representation details are discussed in the Representation section. Integers may be up to b * ((1 << b) - 1) bits long, where b is the number of bits per short (typically 1048560 bits when b = 16). The implementation assumes that a long is at least twice as long as a short. This assumption hides beneath almost all primitive operations, and would be very difficult to change. It also relies on correct behavior of unsigned arithmetic operations.
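The size limit and the assumption about long and short can both be written down as compile-time checks. The following sketch is not taken from the Integer implementation, and max_integer_bits is a hypothetical helper that merely evaluates the formula above.

```cpp
#include <climits>

// Evaluate b * ((1 << b) - 1) for a short of b bits.
// Only meaningful for b < 64, so that the shift does not overflow.
constexpr unsigned long long max_integer_bits(unsigned b)
{
    return static_cast<unsigned long long>(b) * ((1ULL << b) - 1);
}

// The stated assumption: a long is at least twice as long as a short.
static_assert(sizeof(long) >= 2 * sizeof(short),
              "a long must be at least twice as long as a short");

// The limit on this platform, using the actual width of a short.
constexpr unsigned long long platform_limit =
    max_integer_bits(static_cast<unsigned>(sizeof(short) * CHAR_BIT));
```

For b = 16 the formula gives 16 * 65535 = 1048560 bits.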
Some of the arithmetic algorithms are very loosely based on those provided in the MIT Scheme `bignum.c' release, which is Copyright (c) 1987 Massachusetts Institute of Technology. Their use here falls within the provisions described in the Scheme release.
Integers may be constructed in the following ways:
Coerce an Integer back into a long via the long coercion operator. If the Integer cannot fit into a long, this returns MINLONG or MAXLONG (depending on the sign), where MINLONG is the most negative and MAXLONG is the most positive representable long.
Test whether the Integer is < MAXLONG and > MINLONG.
Coerce an Integer to a double, with potential loss of precision. +/-HUGE is returned if the Integer cannot fit into a double.
Test whether the Integer can fit into a double.
All of the usual arithmetic operators are provided (+, -, *, /, %, +=, ++, -=, --, *=, /=, %=, ==, !=, <, <=, >, >=). All operators support special versions for mixed arguments of Integers and regular C++ longs in order to avoid useless coercions, as well as to allow automatic promotion of shorts and ints to longs, so that they may be applied without additional Integer coercion operators. The only operators that behave differently than the corresponding int or long operators are ++ and --. Because C++ does not distinguish prefix from postfix application, these are declared as void operators, so that no confusion can result from applying them as postfix. Thus, for Integers x and y, ++x; y = x; is correct, but y = ++x; and y = x++; are not.
Bitwise operators (~, &, |, ^, <<, >>, &=, |=, ^=, <<=, >>=) are also provided. However, these operate on sign-magnitude, rather than two's complement representations. The sign of the result is arbitrarily taken as the sign of the first argument. For example, Integer(-3) & Integer(5) returns Integer(-1), not 5, as it would using two's complement. Also, ~, the complement operator, complements only those bits needed for the representation. Bit operators are also provided in the BitSet and BitString classes. One of these classes should be used instead of Integers when the results of bit manipulations are not interpreted numerically.
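The sign-magnitude rule can be illustrated on ordinary longs. This is a sketch of the rule as described, not the Integer implementation, and sign_magnitude_and is a hypothetical helper. It assumes neither argument is the most negative long, for which labs is undefined.

```cpp
#include <cstdlib>

// AND the magnitudes, then give the result the sign of the first argument.
long sign_magnitude_and(long a, long b)
{
    long magnitude = std::labs(a) & std::labs(b);
    return a < 0 ? -magnitude : magnitude;
}
```

For -3 and 5 the magnitudes are 011 and 101 in binary, so the result is -1. The built-in & operator on a two's complement long gives 5 for the same operands.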
The following utility functions are also provided. (All arguments are Integers unless otherwise noted).
this in place.
if (sign(x) == 0) is a generally faster method of testing for zero than using relational operators.
(*this) as a base base number, in field width at least width.
The Integer class in the GNU libg++ manual.
The collection of xml classes provides an easy way to process XML documents and to traverse the tree of XML data. Built upon the libxml2 library from xmlsoft.org, the classes encapsulate concepts of XML data. After parsing an XML document with an xml object, the xml_node class can access the data of the XML document tree. A node in the tree can have any number of children, which are also nodes. The child nodes are accessed by using the subscript operator with a numerical index. Elements of the XML data can be selected by name when a String argument is used with the subscript operator. Using this operator will return a vector of xml_element objects that holds all child elements having this name.
xml data(filename);
xml_node root(data);
xml_node third = root[2];
String tagname("chapter");
std::vector<xml_element> all_chapters;
all_chapters = root[tagname];
An object of class xml holds an entire XML document. It supports reading and writing XML documents from and to files as well as conversion of XML objects to and from String objects.
A node is the basic building block that makes up the tree of an XML document. There are several kinds of nodes, such as elements, attributes and text nodes. If a node is an element node it can have any number of child nodes. These child nodes can be accessed by using the subscript operator with a numerical index.
The class xml_element is derived from the class xml_node and adds properties specific to XML elements.
Handle configuration parameters for the application. Many applications need some permanently stored configuration data. The information is usually stored in two places: a system-wide configuration file and a configuration file per user. The content of the configuration file is in XML format. The configuration base class takes care of finding the configuration files, e.g. in /etc/app.conf or in /usr/local/etc/app.conf, along with the configuration file in the user's home directory. The config files are parsed with the GNOME XML parser and a framework is provided to find configuration items.
Hold the configuration data for an application. Both the system-wide configuration and a user-specific configuration are stored in a configuration object.