String functions

C90 provides many string functions, but we sometimes need non-standard ones as well, for example stricmp (called strcasecmp on Linux). This page defines our own standard set of string functions, which follows the C standard as closely as possible while keeping non-standard functions usable across platforms.

1. General functions

Character functions


// Convert character to uppercase.
int toupper(int c);
wint_t towupper(wint_t c);

// Convert character to lowercase.
int tolower(int c);
wint_t towlower(wint_t c);

// Determines if a particular character is in uppercase.
int isupper(int c);
int iswupper(wint_t c);

// Determines if a particular character is in lowercase.
int islower(int c);
int iswlower(wint_t c);

// Checks whether an integer represents an alphabetic character
// in the current character set.
int isalpha(int c);
int iswalpha(wint_t c);

// Determines if a particular character is a decimal-digit character.
int isdigit(int c);
int iswdigit(wint_t c);

// Checks for a hexadecimal digit.
int isxdigit(int c);
int iswxdigit(wint_t c);

// Determines if a particular character is alphanumeric (a letter or a digit).
int isalnum(int c);
int iswalnum(wint_t c);

// Checks for a blank character; that is, a space or a tab.
int isblank(int c);
int iswblank(wint_t c);

// Checks for white-space characters. In the "C" and "POSIX" locales, 
// these are: space, form-feed ('\f'), newline ('\n'), carriage 
// return ('\r'), horizontal tab ('\t'), and vertical tab ('\v'). 
int isspace(int c);
int iswspace(wint_t c);

// Checks for any printable character except space.
int isgraph(int c);
int iswgraph(wint_t c);

// Checks for any printable character including space.
int isprint(int c);
int iswprint(wint_t c);

// Checks for any printable character which is not a space or an 
// alphanumeric character. 
int ispunct(int c);
int iswpunct(wint_t c);
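
The narrow character functions above take an int whose value must be representable as unsigned char (or be EOF). A minimal sketch of calling them safely on the contents of a char string follows; upper_in_place is a hypothetical helper name used only for illustration, not part of this standard.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* upper_in_place is a hypothetical helper, not part of the standard above.
 * It upper-cases a narrow string in place and returns how many of its
 * characters were alphabetic. */
size_t upper_in_place(char *s)
{
    size_t letters = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Convert to unsigned char first: passing a negative char value
         * (other than EOF) to the <ctype.h> functions is undefined. */
        unsigned char ch = (unsigned char)*s;

        if (isalpha(ch)) {
            ++letters;
        }
        *s = (char)toupper(ch);
    }
    return letters;
}
```

For example, calling upper_in_place on a writable buffer holding "wing 2.0" leaves "WING 2.0" in the buffer and returns 4.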

String functions


// Get the length of a string.
size_t strlen(const char* str);
size_t wcslen(const wchar_t* str);

// Get the length of a string with a maximal limit.
size_t strnlen(const char* str, size_t sizeInBytes);
size_t wcsnlen(const wchar_t* str, size_t sizeInWchars);

// Copy a string.
char* strcpy( char* strDestination, const char* strSource );
wchar_t* wcscpy( wchar_t* strDestination, const wchar_t* strSource );

// Copy a string with a maximal limit.
char* strncpy(char* strDest, const char* strSource, size_t count);
wchar_t* wcsncpy(wchar_t* strDest, const wchar_t* strSource, size_t count);

// Append a string.
char* strcat(char* strDestination, const char* strSource);
wchar_t* wcscat(wchar_t* strDestination, const wchar_t* strSource);

// Append a string with a maximal limit.
char* strncat(char* strDest, const char* strSource, size_t count);
wchar_t* wcsncat(wchar_t* strDest, const wchar_t* strSource, size_t count);

// Compare strings.
int strcmp(const char *string1, const char *string2);
int wcscmp(const wchar_t *string1, const wchar_t *string2);

// Compare at most count characters of two strings.
int strncmp(const char *string1, const char *string2, size_t count);
int wcsncmp(const wchar_t *string1, const wchar_t *string2, size_t count);
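
One point worth illustrating: strncpy does not write a terminating null character when the source is at least count characters long. The sketch below shows one common way to handle this; copy_bounded is a hypothetical helper name, and the choice to truncate rather than fail is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* copy_bounded is a hypothetical helper, not part of the standard above.
 * It copies src into dst (capacity dst_size bytes) and always leaves dst
 * null-terminated. Returns 1 if all of src fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Copy at most dst_size - 1 characters, then terminate explicitly,
     * because strncpy alone does not terminate on truncation. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';

    return strlen(src) < dst_size;
}
```

With a 4-byte destination, copying "wing" stores "win" and returns 0; with an 8-byte destination the whole string is stored and the function returns 1.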

Non-standard extension:


// Perform a case-insensitive comparison of strings.
int stricmp(const char* string1, const char* string2);
int wcsicmp(const wchar_t* string1, const wchar_t* string2);

// Perform a case-insensitive comparison of at most count characters.
int strnicmp(const char *string1, const char *string2, size_t count);
int wcsnicmp(const wchar_t *string1, const wchar_t *string2, size_t count);
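
To make the intended behavior concrete, here is a minimal sketch of a case-insensitive comparison written only with standard C functions. The name portable_stricmp is hypothetical, and this is not necessarily how the stricmp above is provided on any given platform; it only illustrates the semantics (compare after folding each character to lowercase).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* portable_stricmp is a hypothetical name used for illustration only.
 * Returns a negative value, zero, or a positive value when a compares
 * less than, equal to, or greater than b, ignoring case. */
int portable_stricmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (;; ++a, ++b) {
        /* Fold both characters to lowercase before comparing. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        if (ca != cb) {
            return ca < cb ? -1 : 1;
        }
        if (ca == '\0') {
            return 0;   /* both strings ended at the same position */
        }
    }
}
```

Here portable_stricmp("Wing", "wING") returns 0, while portable_stricmp("abc", "ABD") returns a negative value.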

2. String print functions


// Writes formatted data to a string.
int snprintf(char *str, size_t size, const char *format, ...);
int snwprintf(wchar_t *buffer, size_t count, const wchar_t *format, ...);

// Write formatted output using a pointer to a list of arguments.
int vsnprintf(
    char *buffer, 
    size_t count, 
    const char *format, 
    va_list argptr);
int vsnwprintf(
    wchar_t *buffer, 
    size_t count, 
    const wchar_t *format, 
    va_list argptr);

Notice: Do not use sprintf or vsprintf, because they do not check the size of the destination buffer and can overflow it. Do not use swprintf or vswprintf, because they may not be provided by your compiler.
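
A minimal sketch of a size-checked formatting call built on vsnprintf is shown here. It assumes the C99 behavior, where the return value is the number of characters the complete output would need (excluding the terminating null); some older compilers return a different value on truncation. format_checked is a hypothetical helper name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* format_checked is a hypothetical helper, not part of the standard above.
 * It formats into buf (capacity size bytes) and returns 1 if the whole
 * result fit, or 0 if it was truncated or an encoding error occurred.
 * Assumes C99 vsnprintf return-value semantics. */
int format_checked(char *buf, size_t size, const char *format, ...)
{
    va_list args;
    int needed;

    assert(buf != NULL && format != NULL && size > 0);

    va_start(args, format);
    needed = vsnprintf(buf, size, format, args);
    va_end(args);

    /* needed < 0 signals an error; needed >= size signals truncation. */
    return needed >= 0 && (size_t)needed < size;
}
```

With an 8-byte buffer, format_checked(buf, sizeof buf, "%d-%s", 42, "ok") stores "42-ok" and returns 1, while a longer result is cut to 7 characters plus the terminating null and the function returns 0.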

3. Print string format specification

The snprintf family of functions uses the following format. We dropped all platform-specific formats and kept only those supported by most compilers.

%[flags][width][.precision][length modifier]type

3.1 flags

#

The value should be converted to an "alternate form". For o conversions, the first character of the output string is made zero (by prefixing a 0 if it was not zero already). For x and X conversions, a non-zero result has the string '0x' (or '0X' for X conversions) prepended to it. For a, A, e, E, f, F, g, and G conversions, the result will always contain a decimal point, even if no digits follow it (normally, a decimal point appears in the results of those conversions only if a digit follows). For g and G conversions, trailing zeros are not removed from the result as they would otherwise be. For other conversions, the result is undefined.

0

The value should be zero padded. For d, i, o, u, x, X, a, A, e, E, f, F, g, and G conversions, the converted value is padded on the left with zeros rather than blanks. If the 0 and - flags both appear, the 0 flag is ignored. If a precision is given with a numeric conversion (d, i, o, u, x, and X), the 0 flag is ignored. For other conversions, the behavior is undefined.

-

The converted value is to be left adjusted on the field boundary. (The default is right justification.) Except for n conversions, the converted value is padded on the right with blanks, rather than on the left with blanks or zeros. A - overrides a 0 if both are given.

' '

(a space) A blank should be left before a positive number (or empty string) produced by a signed conversion.

+

A sign (+ or -) should always be placed before a number produced by a signed conversion. By default a sign is used only for negative numbers. A + overrides a space if both are used.
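
The effect of each flag is easiest to see on small values. The sketch below is a self-checking example: flag_examples is a hypothetical function name, and each assert states the output expected from a C99-conforming snprintf, so the function simply returns if every expectation holds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* flag_examples is a hypothetical name used for illustration only. */
void flag_examples(void)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%#o", 15u);   /* alternate form, octal */
    assert(strcmp(buf, "017") == 0);

    snprintf(buf, sizeof buf, "%#x", 255u);  /* alternate form, hex */
    assert(strcmp(buf, "0xff") == 0);

    snprintf(buf, sizeof buf, "%05d", 42);   /* zero padding to width 5 */
    assert(strcmp(buf, "00042") == 0);

    snprintf(buf, sizeof buf, "%-5d|", 42);  /* left adjustment */
    assert(strcmp(buf, "42   |") == 0);

    snprintf(buf, sizeof buf, "% d", 42);    /* blank before a positive number */
    assert(strcmp(buf, " 42") == 0);

    snprintf(buf, sizeof buf, "%+d", 42);    /* sign always shown */
    assert(strcmp(buf, "+42") == 0);
}
```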

3.2 width

An optional nonnegative decimal digit string (with non-zero first digit) specifying a minimum field width. If the converted value has fewer characters than the field width, it will be padded with spaces on the left (or right, if the left-adjustment flag has been given).

If the width specification is an asterisk (*), an int argument from the argument list supplies the value. The width argument must precede the value being formatted in the argument list.

In no case does a non-existent or small field width cause truncation of a field; if the result of a conversion is wider than the field width, the field is expanded to contain the conversion result.
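
As a short self-checking sketch of the width rules (width_examples is a hypothetical function name; the asserts state the output expected from a C99-conforming snprintf):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* width_examples is a hypothetical name used for illustration only. */
void width_examples(void)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%6d", 42);     /* padded on the left with spaces */
    assert(strcmp(buf, "    42") == 0);

    snprintf(buf, sizeof buf, "%-6d|", 42);   /* padded on the right instead */
    assert(strcmp(buf, "42    |") == 0);

    snprintf(buf, sizeof buf, "%*d", 6, 42);  /* width taken from an int argument */
    assert(strcmp(buf, "    42") == 0);

    snprintf(buf, sizeof buf, "%2d", 12345);  /* a small width never truncates */
    assert(strcmp(buf, "12345") == 0);
}
```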

3.3 precision

The third optional field of the format specification is the precision specification. It specifies a nonnegative decimal integer, preceded by a period (.), which specifies the number of characters to be printed, the number of decimal places, or the number of significant digits. Unlike the width specification, the precision specification can cause either truncation of the output value or rounding of a floating-point value. If precision is specified as 0 and the value to be converted is 0, the result is no characters output.

If the precision specification is an asterisk (*), an int argument from the argument list supplies the value. The precision argument must precede the value being formatted in the argument list.
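
The three meanings of precision (maximum characters, decimal places, minimum digits) can be sketched as follows. precision_examples is a hypothetical function name, and the asserts state the output expected from a C99-conforming snprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* precision_examples is a hypothetical name used for illustration only. */
void precision_examples(void)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%.3s", "abcdef");    /* at most 3 characters */
    assert(strcmp(buf, "abc") == 0);

    snprintf(buf, sizeof buf, "%.2f", 3.14159);     /* 2 decimal places */
    assert(strcmp(buf, "3.14") == 0);

    snprintf(buf, sizeof buf, "%.5d", 42);          /* at least 5 digits */
    assert(strcmp(buf, "00042") == 0);

    snprintf(buf, sizeof buf, "%.*s", 2, "hello");  /* precision from an int argument */
    assert(strcmp(buf, "he") == 0);

    snprintf(buf, sizeof buf, "%.0d", 0);           /* precision 0 and value 0: empty */
    assert(strcmp(buf, "") == 0);
}
```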

3.4 length modifier

h

A following integer conversion corresponds to a short int or unsigned short int argument, or a following n conversion corresponds to a pointer to a short int argument.

l

(ell) A following integer conversion corresponds to a long int or unsigned long int argument, or a following n conversion corresponds to a pointer to a long int argument, or a following c conversion corresponds to a wint_t argument, or a following s conversion corresponds to a pointer to wchar_t argument.

ll

(ell-ell) A following integer conversion corresponds to a long long int or unsigned long long int argument, or a following n conversion corresponds to a pointer to a long long int argument.
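
The length modifier must match the type of the argument actually passed. A minimal self-checking sketch follows; length_examples is a hypothetical function name, the asserts assume a C99-conforming snprintf, and the %ls case assumes the default "C" locale, in which plain ASCII wide characters convert to single bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* length_examples is a hypothetical name used for illustration only. */
void length_examples(void)
{
    char buf[32];
    short short_value = -5;
    long long_value = 100000L;
    long long huge_value = 9000000000LL;
    const wchar_t *wide = L"wide";

    snprintf(buf, sizeof buf, "%hd", short_value);  /* h: short */
    assert(strcmp(buf, "-5") == 0);

    snprintf(buf, sizeof buf, "%ld", long_value);   /* l: long */
    assert(strcmp(buf, "100000") == 0);

    snprintf(buf, sizeof buf, "%lld", huge_value);  /* ll: long long */
    assert(strcmp(buf, "9000000000") == 0);

    snprintf(buf, sizeof buf, "%ls", wide);         /* l with s: wchar_t string */
    assert(strcmp(buf, "wide") == 0);
}
```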

3.5 type

d,i

The int argument is converted to signed decimal notation. The precision, if any, gives the minimum number of digits that must appear; if the converted value requires fewer digits, it is padded on the left with zeros. The default precision is 1. When 0 is printed with an explicit precision 0, the output is empty.

o,u,x,X

The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u), or unsigned hexadecimal (x and X) notation. The letters abcdef are used for x conversions; the letters ABCDEF are used for X conversions. The precision, if any, gives the minimum number of digits that must appear; if the converted value requires fewer digits, it is padded on the left with zeros. The default precision is 1. When 0 is printed with an explicit precision 0, the output is empty.

e,E

The double argument is rounded and converted in the style [-]d.dddde[sign]dd[d] where there is one digit before the decimal-point character and the number of digits after it is equal to the precision; if the precision is missing, it is taken as 6; if the precision is zero, no decimal-point character appears. An E conversion uses the letter E (rather than e) to introduce the exponent. The exponent always contains at least two digits; if the value is zero, the exponent is 00.

f,F

The double argument is rounded and converted to decimal notation in the style [-]ddd.ddd, where the number of digits after the decimal-point character is equal to the precision specification. If the precision is missing, it is taken as 6; if the precision is explicitly zero, no decimal-point character appears. If a decimal point appears, at least one digit appears before it.

g,G

The double argument is converted in style f or e (or F or E for G conversions). The precision specifies the number of significant digits. If the precision is missing, 6 digits are given; if the precision is zero, it is treated as 1. Style e is used if the exponent from its conversion is less than -4 or greater than or equal to the precision. Trailing zeros are removed from the fractional part of the result; a decimal point appears only if it is followed by at least one digit.

a,A

For a conversion, the double argument is converted to hexadecimal notation (using the letters abcdef) in the style [-]0xh.hhhhp[sign]dd; for A conversion the prefix 0X, the letters ABCDEF, and the exponent separator P are used. There is one hexadecimal digit before the decimal point, and the number of digits after it is equal to the precision. The default precision suffices for an exact representation of the value if an exact representation in base 2 exists and otherwise is sufficiently large to distinguish values of type double. The digit before the decimal point is unspecified for non-normalized numbers, and non-zero but otherwise unspecified for normalized numbers.

c

If no l modifier is present, the int argument is converted to an unsigned char, and the resulting character is written. If an l modifier is present, the wint_t (wide character) argument is converted to a multibyte sequence by a call to the wcrtomb() function, with a conversion state starting in the initial state, and the resulting multibyte string is written.

s

If no l modifier is present: The const char * argument is expected to be a pointer to an array of character type (pointer to a string). Characters from the array are written up to (but not including) a terminating null byte ('\0'); if a precision is specified, no more than the number specified are written. If a precision is given, no null byte need be present; if the precision is not specified, or is greater than the size of the array, the array must contain a terminating null byte.

If an l modifier is present: The const wchar_t * argument is expected to be a pointer to an array of wide characters. Wide characters from the array are converted to multibyte characters (each by a call to the wcrtomb() function, with a conversion state starting in the initial state before the first wide character), up to and including a terminating null wide character. The resulting multibyte characters are written up to (but not including) the terminating null byte. If a precision is specified, no more bytes than the number specified are written, but no partial multibyte characters are written. Note that the precision determines the number of bytes written, not the number of wide characters or screen positions. The array must contain a terminating null wide character, unless a precision is given and it is so small that the number of bytes written exceeds it before the end of the array is reached.

p

The void * pointer argument is printed in hexadecimal (as if by %#x or %#lx).

n

The number of characters written so far is stored into the integer indicated by the int * (or variant) pointer argument. No argument is converted.
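
A few of the conversion types above, sketched as a self-checking example. type_examples is a hypothetical function name; the asserts assume a C99-conforming snprintf (in particular a two-digit minimum exponent, which some older compilers do not follow).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* type_examples is a hypothetical name used for illustration only. */
void type_examples(void)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%x %X %o", 255u, 255u, 8u);  /* hex, HEX, octal */
    assert(strcmp(buf, "ff FF 10") == 0);

    snprintf(buf, sizeof buf, "%f", 0.5);        /* default precision is 6 */
    assert(strcmp(buf, "0.500000") == 0);

    snprintf(buf, sizeof buf, "%e", 1234.5);     /* one digit before the point */
    assert(strcmp(buf, "1.234500e+03") == 0);

    snprintf(buf, sizeof buf, "%g", 0.0001);     /* exponent -4: style f */
    assert(strcmp(buf, "0.0001") == 0);

    snprintf(buf, sizeof buf, "%g", 0.00001);    /* exponent -5: style e */
    assert(strcmp(buf, "1e-05") == 0);

    snprintf(buf, sizeof buf, "%g", 1234567.0);  /* exponent 6 >= precision 6 */
    assert(strcmp(buf, "1.23457e+06") == 0);

    snprintf(buf, sizeof buf, "%c%s", 'A', "bc");  /* character, then string */
    assert(strcmp(buf, "Abc") == 0);
}
```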

4. Frequently used formats

%
Input: N/A
Formats: "%%"

character
Input: int
Formats: "%c"

wide character
Input: wint_t
Formats: "%lc"

multi-byte string
Input: const char*
Formats: "%s"

unicode string
Input: const wchar_t*
Formats: "%ls"

signed short integer
Input: short
Formats: "%hd", "%hi"

un-signed short integer
Input: unsigned short
Formats(Decimal): "%hu"
Example Output: "15"
Formats(Oct): "%ho", "%#ho"
Example Output: "17", "017"
Formats(Hex): "%hx", "%hX", "%#hx", "%#hX"
Example Output: "f", "F", "0xf", "0XF"

signed integer
Input: int
Formats: "%d", "%i"

un-signed integer
Input: unsigned int
Formats(Decimal): "%u"
Example Output: "15"
Formats(Oct): "%o", "%#o"
Example Output: "17", "017"
Formats(Hex): "%x", "%X", "%#x", "%#X"
Example Output: "f", "F", "0xf", "0XF"

signed long integer
Input: long
Formats: "%ld", "%li"

un-signed long integer
Input: unsigned long
Formats(Decimal): "%lu"
Example Output: "15"
Formats(Oct): "%lo", "%#lo"
Example Output: "17", "017"
Formats(Hex): "%lx", "%lX", "%#lx", "%#lX"
Example Output: "f", "F", "0xf", "0XF"

signed long long integer
Input: long long, __int64
Formats: "%lld", "%lli"

un-signed long long integer
Input: unsigned long long, unsigned __int64
Formats(Decimal): "%llu"
Example Output: "15"
Formats(Oct): "%llo", "%#llo"
Example Output: "17", "017"
Formats(Hex): "%llx", "%llX", "%#llx", "%#llX"
Example Output: "f", "F", "0xf", "0XF"

Double precision float number
Input: double
Formats(Decimal): "%f", "%.3f", "%.03f"
Example Output: "3.141593", "3.142", "3.142"
Formats(Scientific): "%e", "%E"
Formats(Decimal or Scientific): "%g", "%G"
Formats(Hex): "%a", "%A"

Pointer
Input: void*
Formats(Hex): "%p"

If you have any questions, please send a mail to renqilin at users.sourceforge.net