251 lines
7.8 KiB
Plaintext
251 lines
7.8 KiB
Plaintext
@node Character Handling, String and Array Utilities, Memory Allocation, Top
|
|
@chapter Character Handling
|
|
|
|
Programs that work with characters and strings often need to classify a
|
|
character---is it alphabetic, is it a digit, is it whitespace, and so
|
|
on---and perform case conversion operations on characters. The
|
|
functions in the header file @file{ctype.h} are provided for this
|
|
purpose.
|
|
@pindex ctype.h
|
|
|
|
Since the choice of locale and character set can alter the
|
|
classifications of particular character codes, all of these functions
|
|
are affected by the current locale. (More precisely, they are affected
|
|
by the locale currently selected for character classification---the
|
|
@code{LC_CTYPE} category; see @ref{Locale Categories}.)
|
|
|
|
@menu
|
|
* Classification of Characters:: Testing whether characters are
|
|
letters, digits, punctuation, etc.
|
|
|
|
* Case Conversion:: Case mapping, and the like.
|
|
@end menu
|
|
|
|
@node Classification of Characters, Case Conversion, , Character Handling
|
|
@section Classification of Characters
|
|
@cindex character testing
|
|
@cindex classification of characters
|
|
@cindex predicates on characters
|
|
@cindex character predicates
|
|
|
|
This section explains the library functions for classifying characters.
|
|
For example, @code{isalpha} is the function to test for an alphabetic
|
|
character. It takes one argument, the character to test, and returns a
|
|
nonzero integer if the character is alphabetic, and zero otherwise. You
|
|
would use it like this:
|
|
|
|
@smallexample
|
|
if (isalpha (c))
|
|
printf ("The character `%c' is alphabetic.\n", c);
|
|
@end smallexample
|
|
|
|
Each of the functions in this section tests for membership in a
|
|
particular class of characters; each has a name starting with @samp{is}.
|
|
Each of them takes one argument, which is a character to test, and
|
|
returns an @code{int} which is treated as a boolean value. The
|
|
character argument is passed as an @code{int}, and it may be the
|
|
constant value @code{EOF} instead of a real character.
|
|
|
|
The attributes of any given character can vary between locales.
|
|
@xref{Locales}, for more information on locales.@refill
|
|
|
|
These functions are declared in the header file @file{ctype.h}.
|
|
@pindex ctype.h
|
|
|
|
@cindex lower-case character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int islower (int @var{c})
|
|
Returns true if @var{c} is a lower-case letter.
|
|
@end deftypefun
|
|
|
|
@cindex upper-case character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isupper (int @var{c})
|
|
Returns true if @var{c} is an upper-case letter.
|
|
@end deftypefun
|
|
|
|
@cindex alphabetic character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isalpha (int @var{c})
|
|
Returns true if @var{c} is an alphabetic character (a letter). If
|
|
@code{islower} or @code{isupper} is true of a character, then
|
|
@code{isalpha} is also true.
|
|
|
|
In some locales, there may be additional characters for which
|
|
@code{isalpha} is true--letters which are neither upper case nor lower
|
|
case. But in the standard @code{"C"} locale, there are no such
|
|
additional characters.
|
|
@end deftypefun
|
|
|
|
@cindex digit character
|
|
@cindex decimal digit character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isdigit (int @var{c})
|
|
Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}).
|
|
@end deftypefun
|
|
|
|
@cindex alphanumeric character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isalnum (int @var{c})
|
|
Returns true if @var{c} is an alphanumeric character (a letter or
|
|
number); in other words, if either @code{isalpha} or @code{isdigit} is
|
|
true of a character, then @code{isalnum} is also true.
|
|
@end deftypefun
|
|
|
|
@cindex hexadecimal digit character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isxdigit (int @var{c})
|
|
Returns true if @var{c} is a hexadecimal digit.
|
|
Hexadecimal digits include the normal decimal digits @samp{0} through
|
|
@samp{9} and the letters @samp{A} through @samp{F} and
|
|
@samp{a} through @samp{f}.
|
|
@end deftypefun
|
|
|
|
@cindex punctuation character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int ispunct (int @var{c})
|
|
Returns true if @var{c} is a punctuation character.
|
|
This means any printing character that is not alphanumeric or a space
|
|
character.
|
|
@end deftypefun
|
|
|
|
@cindex whitespace character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isspace (int @var{c})
|
|
Returns true if @var{c} is a @dfn{whitespace} character. In the standard
|
|
@code{"C"} locale, @code{isspace} returns true for only the standard
|
|
whitespace characters:
|
|
|
|
@table @code
|
|
@item ' '
|
|
space
|
|
|
|
@item '\f'
|
|
formfeed
|
|
|
|
@item '\n'
|
|
newline
|
|
|
|
@item '\r'
|
|
carriage return
|
|
|
|
@item '\t'
|
|
horizontal tab
|
|
|
|
@item '\v'
|
|
vertical tab
|
|
@end table
|
|
@end deftypefun
|
|
|
|
@cindex blank character
|
|
@comment ctype.h
|
|
@comment GNU
|
|
@deftypefun int isblank (int @var{c})
|
|
Returns true if @var{c} is a blank character; that is, a space or a tab.
|
|
This function is a GNU extension.
|
|
@end deftypefun
|
|
|
|
@cindex graphic character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isgraph (int @var{c})
|
|
Returns true if @var{c} is a graphic character; that is, a character
|
|
that has a glyph associated with it. The whitespace characters are not
|
|
considered graphic.
|
|
@end deftypefun
|
|
|
|
@cindex printing character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int isprint (int @var{c})
|
|
Returns true if @var{c} is a printing character. Printing characters
|
|
include all the graphic characters, plus the space (@samp{ }) character.
|
|
@end deftypefun
|
|
|
|
@cindex control character
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int iscntrl (int @var{c})
|
|
Returns true if @var{c} is a control character (that is, a character that
|
|
is not a printing character).
|
|
@end deftypefun
|
|
|
|
@cindex ASCII character
|
|
@comment ctype.h
|
|
@comment SVID, BSD
|
|
@deftypefun int isascii (int @var{c})
|
|
Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits
|
|
into the US/UK ASCII character set. This function is a BSD extension
|
|
and is also an SVID extension.
|
|
@end deftypefun
|
|
|
|
@node Case Conversion, , Classification of Characters, Character Handling
|
|
@section Case Conversion
|
|
@cindex character case conversion
|
|
@cindex case conversion of characters
|
|
@cindex converting case of characters
|
|
|
|
This section explains the library functions for performing conversions
|
|
such as case mappings on characters. For example, @code{toupper}
|
|
converts any character to upper case if possible. If the character
|
|
can't be converted, @code{toupper} returns it unchanged.
|
|
|
|
These functions take one argument of type @code{int}, which is the
|
|
character to convert, and return the converted character as an
|
|
@code{int}. If the conversion is not applicable to the argument given,
|
|
the argument is returned unchanged.
|
|
|
|
@strong{Compatibility Note:} In pre-ANSI C dialects, instead of
|
|
returning the argument unchanged, these functions may fail when the
|
|
argument is not suitable for the conversion. Thus for portability, you
|
|
may need to write @code{islower(c) ? toupper(c) : c} rather than just
|
|
@code{toupper(c)}.
|
|
|
|
These functions are declared in the header file @file{ctype.h}.
|
|
@pindex ctype.h
|
|
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int tolower (int @var{c})
|
|
If @var{c} is an upper-case letter, @code{tolower} returns the corresponding
|
|
lower-case letter. If @var{c} is not an upper-case letter,
|
|
@var{c} is returned unchanged.
|
|
@end deftypefun
|
|
|
|
@comment ctype.h
|
|
@comment ANSI
|
|
@deftypefun int toupper (int @var{c})
|
|
If @var{c} is a lower-case letter, @code{tolower} returns the corresponding
|
|
upper-case letter. Otherwise @var{c} is returned unchanged.
|
|
@end deftypefun
|
|
|
|
@comment ctype.h
|
|
@comment SVID, BSD
|
|
@deftypefun int toascii (int @var{c})
|
|
This function converts @var{c} to a 7-bit @code{unsigned char} value
|
|
that fits into the US/UK ASCII character set, by clearing the high-order
|
|
bits. This function is a BSD extension and is also an SVID extension.
|
|
@end deftypefun
|
|
|
|
@comment ctype.h
|
|
@comment SVID
|
|
@deftypefun int _tolower (int @var{c})
|
|
This is identical to @code{tolower}, and is provided for compatibility
|
|
with the SVID. @xref{SVID}.@refill
|
|
@end deftypefun
|
|
|
|
@comment ctype.h
|
|
@comment SVID
|
|
@deftypefun int _toupper (int @var{c})
|
|
This is identical to @code{toupper}, and is provided for compatibility
|
|
with the SVID.
|
|
@end deftypefun
|