Function
GLib.utf8_normalize
Declaration
gchar*
g_utf8_normalize (
const gchar* str,
gssize len,
GNormalizeMode mode
)
Description
Converts a string into canonical form, standardizing
such issues as whether a character with an accent
is represented as a base character and combining
accent or as a single precomposed character. The
string has to be valid UTF-8, otherwise NULL is
returned. You should generally call g_utf8_normalize()
before comparing two Unicode strings.
The normalization mode G_NORMALIZE_DEFAULT only
standardizes differences that do not affect the
text content, such as the above-mentioned accent
representation. G_NORMALIZE_ALL also maps the
“compatibility” characters in Unicode, such as
SUPERSCRIPT THREE, to their standard forms
(in this case DIGIT THREE). Formatting information
may be lost, but for most text operations such
characters should be considered the same.
G_NORMALIZE_DEFAULT_COMPOSE and G_NORMALIZE_ALL_COMPOSE
are like G_NORMALIZE_DEFAULT and G_NORMALIZE_ALL,
but return a result with composed forms rather
than a maximally decomposed form. This is often
useful if you intend to convert the string to
a legacy encoding or pass it to a system with
less capable Unicode handling.
Parameters
str - Type: const gchar*
A UTF-8 encoded string.
The data is owned by the caller of the function. The value is a NUL-terminated UTF-8 string.
len - Type: gssize
Length of str, in bytes, or -1 if str is nul-terminated.
mode - Type: GNormalizeMode
The type of normalization to perform.