Struct
GLib.Uri
since: 2.66
Description
struct GUri {
/* No available fields */
}
The GUri type and related functions can be used to parse URIs into their components, and build valid URIs from individual components.
Since GUri only represents absolute URIs, all GUris will have a URI scheme, so g_uri_get_scheme() will always return a non-NULL answer. Likewise, by definition, all URIs have a path component, so g_uri_get_path() will always return a non-NULL string (which may be empty).
If the URI string has an ‘authority’ component (that is, if the scheme is followed by :// rather than just :), then the GUri will contain a hostname, and possibly a port and ‘userinfo’. Additionally, depending on how the GUri was constructed/parsed (for example, using the G_URI_FLAGS_HAS_PASSWORD and G_URI_FLAGS_HAS_AUTH_PARAMS flags), the userinfo may be split out into a username, password, and additional authorization-related parameters.
Normally, the components of a GUri will have all %-encoded characters decoded. However, if you construct/parse a GUri with G_URI_FLAGS_ENCODED, then the %-encoding will be preserved instead in the userinfo, path, and query fields (and in the host field if also created with G_URI_FLAGS_NON_DNS). In particular, this is necessary if the URI may contain binary data or non-UTF-8 text, or if decoding the components might change the interpretation of the URI.
For example, with the encoded flag:
g_autoptr(GError) err = NULL;
g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_ENCODED, &err);
g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue");
While the default %-decoding behaviour would give:
g_autoptr(GError) err = NULL;
g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_NONE, &err);
g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http://host/path?param=value");
During decoding, if an invalid UTF-8 string is encountered, parsing will fail with an error indicating the bad string location:
g_autoptr(GError) err = NULL;
g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fbad%3D%00alue", G_URI_FLAGS_NONE, &err);
g_assert_error (err, G_URI_ERROR, G_URI_ERROR_BAD_QUERY);
You should pass G_URI_FLAGS_ENCODED or G_URI_FLAGS_ENCODED_QUERY if you need to handle that case manually. In particular, if the query string contains = characters that are %-encoded, you should keep the query encoded and let g_uri_parse_params() do the decoding, so that the query is decoded only once.
GUri is immutable once constructed, and can safely be accessed from multiple threads. Its reference counting is atomic.
Note that the scope of GUri is to help manipulate URIs in various applications, following RFC 3986. In particular, it doesn’t intend to cover web browser needs, and doesn’t implement the WHATWG URL standard. No APIs are provided to help prevent homograph attacks, so GUri is not suitable for formatting URIs that are displayed to the user as the basis for security-sensitive decisions.
Relative and absolute URIs
As defined in RFC 3986, the hierarchical nature of URIs means that they can either be ‘relative references’ (sometimes referred to as ‘relative URIs’) or ‘URIs’ (for clarity, ‘URIs’ are referred to in this documentation as ‘absolute URIs’ — although in contrast to RFC 3986, fragment identifiers are always allowed).
Relative references have one or more components of the URI missing. In particular, they have no scheme. Any other component, such as hostname, query, etc. may be missing, apart from a path, which has to be specified (but may be empty). The path may be relative, starting with ./ rather than /.
For example, valid relative references include ./path?query, /?query#fragment and //example.com.
Absolute URIs have a scheme specified. Any other components of the URI which are missing are specified as explicitly unset in the URI, rather than being resolved relative to a base URI using g_uri_parse_relative().
For example, valid absolute URIs include file:///home/bob and https://search.com?query=string.
A GUri instance is always an absolute URI. A string may be an absolute URI or a relative reference; see the documentation for individual functions as to what forms they accept.
Parsing URIs
The most minimalist APIs for parsing URIs are g_uri_split() and g_uri_split_with_user(). These split a URI into its component parts, and return the parts; the difference between the two is that g_uri_split() treats the ‘userinfo’ component of the URI as a single element, while g_uri_split_with_user() can (depending on the GUriFlags you pass) treat it as containing a username, password, and authentication parameters. Alternatively, g_uri_split_network() can be used when you are only interested in the components that are needed to initiate a network connection to the service (scheme, host, and port).
g_uri_parse() is similar to g_uri_split(), but instead of returning individual strings, it returns a GUri structure (and it requires that the URI be an absolute URI).
g_uri_resolve_relative() and g_uri_parse_relative() allow you to resolve a relative URI relative to a base URI. g_uri_resolve_relative() takes two strings and returns a string, and g_uri_parse_relative() takes a GUri and a string and returns a GUri.
All of the parsing functions take a GUriFlags argument describing exactly how to parse the URI; see the documentation for that type for more details on the specific flags that you can pass. If you need to choose different flags based on the type of URI, you can use g_uri_peek_scheme() on the URI string to check the scheme first, and use that to decide what flags to parse it with.
For example, you might want to use G_URI_PARAMS_WWW_FORM when parsing the params for a web URI, so compare the result of g_uri_peek_scheme() against http and https.
Building URIs
g_uri_join() and g_uri_join_with_user() can be used to construct valid URI strings from a set of component strings. They are the inverse of g_uri_split() and g_uri_split_with_user().
Similarly, g_uri_build() and g_uri_build_with_user() can be used to construct a GUri from a set of component strings.
As with the parsing functions, the building functions take a GUriFlags argument. In particular, it is important to keep in mind whether the URI components you are using are already %-encoded. If so, you must pass the G_URI_FLAGS_ENCODED flag.
file:// URIs
Note that Windows and Unix both define special rules for parsing file:// URIs (involving non-UTF-8 character sets on Unix, and the interpretation of path separators on Windows). GUri does not implement these rules. Use g_filename_from_uri() and g_filename_to_uri() if you want to properly convert between file:// URIs and local filenames.
URI Equality
Note that there is no g_uri_equal() function, because comparing URIs usefully requires scheme-specific knowledge that GUri does not have. GUri can help with normalization if you use the various encoded GUriFlags as well as G_URI_FLAGS_SCHEME_NORMALIZE; however, it is not comprehensive.
For example, data:,foo and data:;base64,Zm9v resolve to the same thing according to the data: URI specification, which GLib does not handle.
Available since: 2.66
Functions
g_uri_build_with_user
Creates a new GUri from the given components according to flags (G_URI_FLAGS_HAS_PASSWORD is added unconditionally). The flags must be coherent with the passed values, in particular use %-encoded values with G_URI_FLAGS_ENCODED.
since: 2.66
g_uri_is_valid
Parses uri_string according to flags, to determine whether it is a valid absolute URI, i.e. it does not need to be resolved relative to another URI using g_uri_parse_relative().
since: 2.66
g_uri_join
Joins the given components together according to flags to create an absolute URI string. path may not be NULL (though it may be the empty string).
since: 2.66
g_uri_join_with_user
Joins the given components together according to flags to create an absolute URI string. path may not be NULL (though it may be the empty string).
since: 2.66
g_uri_list_extract_uris
Splits a URI list conforming to the text/uri-list MIME type defined in RFC 2483 into individual URIs, discarding any comments. The URIs are not validated.
since: 2.6
g_uri_parse
Parses uri_string according to flags. If the result is not a valid absolute URI, it will be discarded, and an error returned.
since: 2.66
g_uri_parse_params
Many URI schemes include one or more attribute/value pairs as part of the URI value. This method can be used to parse them into a hash table. When an attribute has multiple occurrences, the last value is the final returned value. If you need to handle repeated attributes differently, use GUriParamsIter.
since: 2.66
g_uri_parse_scheme
Gets the scheme portion of a URI string. RFC 3986 defines the URI syntax as:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
Common schemes include file, https, svn+ssh, etc.
since: 2.16
g_uri_peek_scheme
Gets the scheme portion of a URI string. RFC 3986 defines the URI syntax as:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
Common schemes include file, https, svn+ssh, etc.
since: 2.66
g_uri_resolve_relative
Parses uri_ref according to flags and, if it is a relative URI, resolves it relative to base_uri_string. If the result is not a valid absolute URI, it will be discarded, and an error returned.
since: 2.66
g_uri_split
Parses uri_ref (which can be an absolute or relative URI) according to flags, and returns the pieces. Any component that doesn’t appear in uri_ref will be returned as NULL (but note that all URIs always have a path component, though it may be the empty string).
since: 2.66
g_uri_split_network
Parses uri_string (which must be an absolute URI) according to flags, and returns the pieces relevant to connecting to a host. See the documentation for g_uri_split() for more details; this is mostly a wrapper around that function with simpler arguments. However, it will return an error if uri_string is a relative URI, or does not contain a hostname component.
since: 2.66
g_uri_split_with_user
Parses uri_ref (which can be an absolute or relative URI) according to flags, and returns the pieces. Any component that doesn’t appear in uri_ref will be returned as NULL (but note that all URIs always have a path component, though it may be the empty string).
since: 2.66
Instance methods
g_uri_get_auth_params
Gets uri’s authentication parameters, which may contain %-encoding, depending on the flags with which uri was created. (If uri was not created with G_URI_FLAGS_HAS_AUTH_PARAMS then this will be NULL.)
since: 2.66
g_uri_get_fragment
Gets uri’s fragment, which may contain %-encoding, depending on the flags with which uri was created.
since: 2.66
g_uri_get_host
Gets uri’s host. This will never have %-encoded characters, unless it is non-UTF-8 (which can only be the case if uri was created with G_URI_FLAGS_NON_DNS).
since: 2.66
g_uri_get_password
Gets uri’s password, which may contain %-encoding, depending on the flags with which uri was created. (If uri was not created with G_URI_FLAGS_HAS_PASSWORD then this will be NULL.)
since: 2.66
g_uri_get_path
Gets uri’s path, which may contain %-encoding, depending on the flags with which uri was created.
since: 2.66
g_uri_get_query
Gets uri’s query, which may contain %-encoding, depending on the flags with which uri was created.
since: 2.66
g_uri_get_scheme
Gets uri’s scheme. Note that this will always be all-lowercase, regardless of the string or strings that uri was created from.
since: 2.66
g_uri_get_user
Gets the ‘username’ component of uri’s userinfo, which may contain %-encoding, depending on the flags with which uri was created. If uri was not created with G_URI_FLAGS_HAS_PASSWORD or G_URI_FLAGS_HAS_AUTH_PARAMS, this is the same as g_uri_get_userinfo().
since: 2.66
g_uri_get_userinfo
Gets uri’s userinfo, which may contain %-encoding, depending on the flags with which uri was created.
since: 2.66
g_uri_parse_relative
Parses uri_ref according to flags and, if it is a relative URI, resolves it relative to base_uri. If the result is not a valid absolute URI, it will be discarded, and an error returned.
since: 2.66
g_uri_to_string_partial
Returns a string representing uri, subject to the options in flags. See g_uri_to_string() and GUriHideFlags for more details.
since: 2.66