Struct

GLibUri

since: 2.66

Description

struct GUri {
  /* No available fields */
}

The GUri type and related functions can be used to parse URIs into their components, and build valid URIs from individual components.

Since GUri only represents absolute URIs, all GUris will have a URI scheme, so g_uri_get_scheme() will always return a non-NULL answer. Likewise, by definition, all URIs have a path component, so g_uri_get_path() will always return a non-NULL string (which may be empty).

If the URI string has an ‘authority’ component (that is, if the scheme is followed by :// rather than just :), then the GUri will contain a hostname, and possibly a port and ‘userinfo’. Additionally, depending on how the GUri was constructed/parsed (for example, using the G_URI_FLAGS_HAS_PASSWORD and G_URI_FLAGS_HAS_AUTH_PARAMS flags), the userinfo may be split out into a username, password, and additional authorization-related parameters.

Normally, the components of a GUri will have all %-encoded characters decoded. However, if you construct/parse a GUri with G_URI_FLAGS_ENCODED, then the %-encoding will be preserved instead in the userinfo, path, and query fields (and in the host field if also created with G_URI_FLAGS_NON_DNS). In particular, this is necessary if the URI may contain binary data or non-UTF-8 text, or if decoding the components might change the interpretation of the URI.

For example, with the encoded flag:

g_autoptr(GError) err = NULL;
g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_ENCODED, &err);
g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue");

While the default %-decoding behaviour would give:

g_autoptr(GError) err = NULL;
g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fparam%3Dvalue", G_URI_FLAGS_NONE, &err);
g_assert_cmpstr (g_uri_get_query (uri), ==, "query=http://host/path?param=value");

During decoding, if an invalid UTF-8 string is encountered, parsing will fail with an error indicating the bad string location:

g_autoptr(GError) err = NULL;
g_autoptr(GUri) uri = g_uri_parse ("http://host/path?query=http%3A%2F%2Fhost%2Fpath%3Fbad%3D%00alue", G_URI_FLAGS_NONE, &err);
g_assert_error (err, G_URI_ERROR, G_URI_ERROR_BAD_QUERY);

You should pass G_URI_FLAGS_ENCODED or G_URI_FLAGS_ENCODED_QUERY if you need to handle that case manually. In particular, if the query string contains = characters that are %-encoded, you should keep the query encoded and let g_uri_parse_params() decode it, so that it is decoded exactly once.

GUri is immutable once constructed, and can safely be accessed from multiple threads. Its reference counting is atomic.

Note that the scope of GUri is to help manipulate URIs in various applications, following RFC 3986. In particular, it doesn’t intend to cover web browser needs, and doesn’t implement the WHATWG URL standard. No APIs are provided to help prevent homograph attacks, so GUri is not suitable for formatting URIs for display to the user for making security-sensitive decisions.

Relative and absolute URIs

As defined in RFC 3986, the hierarchical nature of URIs means that they can either be ‘relative references’ (sometimes referred to as ‘relative URIs’) or ‘URIs’ (for clarity, ‘URIs’ are referred to in this documentation as ‘absolute URIs’ — although in contrast to RFC 3986, fragment identifiers are always allowed).

Relative references have one or more components of the URI missing. In particular, they have no scheme. Any other component, such as hostname, query, etc. may be missing, apart from a path, which has to be specified (but may be empty). The path may be relative, starting with ./ rather than /.

For example, valid relative references include ./path?query, /?query#fragment and //example.com.

Absolute URIs have a scheme specified. Any other components of the URI which are missing are specified as explicitly unset in the URI, rather than being resolved relative to a base URI using g_uri_parse_relative().

For example, valid absolute URIs include file:///home/bob and https://search.com?query=string.

A GUri instance is always an absolute URI. A string may be an absolute URI or a relative reference; see the documentation for individual functions as to what forms they accept.

Parsing URIs

The most minimalist APIs for parsing URIs are g_uri_split() and g_uri_split_with_user(). These split a URI into its component parts, and return the parts; the difference between the two is that g_uri_split() treats the ‘userinfo’ component of the URI as a single element, while g_uri_split_with_user() can (depending on the GUriFlags you pass) treat it as containing a username, password, and authentication parameters. Alternatively, g_uri_split_network() can be used when you are only interested in the components that are needed to initiate a network connection to the service (scheme, host, and port).

g_uri_parse() is similar to g_uri_split(), but instead of returning individual strings, it returns a GUri structure (and it requires that the URI be an absolute URI).

g_uri_resolve_relative() and g_uri_parse_relative() allow you to resolve a relative URI relative to a base URI. g_uri_resolve_relative() takes two strings and returns a string, and g_uri_parse_relative() takes a GUri and a string and returns a GUri.

All of the parsing functions take a GUriFlags argument describing exactly how to parse the URI; see the documentation for that type for more details on the specific flags that you can pass. If you need to choose different flags based on the type of URI, you can use g_uri_peek_scheme() on the URI string to check the scheme first, and use that to decide what flags to parse it with.

For example, you might want to use G_URI_PARAMS_WWW_FORM when parsing the params for a web URI, so compare the result of g_uri_peek_scheme() against http and https.

Building URIs

g_uri_join() and g_uri_join_with_user() can be used to construct valid URI strings from a set of component strings. They are the inverse of g_uri_split() and g_uri_split_with_user().

Similarly, g_uri_build() and g_uri_build_with_user() can be used to construct a GUri from a set of component strings.

As with the parsing functions, the building functions take a GUriFlags argument. In particular, it is important to keep in mind whether the URI components you are using are already %-encoded. If so, you must pass the G_URI_FLAGS_ENCODED flag.

file:// URIs

Note that Windows and Unix both define special rules for parsing file:// URIs (involving non-UTF-8 character sets on Unix, and the interpretation of path separators on Windows). GUri does not implement these rules. Use g_filename_from_uri() and g_filename_to_uri() if you want to properly convert between file:// URIs and local filenames.

URI Equality

Note that there is no g_uri_equal() function, because comparing URIs usefully requires scheme-specific knowledge that GUri does not have. GUri can help with normalization if you use the various encoded GUriFlags as well as G_URI_FLAGS_SCHEME_NORMALIZE; however, it is not comprehensive. For example, data:,foo and data:;base64,Zm9v resolve to the same thing according to the data: URI specification, which GLib does not handle.

Available since: 2.66

Functions

g_uri_build

Creates a new GUri from the given components according to flags.

since: 2.66

g_uri_build_with_user

Creates a new GUri from the given components according to flags (G_URI_FLAGS_HAS_PASSWORD is added unconditionally). The flags must be coherent with the passed values, in particular use %-encoded values with G_URI_FLAGS_ENCODED.

since: 2.66

g_uri_error_quark

No description available.

g_uri_escape_bytes

Escapes arbitrary data for use in a URI.

since: 2.66

g_uri_escape_string

Escapes a string for use in a URI.

since: 2.16

g_uri_is_valid

Parses uri_string according to flags, to determine whether it is a valid absolute URI, i.e. it does not need to be resolved relative to another URI using g_uri_parse_relative().

since: 2.66

g_uri_join

Joins the given components together according to flags to create an absolute URI string. path may not be NULL (though it may be the empty string).

since: 2.66

g_uri_join_with_user

Joins the given components together according to flags to create an absolute URI string. path may not be NULL (though it may be the empty string).

since: 2.66

g_uri_list_extract_uris

Splits a URI list conforming to the text/uri-list MIME type defined in RFC 2483 into individual URIs, discarding any comments. The URIs are not validated.

since: 2.6

g_uri_parse

Parses uri_string according to flags. If the result is not a valid absolute URI, it will be discarded, and an error returned.

since: 2.66

g_uri_parse_params

Many URI schemes include one or more attribute/value pairs as part of the URI value. This method can be used to parse them into a hash table. When an attribute has multiple occurrences, the last value is the final returned value. If you need to handle repeated attributes differently, use GUriParamsIter.

since: 2.66

g_uri_parse_scheme

Gets the scheme portion of a URI string. RFC 3986 defines the scheme as part of the following syntax:

URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

Common schemes include file, https, svn+ssh, etc.

since: 2.16

g_uri_peek_scheme

Gets the scheme portion of a URI string. RFC 3986 defines the scheme as part of the following syntax:

URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

Common schemes include file, https, svn+ssh, etc.

since: 2.66

g_uri_resolve_relative

Parses uri_ref according to flags and, if it is a relative URI, resolves it relative to base_uri_string. If the result is not a valid absolute URI, it will be discarded, and an error returned.

since: 2.66

g_uri_split

Parses uri_ref (which can be an absolute or relative URI) according to flags, and returns the pieces. Any component that doesn’t appear in uri_ref will be returned as NULL (but note that all URIs always have a path component, though it may be the empty string).

since: 2.66

g_uri_split_network

Parses uri_string (which must be an absolute URI) according to flags, and returns the pieces relevant to connecting to a host. See the documentation for g_uri_split() for more details; this is mostly a wrapper around that function with simpler arguments. However, it will return an error if uri_string is a relative URI, or does not contain a hostname component.

since: 2.66

g_uri_split_with_user

Parses uri_ref (which can be an absolute or relative URI) according to flags, and returns the pieces. Any component that doesn’t appear in uri_ref will be returned as NULL (but note that all URIs always have a path component, though it may be the empty string).

since: 2.66

g_uri_unescape_bytes

Unescapes a segment of an escaped string as binary data.

since: 2.66

g_uri_unescape_segment

Unescapes a segment of an escaped string.

since: 2.16

g_uri_unescape_string

Unescapes a whole escaped string.

since: 2.16

Instance methods

g_uri_get_auth_params

Gets uri’s authentication parameters, which may contain %-encoding, depending on the flags with which uri was created. (If uri was not created with G_URI_FLAGS_HAS_AUTH_PARAMS then this will be NULL.)

since: 2.66

g_uri_get_flags

Gets uri’s flags set upon construction.

since: 2.66

g_uri_get_fragment

Gets uri’s fragment, which may contain %-encoding, depending on the flags with which uri was created.

since: 2.66

g_uri_get_host

Gets uri’s host. This will never have %-encoded characters, unless it is non-UTF-8 (which can only be the case if uri was created with G_URI_FLAGS_NON_DNS).

since: 2.66

g_uri_get_password

Gets uri’s password, which may contain %-encoding, depending on the flags with which uri was created. (If uri was not created with G_URI_FLAGS_HAS_PASSWORD then this will be NULL.)

since: 2.66

g_uri_get_path

Gets uri’s path, which may contain %-encoding, depending on the flags with which uri was created.

since: 2.66

g_uri_get_port

Gets uri’s port.

since: 2.66

g_uri_get_query

Gets uri’s query, which may contain %-encoding, depending on the flags with which uri was created.

since: 2.66

g_uri_get_scheme

Gets uri’s scheme. Note that this will always be all-lowercase, regardless of the string or strings that uri was created from.

since: 2.66

g_uri_get_user

Gets the ‘username’ component of uri’s userinfo, which may contain %-encoding, depending on the flags with which uri was created. If uri was not created with G_URI_FLAGS_HAS_PASSWORD or G_URI_FLAGS_HAS_AUTH_PARAMS, this is the same as g_uri_get_userinfo().

since: 2.66

g_uri_get_userinfo

Gets uri’s userinfo, which may contain %-encoding, depending on the flags with which uri was created.

since: 2.66

g_uri_parse_relative

Parses uri_ref according to flags and, if it is a relative URI, resolves it relative to base_uri. If the result is not a valid absolute URI, it will be discarded, and an error returned.

since: 2.66

g_uri_ref

Increments the reference count of uri by one.

since: 2.66

g_uri_to_string

Returns a string representing uri.

since: 2.66

g_uri_to_string_partial

Returns a string representing uri, subject to the options in flags. See g_uri_to_string() and GUriHideFlags for more details.

since: 2.66

g_uri_unref

Atomically decrements the reference count of uri by one.

since: 2.66