Method
GLib.MatchInfo.fetch_pos
since: 2.14
Declaration
gboolean
g_match_info_fetch_pos (
  const GMatchInfo* match_info,
  gint match_num,
  gint* start_pos,
  gint* end_pos
)
Description
Returns the start and end positions (in bytes) of a successfully matching capture parenthesis. Valid values for match_num are 0 for the full text of the match, 1 for the first paren set, 2 for the second, and so on.
As end_pos is set to the byte after the final byte of the match (on success), the length of the match can be calculated as end_pos - start_pos.
As a best practice, initialize start_pos and end_pos to identifiable values, such as G_MAXINT, so that you can test if g_match_info_fetch_pos() actually changed the value for a given capture parenthesis.
The parameter match_num corresponds to a matched capture parenthesis. The actual value you use for match_num depends on the method used to generate match_info. The following sections describe those methods.
Methods Using Non-deterministic Finite Automata Matching
The methods g_regex_match() and g_regex_match_full() return a GMatchInfo using traditional (greedy) pattern matching, also known as Non-deterministic Finite Automaton (NFA) matching. You pass the returned GMatchInfo from these methods to g_match_info_fetch_pos() to determine the start and end positions of capture parentheses. The values for match_num correspond to the capture parentheses in order, with 0 corresponding to the entire matched string.
match_num can refer to a capture parenthesis with no match. For example, the string b matches against the pattern (a)?b, but the capture parenthesis (a) has no match. In this case, g_match_info_fetch_pos() returns true and sets start_pos and end_pos to -1 when called with match_num as 1 (for (a)).
For an expanded example, a regex pattern is (a)?(.*?)the (.*), and a candidate string is glib regexes are the best. In this scenario there are four capture parentheses numbered 0–3: an implicit one for the entire string, and three explicitly declared in the regex pattern.
Given this example, the following table describes the return values from g_match_info_fetch_pos() for various values of match_num.
match_num | Contents | Return value | Returned start_pos | Returned end_pos |
---|---|---|---|---|
0 | Matches entire string | True | 0 | 25 |
1 | Does not match first character | True | -1 | -1 |
2 | All text before the | True | 0 | 17 |
3 | All text after the | True | 21 | 25 |
4 | Capture paren out of range | False | Unchanged | Unchanged |
The following code sample implements this example and is followed by its output.
#include <glib.h>
int
main (int argc, char *argv[])
{
  g_autoptr(GError) local_error = NULL;
  const char *regex_pattern = "(a)?(.*?)the (.*)";
  const char *test_string = "glib regexes are the best";
  g_autoptr(GRegex) regex = NULL;
  regex = g_regex_new (regex_pattern,
                       G_REGEX_DEFAULT,
                       G_REGEX_MATCH_DEFAULT,
                       &local_error);
  if (regex == NULL)
    {
      g_printerr ("Error creating regex: %s\n", local_error->message);
      return 1;
    }
  g_autoptr(GMatchInfo) match_info = NULL;
  g_regex_match (regex, test_string, G_REGEX_MATCH_DEFAULT, &match_info);
  int n_matched_strings = g_match_info_get_match_count (match_info);
  // Print header line
  g_print ("match_num Contents                  Return value returned start_pos returned end_pos\n");
  // Iterate over each capture paren, including one that is out of range as a demonstration.
  for (int match_num = 0; match_num <= n_matched_strings; match_num++)
    {
      gboolean found_match;
      g_autofree char *paren_string = NULL;
      int start_pos = G_MAXINT;
      int end_pos = G_MAXINT;
      found_match = g_match_info_fetch_pos (match_info,
                                            match_num,
                                            &start_pos,
                                            &end_pos);
      // If no match, display N/A as the found string.
      if (start_pos == G_MAXINT || start_pos == -1)
        paren_string = g_strdup ("N/A");
      else
        paren_string = g_strndup (test_string + start_pos, end_pos - start_pos);
      g_print ("%-9d %-25s %-12d %-18d %d\n", match_num, paren_string, found_match, start_pos, end_pos);
    }
  return 0;
}
match_num Contents                  Return value returned start_pos returned end_pos
0         glib regexes are the best 1            0                  25
1         N/A                       1            -1                 -1
2         glib regexes are          1            0                  17
3         best                      1            21                 25
4         N/A                       0            2147483647         2147483647
Methods Using Deterministic Finite Automata Matching
The methods g_regex_match_all() and g_regex_match_all_full() return a GMatchInfo using Deterministic Finite Automaton (DFA) pattern matching. This algorithm detects overlapping matches. You pass the returned GMatchInfo from these methods to g_match_info_fetch_pos() to determine the start and end positions of each overlapping match. Use the method g_match_info_get_match_count() to determine the number of overlapping matches.
For example, a regex pattern is <.*>, and a candidate string is <a> <b> <c>. In this scenario there are three implicit capture parentheses: one for the entire string, one for <a> <b>, and one for <a>.
Given this example, the following table describes the return values from g_match_info_fetch_pos() for various values of match_num.
match_num | Contents | Return value | Returned start_pos | Returned end_pos |
---|---|---|---|---|
0 | Matches entire string | True | 0 | 11 |
1 | Matches <a> <b> | True | 0 | 7 |
2 | Matches <a> | True | 0 | 3 |
3 | Capture paren out of range | False | Unchanged | Unchanged |
The following code sample implements this example and is followed by its output.
#include <glib.h>
int
main (int argc, char *argv[])
{
  g_autoptr(GError) local_error = NULL;
  const char *regex_pattern = "<.*>";
  const char *test_string = "<a> <b> <c>";
  g_autoptr(GRegex) regex = NULL;
  regex = g_regex_new (regex_pattern,
                       G_REGEX_DEFAULT,
                       G_REGEX_MATCH_DEFAULT,
                       &local_error);
  if (regex == NULL)
    {
      g_printerr ("Error creating regex: %s\n", local_error->message);
      return 1;
    }
  g_autoptr(GMatchInfo) match_info = NULL;
  g_regex_match_all (regex, test_string, G_REGEX_MATCH_DEFAULT, &match_info);
  int n_matched_strings = g_match_info_get_match_count (match_info);
  // Print header line
  g_print ("match_num Contents                  Return value returned start_pos returned end_pos\n");
  // Iterate over each overlapping match, including one that is out of range as a demonstration.
  for (int match_num = 0; match_num <= n_matched_strings; match_num++)
    {
      gboolean found_match;
      g_autofree char *paren_string = NULL;
      int start_pos = G_MAXINT;
      int end_pos = G_MAXINT;
      found_match = g_match_info_fetch_pos (match_info, match_num, &start_pos, &end_pos);
      // If no match, display N/A as the found string.
      if (start_pos == G_MAXINT || start_pos == -1)
        paren_string = g_strdup ("N/A");
      else
        paren_string = g_strndup (test_string + start_pos, end_pos - start_pos);
      g_print ("%-9d %-25s %-12d %-18d %d\n", match_num, paren_string, found_match, start_pos, end_pos);
    }
  return 0;
}
match_num Contents                  Return value returned start_pos returned end_pos
0         <a> <b> <c>               1            0                  11
1         <a> <b>                   1            0                  7
2         <a>                       1            0                  3
3         N/A                       0            2147483647         2147483647
Available since: 2.14
Parameters
match_num
Type: gint
Number of the capture parenthesis.
start_pos
Type: gint*
Pointer to location where to store the start position, or NULL.
The argument will be set by the function.
The argument can be NULL.
end_pos
Type: gint*
Pointer to location where to store the end position (the byte after the final byte of the match), or NULL.
The argument will be set by the function.
The argument can be NULL.
Return value
Type: gboolean
True if match_num is within range, false otherwise. If the capture paren has a match, start_pos and end_pos contain the start and end positions (in bytes) of the matching substring. If the capture paren has no match, start_pos and end_pos are -1. If match_num is out of range, start_pos and end_pos are left unchanged.