List: php-doc-bugs
Subject: [DOC-BUGS] #46904 [Opn->Bgs]: preg_match() example #4 is wrong
From: felipe () php ! net
Date: 2008-12-23 19:07:31
Message-ID: 200812231907.mBNJ7VAl061212 () y1 ! php ! net
ID: 46904
Updated by: felipe@php.net
Reported By: joe at digg dot com
-Status: Open
+Status: Bogus
Bug Type: Documentation problem
Operating System: Debian GNU/Linux
PHP Version: Irrelevant
New Comment:
The PCRE documentation says:
"In PCRE, a subpattern can be named in one of three ways: (?<name>...)
or (?'name'...) as in Perl, or (?P<name>...) as in Python."
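To illustrate the quoted passage, here is a minimal sketch that runs the example string from the report through all three spellings. It assumes a PHP build whose bundled PCRE accepts the Perl-style forms (PHP 5.2.2 or later, as far as I know); the array keys perl_angle, perl_quote and python are labels made up for this example.

```php
<?php
// Sketch only: assumes a PHP build whose PCRE accepts all three spellings.
$str = 'foobar: 2008';

// The labels on the left are illustrative names, not part of any API.
$patterns = array(
    'perl_angle' => '/(?<name>\w+): (?<digit>\d+)/',
    'perl_quote' => "/(?'name'\\w+): (?'digit'\\d+)/",
    'python'     => '/(?P<name>\w+): (?P<digit>\d+)/',
);

$results = array();
foreach ($patterns as $label => $pattern) {
    preg_match($pattern, $str, $matches);
    $results[$label] = $matches;
}

// Every spelling fills $matches with both the named and the numeric keys,
// so the three result arrays can be compared directly.
var_dump($results['perl_angle'] === $results['python']);
var_dump($results['perl_quote'] === $results['python']);
print_r($results['python']);
```

If the three arrays compare as identical, the choice between the spellings is purely a matter of style for this pattern.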
Previous Comments:
------------------------------------------------------------------------
[2008-12-19 10:16:59] rquadling@php.net
According to the help for RegexBuddy ...
(?P<name>group) came from Python.
PCRE followed Python's lead.
PHP offers the same functionality.
So, initially you look correct.
But, again from the RegexBuddy help ...
"The regular expression classes of the .NET framework also support
named capture. Unfortunately, the Microsoft developers decided to
invent their own syntax, rather than follow the one pioneered by
Python. Currently, no other regex flavor supports Microsoft's version
of named capture.
Here is an example with two capturing groups in .NET style:
(?<first>group)(?'second'group). As you can see, .NET offers two
syntaxes to create a capturing group: one using sharp brackets, and
the other using single quotes. The first syntax is preferable in
strings, where single quotes may need to be escaped. The second
syntax is preferable in ASP code, where the sharp brackets are used
for HTML tags. You can use the pointy bracket flavor and the quoted
flavors interchangeably.
To reference a capturing group inside the regex, use \k<name> or
\k'name'. Again, you can use the two syntactic variations
interchangeably."
This info is also available at
http://www.regular-expressions.info/named.html
So, it seems PHP actually supports PCRE/Python's and Microsoft's
mechanisms.
Ideally we should be reflecting the PCRE route but have a note that
other mechanisms are supported.
Finally on this (from http://perldoc.perl.org/perlre.html - scroll
down to "Capture Buffers").
"Additionally, as of Perl 5.10.0 you may use named capture buffers
and named backreferences. The notation is (?<name>...) to declare and
\k<name> to reference. You may also use apostrophes instead of angle
brackets to delimit the name; and you may use the bracketed \g{name}
backreference syntax. It's possible to refer to a named capture
buffer by absolute and relative number as well. Outside the pattern,
a named capture buffer is available via the %+ hash. When different
buffers within the same pattern have the same name, $+{name} and
\k<name> refer to the leftmost defined group. (Thus it's possible to
do things with named capture buffers that would otherwise require
(??{}) code to accomplish.)"
So, there is a differentiation between named captures and named
backreferences.
(?<name>regex) is a named capture. You cannot use the name of the
capture within the regex or the replace (if search/replacing).
So, technically and being ever so slightly picky, the documentation
is correct.
But really it is incomplete. I'll try and put some more examples in
differentiating between named captures and named backreferences.
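
As a rough sketch of that distinction: the group is declared once with (?<word>...) and then referenced by name later in the same pattern. The sample string and the labels k_angle, k_quote, g_brace and python are invented for illustration, and the sketch assumes a current PHP build whose PCRE accepts each of these backreference spellings.

```php
<?php
// Sketch only: assumes a PCRE version that accepts every spelling below.
$str = 'this is is a test';

// (?<word>...) declares the named capture; the second half of each
// pattern is a named backreference to whatever that group matched.
$backrefs = array(
    'k_angle' => '/\b(?<word>\w+) \k<word>\b/',
    'k_quote' => "/\\b(?<word>\\w+) \\k'word'\\b/",
    'g_brace' => '/\b(?<word>\w+) \g{word}\b/',
    'python'  => '/\b(?P<word>\w+) (?P=word)\b/',
);

$found = array();
foreach ($backrefs as $label => $pattern) {
    if (preg_match($pattern, $str, $m)) {
        // Outside the pattern, the capture is read from $matches by name.
        $found[$label] = $m['word'];
        echo $label, ': ', $m['word'], "\n";
    }
}
```

Each pattern looks for a word immediately repeated after a single space, so in the sample string the doubled word is what ends up in $m['word'].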
------------------------------------------------------------------------
[2008-12-19 10:05:34] rquadling@php.net
I'm not so sure.
Using RegexBuddy to explain the different regexes ...
There seems to be no difference between the two forms.
(?<name>\w+): (?<digit>\d+)
Options: case insensitive; ^ and $ match at line breaks
Match the regular expression below and capture its match into
backreference with name "name" «(?<name>\w+) »
Match a single character that is a "word character" (letters,
digits, etc.) «\w+ »
Between one and unlimited times, as many times as possible,
giving back as needed (greedy) «+ »
Match the characters ": " literally «: »
Match the regular expression below and capture its match into
backreference with name "digit" «(?<digit>\d+) »
Match a single digit 0..9 «\d+ »
Between one and unlimited times, as many times as possible,
giving back as needed (greedy) «+ »
(?P<name>\w+): (?P<digit>\d+)
Options: case insensitive; ^ and $ match at line breaks
Match the regular expression below and capture its match into
backreference with name "name" «(?P<name>\w+) »
Match a single character that is a "word character" (letters,
digits, etc.) «\w+ »
Between one and unlimited times, as many times as possible,
giving back as needed (greedy) «+ »
Match the characters ": " literally «: »
Match the regular expression below and capture its match into
backreference with name "digit" «(?P<digit>\d+) »
Match a single digit 0..9 «\d+ »
Between one and unlimited times, as many times as possible,
giving back as needed (greedy) «+ »
------------------------------------------------------------------------
[2008-12-18 20:21:24] tobias382 at gmail dot com
Patch for /phpdoc/en/reference/pcre/functions/preg-match.xml:
278c278
< preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);
---
> preg_match('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);
------------------------------------------------------------------------
[2008-12-18 20:08:21] joe at digg dot com
Description:
------------
On http://us.php.net/preg_match example #4 (Using named subpattern) is
wrong. It shows:
<?php
$str = 'foobar: 2008';
preg_match('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);
print_r($matches);
?>
The proper syntax for named expressions is (?P<foo>).
Expected result:
----------------
<?php
$str = 'foobar: 2008';
preg_match('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);
print_r($matches);
?>
------------------------------------------------------------------------