Regex named groups


aaronbdavis
03-20-2006, 06:20 PM
I am having some trouble with named Regex groups.
I looked for this on Google.com and in the forums here and could not find an answer:
My Question: How do you negate a named group?

The specific instance I am working on is as follows:
I am trying to collect HTML attributes from a string of an HTML element.
The current regex I am using works, but is not as nuanced as I would like:
\s*(\w+)\s*=\s*(?P<quote>['"])([^'"]*)(?P=quote)

This will look for a single or double quote, followed by something that is not a quote symbol and ending with the same symbol that began the match.
E.g., it will match foo="bar" and bar='foo' but will not match name="Bob 'Foobar' Johnson".
Therefore, what I want is to look for text enclosed in either single or double quotes that does not contain the quote character enclosing it. Logic says that I should negate the named group quote, but this doesn't seem to work in the obvious way: when I rewrite the regex as \s*(\w+)\s*=\s*(?P<quote>['"])([^(?P=quote)])(?P=quote) the string doesn't match at all.

Is there a Regex Guru here who knows how to solve my problem?

NogDog
03-20-2006, 09:54 PM
Best I can come up with is a negative lookahead on the named group, since a backreference isn't recognized inside a character class:

$regexp = <<<EOD
/\s*(\w+)\s*=\s*(?P<quote>[\'"])((?:(?!(?P=quote)).)*)(?P=quote)/isU
EOD;
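
For what it's worth, here is a rough sketch of how a lookahead-based pattern could be run against a whole tag with preg_match_all. The sample tag string and the $attributes array are made up for illustration, and the sketch assumes attribute values never contain their own enclosing quote character.

```php
<?php
// Sketch only: the tag below is invented sample input, not from the thread.
// (?!(?P=quote)) refuses to consume the quote character that opened the
// value, so the other kind of quote is allowed inside it.
$regexp = '/\s*(\w+)\s*=\s*(?P<quote>[\'"])((?:(?!(?P=quote)).)*)(?P=quote)/is';

$tag = '<a href="page.html" name="Bob \'Foobar\' Johnson" title=\'He said "hi"\'>';

$attributes = array();
if (preg_match_all($regexp, $tag, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        // $m[1] is the attribute name, $m[3] the text between the quotes.
        $attributes[$m[1]] = $m[3];
    }
}

print_r($attributes);
```

On the sample tag this should collect three entries: href (page.html), name (Bob 'Foobar' Johnson) and title (He said "hi"). The value is read from numeric index 3 because the named group quote also takes up index 2.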

aaronbdavis
03-21-2006, 07:25 AM
hmm... ok, thanks for trying. Gonna have to find a different way to go about it.