aaronbdavis
03-20-2006, 06:20 PM
I am having some trouble with named Regex groups.
I looked for this on Google.com and in the forums here and could not find an answer:
My Question: How do you negate a named group?
The specific instance I am working on is as follows:
I am trying to collect HTML attributes from a string of an HTML element.
The current regex I am using works, but is not as nuanced as I would like:
\s*(\w+)\s*=\s*(?P<quote>['"])([^'"]*)(?P=quote)
This will look for a single or double quote, followed by something that is not a quote symbol and ended with the same symbol that began the match.
E.g. it will match foo="bar" and bar='foo' but will not match name="Bob 'Foobar' Johnson".
Therefore, what I want is to match text enclosed in either single or double quotes that does not contain the quote character enclosing it. Logic says that I should negate the named group quote, but this doesn't seem to work in the obvious way; when I rewrite the regex as \s*(\w+)\s*=\s*(?P<quote>['"])([^(?P=quote)])(?P=quote) the string doesn't match at all.
Is there a Regex Guru here who knows how to solve my problem?
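One possible approach, sketched here assuming Python's re module (the (?P<quote>...) syntax suggests it): a backreference has no special meaning inside a character class, so [^(?P=quote)] just excludes the literal characters (, ?, P, =, q, u, o, t, e and ). A negative lookahead placed before each character can do the "not the opening quote" test instead. The group names name and value and the helper parse_attributes are made up for this illustration.

```python
import re

# Sketch only: [^(?P=quote)] is read as a set of literal characters, so the
# "anything except the opening quote" test is written as a negative lookahead
# (?!(?P=quote)) that is checked before each consumed character.
ATTR_RE = re.compile(
    r'''\s*(?P<name>\w+)\s*=\s*(?P<quote>['"])(?P<value>(?:(?!(?P=quote)).)*)(?P=quote)''',
    re.DOTALL,  # let '.' also match newlines inside a value
)


def parse_attributes(element):
    """Return a dict of attribute name -> value found in an HTML element string.

    Hypothetical helper for illustration; it is not a full HTML parser.
    """
    return {m.group("name"): m.group("value") for m in ATTR_RE.finditer(element)}


attrs = parse_attributes('<a name="Bob \'Foobar\' Johnson" foo="bar" bar=\'foo\'>')
print(attrs)
```

With this pattern a double-quoted value may contain single quotes and vice versa, because only the quote character that opened the value ends it. It does not handle unquoted values or escaped quotes.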