Need regex help
aaronbdavis
05-04-2006, 07:30 AM
I am trying to write a regex that recognizes a block of text as being composed of plain text and HTML comments. The regex I have so far is as follows:
$pattern = "/(?P<text>.*?)(<!--(?P<comment>.*?)-->)?/sm";
What I want is to get the first chunk of text into the back-reference <text> and the text of the comment into the back-reference <comment>. I made <text> lazy, because otherwise it tries to take the comment as well, and I made <comment> optional because it will not always be there. The problem is that, with this combination, it now captures nothing.
I cannot figure this out. Can anyone help?
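A likely explanation for the empty result, sketched below: a lazy `.*?` followed only by an optional group is free to match zero characters, so the whole pattern succeeds with an empty match at the start of the string. The second pattern is one possible fix, not necessarily what the poster ended up using; it assumes each text chunk should run up to the next comment or, failing that, to the end of the input (`\z`). The variable names `$original`, `$fixed`, and `$subject` are made up for the illustration.

```php
<?php
$subject = "this is some text <!-- and this is a comment-->";

// Pattern from the post, unchanged.
$original = "/(?P<text>.*?)(<!--(?P<comment>.*?)-->)?/sm";
preg_match($original, $subject, $m);

// The lazy .*? takes zero characters, the optional group cannot match
// at offset 0 and is skipped, so the overall match is empty.
var_dump($m[0]);      // string(0) ""
var_dump($m['text']); // string(0) ""

// Possible fix (an assumption about the intent): force the lazy text
// group to extend until a comment is found or the input ends.
$fixed = "/(?P<text>.*?)(?:<!--(?P<comment>.*?)-->|\\z)/s";
preg_match($fixed, $subject, $f);

echo trim($f['text']), "\n";    // this is some text
echo trim($f['comment']), "\n"; // and this is a comment
```

Requiring something concrete after the lazy group is what stops it from settling for nothing. When the last chunk has no comment, the `\z` branch matches and the `comment` key is simply absent from the result.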
bokeh
05-04-2006, 08:02 AM
Can you provide a block of text and highlight the parts you are trying to capture?
aaronbdavis
05-04-2006, 09:01 AM
this is some text <!-- and this is a comment-->
and this is some more text <!-- and this is another comment -->
I want to capture as follows:
[text] => this is some text
[comment] => and this is a comment
// this would also be fine
[comment] => <!-- and this is a comment -->
I plan on using this in a while (preg_match(... loop, cutting away what I found previously in order to build an array of text and comment nodes, i.e.
Array
{
[1_text] => this is some text
[2_comment] => and this is a comment
[3_text] => and this is some more text
[4_comment] => and this is another comment
}
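For reference, an array of that shape can also be built without a loop over `preg_match`. The sketch below uses `preg_split` with the `PREG_SPLIT_DELIM_CAPTURE` flag, which is a swapped-in technique rather than the approach described in the post. The names `$input`, `$parts`, and `$nodes` are illustrative, and dropping empty chunks is an assumption about what is wanted.

```php
<?php
$input = "this is some text <!-- and this is a comment-->\n"
       . "and this is some more text <!-- and this is another comment -->";

// Split on comments; the capture group keeps each comment body in the
// result, so text and comment pieces alternate in $parts.
$parts = preg_split('/<!--(.*?)-->/s', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

$nodes = [];
$n = 0;
foreach ($parts as $i => $part) {
    $part = trim($part);
    if ($part === '') {
        continue; // assumption: empty chunks are not wanted as nodes
    }
    // Even indexes hold text, odd indexes hold captured comment bodies.
    $type = ($i % 2 === 0) ? 'text' : 'comment';
    $nodes[++$n . '_' . $type] = $part;
}

print_r($nodes);
```

Because the split keeps the pieces in document order, the text and comment nodes stay interleaved exactly as they appear in the input.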
NogDog
05-04-2006, 11:06 AM
Don't know if this would do the job for you:
<?php
header("Content-type: text/plain");
$text = <<<EOD
This is a test. <!-- This is a comment -->
This is only a test. <!-- This is another comment -->
<!-- This is a comment on its own line -->
This has been a test.
The end.
EOD;
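// Capture the body of every comment. The /U modifier makes all
// quantifiers lazy, so each match stops at the first "-->".
preg_match_all('/<!--\s*(.+)\s*-->/U', $text, $matches);
// Replace each comment with a tab, then split the remaining text on tabs.
// This assumes the text itself contains no tab characters.
$nonComment = preg_replace('/<!--\s*(.+)\s*-->/U', "\t", $text);
$data['text'] = explode("\t", $nonComment);
foreach($data['text'] as $key => $val)
{
    $data['text'][$key] = trim($val);
}
// Under /U the leading \s* is lazy too, so a capture can keep its
// leading whitespace; trim the comment bodies as well.
$data['comment'] = array_map('trim', $matches[1]);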
print_r($data);
?>
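One caveat worth checking before relying on the pattern above: without the `s` modifier, `.` does not match a newline, so a comment that spans several lines is not found. The sketch below compares the pattern as posted with the same pattern plus `s`. Whether multi-line comments occur in the real input is an assumption, and `$sample` is a made-up test string.

```php
<?php
$sample = "Before. <!-- a comment\nspanning two lines --> After.";

// Pattern as posted: "." stops at the newline, so nothing matches.
$found = preg_match_all('/<!--\s*(.+)\s*-->/U', $sample, $without);

// Same pattern with the s modifier, so "." also matches newlines.
$foundS = preg_match_all('/<!--\s*(.+)\s*-->/Us', $sample, $with);

var_dump($found);  // int(0)
var_dump($foundS); // int(1)
echo trim($with[1][0]), "\n";
```

If the input only ever contains single-line comments, the posted pattern is sufficient as it stands.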
aaronbdavis
05-04-2006, 12:30 PM
I think I can make that work. Thanks NogDog.