www.webdeveloper.com
Thread: reversing matching?

  1. #1
    Join Date
    Jul 2006
    Location
    Vermont
    Posts
    36

    reversing matching?

    I'm not at all fluent in regular expressions, so I'm looking for a little guidance.
    The following seems to match all HTML tags in a block of content:

    (<[^<>]+>)


    However, what I need to do is just the opposite: I need to target everything other than the HTML tags. Is there an easy way to reverse, or negate, the above?

  2. #2
    Join Date
    Dec 2005
    Location
    FL
    Posts
    7,330

  3. #3
    Join Date
    Jul 2006
    Location
    Vermont
    Posts
    36

    thanks, but..

    Thanks for the leads, but what I need is just the opposite. Those examples all target the HTML tags themselves and replace them with an empty string or white space.

    What I need is to target all the rest of the content, everything BUT the tags, as I need to perform other processing on the remaining text before inserting it back between the tags.
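
    One way to picture "everything BUT the tags" is to split the string on the tag pattern from the first post instead of negating it. The following is only a minimal sketch: splitTagsAndText is a hypothetical helper name, and it assumes no tag contains a literal < or > inside an attribute value. Older Internet Explorer versions (8 and earlier) reportedly drop captured groups from split(), so this may need a workaround there.

```javascript
// Hypothetical helper: break an HTML string into alternating text and tag pieces.
function splitTagsAndText(html) {
  // The pattern is wrapped in a capturing group, so split() keeps the matched
  // tags in the result: even indexes hold text, odd indexes hold tags.
  return html.split(/(<[^<>]+>)/);
}

var parts = splitTagsAndText('<p>This <b>is</b> a test</p>');

// Process only the text pieces (here: upper-case them) and leave tags untouched.
var result = parts.map(function (piece, i) {
  return i % 2 === 0 ? piece.toUpperCase() : piece;
}).join('');

console.log(result); // <p>THIS <b>IS</b> A TEST</p>
```

    Joining the pieces back together puts the processed text between the original, unmodified tags.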

  4. #4
    Join Date
    Dec 2005
    Location
    FL
    Posts
    7,330

    Question

    I don't understand ...
    Are you asking to remove the text and display only the tags?

    Can you give an example of what you see originally and what you want to see after the action is performed?

    Using one of the earlier references, here is my example. How does it differ from what you want to do?
    Code:
    <HTML>
    <HEAD>
    <TITLE> Regular Expressions: stripping HTML tags </TITLE>
    <SCRIPT type="text/javascript">
    // For: http://www.webdeveloper.com/forum/showthread.php?p=1115200#post1115200
    function StripTags() {
      var htstring = document.getElementById('TAsrc').value;
      var stripped = htstring.replace(/(<([^>]+)>)/ig,"");
      document.getElementById('TAdes').value = stripped;
    }  
    </SCRIPT>
    </HEAD>
    <BODY>
    <textarea id="TAsrc" rows="9" cols="80">
    <html>
    <head>
    <title>Example</title>
    </head>
    <body>
    <h1>Strip HTML Tags Test</h1>
    <b><i>This is just a test</i></b>
    </body>
    </html>
    </textarea><br>
    <button onclick="StripTags()">Strip Tags</button>
    <button onclick="document.getElementById('TAsrc').value = '';">Clear</button>
    <br>
    <textarea id="TAdes" rows="9" cols="80"></textarea>
    </BODY>
    </HTML>
    Last edited by JMRKER; 09-22-2010 at 10:59 PM. Reason: Added example code

  5. #5
    Join Date
    Jan 2003
    Location
    Texas
    Posts
    10,413
    Hi,

    Looks like you're looking for some kind of
    strip_tags function, perhaps?

  6. #6
    Join Date
    Jul 2006
    Location
    Vermont
    Posts
    36
    Sorry for not making this clearer: no, I'm NOT looking to strip HTML tags. I'm just looking to target the content between them (not inside them), which is why my original post asked how to make the posted regex do the opposite.
    Certainly, stripping tags would make the later processing easier, but then we'd lose all pre-existing formatting.

    The example code below doesn't quite do what I need, because replace() with a string pattern only replaces the first occurrence; it doesn't "Replace All". I need to use a regex to be able to apply its global modifier. However, since sampleInput might contain an HTML tag with searchFor in it, we don't want replace() to affect any pre-existing HTML tags. It should ONLY apply to content between tags rather than inside them.

    Here's what it SHOULD look like (notice the first occurrence of "amp" is ignored since it's inside a tag in the sampleInput string):
    This sample sentence is an example of the many ample ways to amplify content.


    Here's a functional example that might make it clearer to see:
    Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Untitled</title>
    
    <style type="text/css">
    <!--
    .blue{color: blue;}
    -->
    </style>
    
    <script type="text/javascript" language="javascript">
    <!--
    function highlightTextNodes() {
      var sampleInput ='<p>This <span class="blue">sample</span> sentence is an example of the many ample ways to amplify content</p>';
      
      var searchFor = 'amp';
      
      var newHtml = sampleInput.replace(searchFor,'<span style="color:red;font-weight:bold;">'+searchFor+'</span>');
    	
      var el = document.getElementById("demo");
      el.innerHTML = newHtml;
    }
    -->
    </script>
    </head>
    
    <body onload="highlightTextNodes()">
    	<div id="demo">
    	</div>
    </body>
    </html>
    Last edited by stride-r; 09-23-2010 at 09:40 AM.

  7. #7
    Join Date
    Dec 2005
    Location
    FL
    Posts
    7,330

    Almost, but not quite ...

    This changes all, including the first one...
    Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>TextHighlite</title>
    
    <style type="text/css">
      .blue{color: blue;}
    </style>
    
    <script type="text/javascript">
    function highlightTextNodes() {
      var sampleInput ='<p>This <span class="blue">sample</span> sentence is an example of the many ample ways to amplify content</p>';
      
      var searchFor = /amp/ig;
      // "$&" re-inserts the matched text, so the original letter case is preserved
      var newHtml = sampleInput.replace(searchFor,'<span style="color:red;font-weight:bold;">$&</span>');
    
      var el = document.getElementById("demo");
      el.innerHTML = newHtml;
    }
    </script>
    </head>
    
    <body onload="highlightTextNodes()">
    <div id="demo">	</div>
    </body>
    </html>

  8. #8
    Join Date
    Jul 2006
    Location
    Vermont
    Posts
    36
    Thanks - that's sort of the right idea, except that in your example the 'amp' is hard-coded. In real life this would obviously be a variable, and since you're still targeting everything (still not excluding the HTML tags), if "amp" happened to be "p", or "<", or any other character or phrase that could be found INSIDE a <> tag, then everything breaks.

    Try your demo file using "p" instead of "amp" to see what I mean.

    Maybe it can't be done via regex, but I've seen some pretty fancy work using regex, so I suspect it's a matter of getting the right combo to make it work.
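
    For what it's worth, one possible combo is to split the string on the tag pattern and run the replacement only on the pieces that are not tags. This is a rough sketch, not a definitive answer: escapeRegExp and highlightOutsideTags are hypothetical helper names, and it assumes no tag contains a literal < or > inside an attribute value. Note that text inside the <span> element still counts as text between tags, so "sample" gets highlighted too.

```javascript
// Escape regex metacharacters so a variable search term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every occurrence of searchFor that lies outside <...> tags.
function highlightOutsideTags(html, searchFor) {
  var term = new RegExp(escapeRegExp(searchFor), 'gi');
  var open = '<span style="color:red;font-weight:bold;">';
  // The capturing group makes split() keep the tags: odd indexes are tags.
  return html.split(/(<[^<>]+>)/).map(function (piece, i) {
    if (i % 2 === 1) return piece; // a tag: leave it exactly as it was
    // A function replacement avoids "$" in the match being treated specially.
    return piece.replace(term, function (match) {
      return open + match + '</span>';
    });
  }).join('');
}

var sampleInput = '<p>This <span class="blue">sample</span> sentence is an example of the many ample ways to amplify content</p>';
console.log(highlightOutsideTags(sampleInput, 'p'));
```

    With "p" as the search term, the <p> and <span class="blue"> tags come through unchanged and only letters in the visible text are wrapped. A search for "<" would have to be matched against its entity form in the markup, so walking the DOM's text nodes is probably the more robust route for real pages.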
