Thread: Javascript Regular Expression


    I found a regular expression on a website that says it can be used to remove all HTML tags, but I would like to understand what each of the symbols means. I tried searching the web, but I just got more confused. Does anyone know what each symbol means and how the expression finds all HTML tags?

    the expression is:

    variable.replace(/<\/?[^>]+(>|$)/g, "");

    Thank you so much.

    This regular expression matches any substring that starts with <, followed by zero or one occurrence of / (that is what the ? is for; since / is also the delimiter of the regular expression, it has to be escaped with a backslash \). Then come one or more characters that are anything but >. That is what [^>]+ means. The match ends with (>|$), which involves > and $ (the end of the string). The g at the very end means "perform a global match", i.e. find all matches rather than stopping after the first match. I am not quite sure what the parentheses and the pipe symbol are for.

    Quote: Originally posted by blondie_69
    I am not quite sure what the parentheses and the pipe symbol are for.
    The pipe means either/or, so the last part of the pattern matches either > or the end of the string. The parentheses group the two alternatives together.
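To make the either/or part concrete, here is a small sketch that runs the pattern from the first post on two inputs: one with complete tags, and one that is cut off in the middle of a tag, which is the case the $ alternative covers. The helper name stripTags and the sample strings are made up for this illustration.

```javascript
// The pattern from the thread, unchanged.
var TAG_PATTERN = /<\/?[^>]+(>|$)/g;

// stripTags is a hypothetical helper name used only in this sketch.
function stripTags(html) {
  return html.replace(TAG_PATTERN, "");
}

// Every tag is closed with ">", so the first alternative of (>|$) is used.
var complete = stripTags("<p>Hello <b>world</b></p>");

// The last tag is never closed, so the "$" alternative matches up to the end of the string.
var cutOff = stripTags("Hello <b>world</b> <a href='x");

console.log(JSON.stringify(complete)); // "Hello world"
console.log(JSON.stringify(cutOff));   // "Hello world "
```

One side effect worth knowing: because of the $ alternative, a bare < in ordinary text is treated as the start of a tag, so stripTags("1 < 2") returns "1 ".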

    Parentheses also capture whatever they match, so it can be referred to later, for example in a replace:
    var old = "aba";
    var result = old.replace(/a(b)a/, "$1"); // "new" is a reserved word, so use another name
    alert(result); // b

    Thank you for the quick reply!

    Now I understand how it really works, but it leads me to another question. I'm trying to create a button/link called "Email This!" that users can click to email the content of a webpage to themselves or anyone else.

    What I was trying to do is use JavaScript to remove all HTML tags and then copy the contents of the page to a variable, let's say "email_var" for example, like this:

    var HTMLcontent;
    var email_var = HTMLcontent.replace(/<\/?[^>]+(>|$)/g, "");

    But there is no response when I click on the "Email This!" link. I suspect it may have to do with the symbols used on the page, like ?, *, and !. Is that why?

    Does anyone know the answer to this?

    Thank you so much!!

    Well, presuming that you have HTMLcontent defined in the real thing, we need to see the whole code to see what is happening: the relevant JavaScript, the form tag, and the link/button in particular.
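For what it's worth, here is a quick sketch of why that presumption matters. If HTMLcontent is only declared, as in the two lines posted above, its value is undefined and the call to replace throws before anything else can run, which in a browser looks like "no response". The sample HTML string below is made up.

```javascript
var HTMLcontent; // declared but never assigned, so its value is undefined

var errorName = null;
try {
  // Calling a method on undefined throws instead of returning a string.
  HTMLcontent.replace(/<\/?[^>]+(>|$)/g, "");
} catch (e) {
  errorName = e.name;
}
console.log(errorName); // TypeError

// Once the variable holds a string, the same call works.
HTMLcontent = "<p>Some <em>page</em> text</p>"; // made-up sample content
var email_var = HTMLcontent.replace(/<\/?[^>]+(>|$)/g, "");
console.log(email_var); // Some page text
```

The browser's error console would show the same TypeError, so that is a good first place to look.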

    Here is the actual code and it is located in an external javascript file called email-this.js: (I have a div element with id "centre" that contains everything on my page that I want to copy and paste onto the email client's body)

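    var strTagStrippedText;
    var EmailBody;

    function removeHTMLTags() {
        if (document.getElementById && document.getElementById("centre")) {
            var strInputCode = document.getElementById("centre").innerHTML;

            // Replace every tag with an "@" placeholder.
            strTagStrippedText = strInputCode.replace(/<\/?[^>]+(>|$)/g, "@");
            // Drop adjacent pairs of placeholders left by back-to-back tags.
            var EmailBodyTemp1 = strTagStrippedText.replace(/@@/g, "");
            // Remove leftover comment endings.
            var EmailBodyTemp2 = EmailBodyTemp1.replace(/-->/g, "");
            // Turn each remaining placeholder into a URL-encoded line feed.
            var EmailBodyTemp3 = EmailBodyTemp2.replace(/@/g, "%0A");

            // Remove any literal "|*" sequences.
            EmailBody = EmailBodyTemp3.replace(/\|\*/g, "");
        }
    }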


    And here is the HTML file:

    In Head:
    <script type="text/javascript" src="../email-this.js"></script>

    In Body:
    <a href="#" onmouseover="EmailOver();" onmouseout="EmailNormal();" onclick="removeHTMLTags()" >
    <img id="EmailThis" border="0" src="../../images/EmailNormal.png" alt="Email This" /></a>

    The problem is still there... nothing happens when I click on the "Email This" button. I'm wondering whether the template is causing the problem. I used a .dwt template file on this page, and there are a number of <!-- InstanceBeginEditable --> and <!-- InstanceEndEditable --> comments throughout the page.

    Thank you so much once again!
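On the template question, here is a rough sketch of how the pattern treats HTML comments. It assumes the template markers look like ordinary comments. The region name "main" and the sample strings are made up. A comment with no > inside it is swallowed whole, like any other tag, while a comment that contains a > gets cut at that point and leaves its tail, including -->, behind.

```javascript
var TAG_PATTERN = /<\/?[^>]+(>|$)/g;

// Made-up markup imitating template comments around a paragraph.
var templated =
  '<!-- InstanceBeginEditable name="main" --><p>Body text</p><!-- InstanceEndEditable -->';
var strippedTemplate = templated.replace(TAG_PATTERN, "");
console.log(JSON.stringify(strippedTemplate)); // "Body text"

// A comment with a ">" inside: the match stops at the first ">".
var tricky = "<!-- if a > b -->Visible";
var strippedTricky = tricky.replace(TAG_PATTERN, "");
console.log(JSON.stringify(strippedTricky)); // " b -->Visible"
```

So comments of the first kind should not stop the script, and the second kind would only leave stray text in the result, not prevent the click from doing anything.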

    Does anyone know how to fix the problem????

    It would reeeally help 'cause I'm totally stuck here now -__-

    Thank you so much!
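Judging only from the code posted above, one possible reason nothing visible happens is that removeHTMLTags() computes EmailBody but never hands it to the mail client. Below is a minimal sketch of that missing step, not a definitive fix. The function name buildMailtoUrl, the subject text, and the sample HTML are all assumptions made for illustration. It builds a mailto: URL with encodeURIComponent instead of inserting %0A by hand.

```javascript
// buildMailtoUrl is a hypothetical helper, not part of the original script.
function buildMailtoUrl(html, subject) {
  var text = html
    .replace(/<\/?[^>]+(>|$)/g, "\n") // each tag becomes a line break
    .replace(/\n+/g, "\n")            // collapse runs of line breaks
    .replace(/^\n+|\n+$/g, "");       // trim line breaks at both ends

  // encodeURIComponent escapes spaces, line breaks, ?, & and similar characters.
  return "mailto:?subject=" + encodeURIComponent(subject) +
         "&body=" + encodeURIComponent(text);
}

var url = buildMailtoUrl("<h1>Title</h1><p>First line</p>", "Email This");
console.log(url); // mailto:?subject=Email%20This&body=Title%0AFirst%20line
```

In the page itself, the click handler could then do something like window.location.href = buildMailtoUrl(document.getElementById("centre").innerHTML, document.title); and return false so the href="#" is not followed. Keep in mind that browsers and mail clients limit how long a mailto: URL can be, so a very long page may get truncated.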

