I found a regular expression on a website that claims to remove all HTML tags, but I would like to understand what each of the symbols means. I searched the web but only ended up more confused. Does anyone know what each symbol means and how the expression finds all HTML tags?
This regular expression will match any substring that starts with < followed by zero or one (that is what the ? is for) occurrence of / (since the / is the delimiter for the regular expression, it has to be escaped with a backslash \). That has to be followed by one or more characters that are anything but >, which is what [^>]+ means. Then comes (>|$): the parentheses group two alternatives separated by the pipe symbol, so the match ends either with > or at the end of the string (the $), which also catches a tag that is cut off before its closing >. The g at the very end means "perform a global match", i.e. find all matches rather than stopping after the first match.
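To make the breakdown above concrete, here is a minimal sketch that applies the same pattern to two small strings. The helper name stripTags and the sample inputs are made up for illustration; they are not from the original site.

```javascript
// The pattern from the question, piece by piece:
//   <        a literal "<"
//   \/?      an optional "/" (escaped because "/" delimits the regex literal)
//   [^>]+    one or more characters that are not ">"
//   (>|$)    either a closing ">" or the end of the string
//   g        global flag: replace every match, not just the first
const TAG_PATTERN = /<\/?[^>]+(>|$)/g;

// Hypothetical helper name, used only for this illustration.
function stripTags(html) {
  return html.replace(TAG_PATTERN, "");
}

console.log(stripTags("<p>Hello <b>world</b></p>")); // Hello world
console.log(stripTags("Broken <a href='x"));         // "Broken " (unclosed tag removed via $)
```

Note that the pattern does not know what a real tag is: any < followed by other text is treated as the start of one, so a plain-text string such as "1 < 2" comes back as "1 ".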
Now I understand how it works, but it leads me to another question. I'm trying to create a button/link called "Email This!" that users can click to email the content of a webpage to themselves or anyone else.
What I am trying to do is use JavaScript to remove all HTML tags, copy the contents of the page into a variable (say "email_var"), and use it like this:
var HTMLcontent;
var email_var = HTMLcontent.replace(/<\/?[^>]+(>|$)/g, "");
window.location='mailto:?body='+email_var;
But nothing happens when I click the "Email This!" link. I suspect it may have to do with the symbols used on the page, such as ?, * and !. Could that be why?
Well, presuming that you have HTMLcontent defined in the real thing, we need to see the whole code to see what is happening: the relevant JavaScript, the form tag, and the link/button in particular.
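On the question of special characters: characters such as ?, &, # and line breaks do have a meaning inside a URL, so the body text is normally passed through encodeURIComponent before it is appended to a mailto: link. Whether that is what stops the link from responding here cannot be told from the snippet alone, so treat the following as a sketch only. The helper name buildMailtoUrl and the sample text are made up.

```javascript
// Hypothetical helper: builds a mailto: URL with a safely encoded body.
function buildMailtoUrl(bodyText) {
  // encodeURIComponent escapes characters such as ?, &, #, spaces and
  // newlines; it leaves !, *, ', ( and ) untouched.
  return "mailto:?body=" + encodeURIComponent(bodyText);
}

const url = buildMailtoUrl("Is this working? Yes & no!\nSecond line");
console.log(url);
// mailto:?body=Is%20this%20working%3F%20Yes%20%26%20no!%0ASecond%20line

// In a browser, the click handler could then do:
//   window.location.href = url;
```

Even with correct encoding, a whole page of text can produce a very long URL, and some browsers or mail clients may truncate or refuse it.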
Here is the actual code and it is located in an external javascript file called email-this.js: (I have a div element with id "centre" that contains everything on my page that I want to copy and paste onto the email client's body)
var strTagStrippedText;
var EmailBody;
function removeHTMLTags(){
    if(document.getElementById && document.getElementById("centre")){
        var strInputCode = document.getElementById("centre").innerHTML;
        strTagStrippedText = strInputCode.replace(/<\/?[^>]+(>|$)/g, "@");
        var EmailBodyTemp1 = strTagStrippedText.replace(/@@/g, "");
        var EmailBodyTemp2 = EmailBodyTemp1.replace(/-->/g, "");
        var EmailBodyTemp3 = EmailBodyTemp2.replace(/@/g, "%0A");
The problem is still there: nothing happens when I click the "Email This" button. Could the template be causing the problem? I used a .dwt template file for this page, and there are a number of <!-- InstanceBeginEditable --> and <!-- InstanceEndEditable --> comments throughout the page.
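One way to take the template comments out of the picture is to remove whole HTML comments before stripping the tags, instead of cleaning up leftover "-->" fragments afterwards. The sketch below is a variation on the approach above, not a diagnosis of the unresponsive button. The helper name htmlToPlainText and the sample markup (including the name="body" attribute) are made up for illustration.

```javascript
// Hypothetical helper: turn an HTML fragment into plain text.
function htmlToPlainText(html) {
  return html
    .replace(/<!--[\s\S]*?-->/g, "")   // drop whole comments first
    .replace(/<\/?[^>]+(>|$)/g, "\n")  // each tag becomes a line break
    .replace(/\s*\n\s*/g, "\n")        // collapse repeated breaks and stray spaces
    .trim();
}

const sample =
  '<!-- InstanceBeginEditable name="body" --><h1>Title</h1>' +
  "<p>First <b>bold</b> line</p><!-- InstanceEndEditable -->";

console.log(JSON.stringify(htmlToPlainText(sample)));
// "Title\nFirst\nbold\nline"
```

As in the original code, every tag turns into a line break, so inline tags such as <b> also split the text. The result still needs to be URL-encoded as a whole before it goes into the mailto: link, rather than having "%0A" inserted by hand.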