XPath not finding <br>


acrymble
06-12-2008, 03:03 PM
I am trying to split the following HTML content into an array that will look like this:

0 => Gladman (family)
1 => Gladman, George, 1765-1821
2 => Gladman, George, 1800-1863
3 => Gladman, Henry, 1834-1912


<TABLE CLASS="results">
<TR><TH>Creator:</TH><TD CLASS="data">Gladman (family)<br>Gladman, George, 1765-1821<br>Gladman, George, 1800-1863<br>Gladman, Henry, 1834-1912</TD></TR>

I have tried:

xPath1 = doc.evaluate('//table[@class="results"]/tbody/tr/td', doc, nsResolver, XPathResult.ANY_TYPE, null).iterateNext().textContent.split(/\n/);

This returns the correct content, but does not split it at all. The result is an array with 1 item.

I also tried the regular expression (/\r/), to no avail.

I think that the xPath is not recognizing the <br> tags. Is there a way to split this?

Thanks

Adam

Declan1991
06-12-2008, 03:21 PM
Can you not just split on the <br> instead?
xPath1 = doc.evaluate('//table[@class="results"]/tbody/tr/td', doc, nsResolver, XPathResult.ANY_TYPE, null).iterateNext().textContent.split(/\<br\>/);
And can I ask why you are using XPath with something that is neither XML nor XHTML (i.e. <br> instead of <br />)?

acrymble
06-12-2008, 03:29 PM
Declan1991, (/\<br\>/) didn't work. It returned the same thing as (/\n/).

I am using xPath because it is the only way I can figure out how to scrape the content off the page. It's not my webpage that the content is on, so I'm stuck with the formatting they have used.
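
A likely reason both splits return a single item: textContent concatenates only the text nodes inside the cell, so the <br> elements never appear in the string, and the source markup has no line breaks between the names. One way to stay with XPath is to select the text nodes themselves. The following is a minimal sketch, not a tested fix for this page; cellTextParts is a hypothetical helper name, and the looser //td path is an assumption chosen so the expression does not depend on whether the browser inserted a tbody.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, spelled out so the
// function does not depend on a global XPathResult being defined.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: returns the trimmed text of every text node that is a
// direct child of a <td> inside the "results" table, in document order.
function cellTextParts(doc) {
  const xpath = '//table[@class="results"]//td/text()';
  const snapshot = doc.evaluate(xpath, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const parts = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    const text = snapshot.snapshotItem(i).nodeValue.trim();
    if (text) parts.push(text); // skip whitespace-only nodes
  }
  return parts;
}
```

In a browser this would be called as cellTextParts(document). Each name sits in its own text node between the <br> elements, so no string splitting is needed.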

Declan1991
06-12-2008, 04:51 PM
It is an HTML page, right? Then why not:
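var ts = document.getElementsByTagName("table"), i = ts.length, ar = [];
// Walk the tables backwards and keep the cells of every table with class "results".
while (i--) {
if (ts[i].className && ts[i].className.match("results")) {
// Split the cell's markup rather than the element itself; innerHTML keeps the <br> tags.
ar[ar.length] = ts[i].getElementsByTagName("td")[0].innerHTML.split(/<br\s*\/?>/i);
}
}
alert(ar[0][0]);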
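
The splitting step on its own can be sketched as a small function that works on a markup string, such as a cell's innerHTML. This is an illustration under assumptions: splitOnBr is a hypothetical name, the regular expression accepts <br>, <br/> and <br /> in any letter case, and character entities such as &amp; are left undecoded.

```javascript
// Hypothetical helper: split a fragment of cell markup on <br> tags,
// drop any other tags, and discard empty pieces.
function splitOnBr(html) {
  return html
    .split(/<br\s*\/?>/i)
    .map(function (part) { return part.replace(/<[^>]*>/g, "").trim(); })
    .filter(function (part) { return part.length > 0; });
}

// The cell content from the first post, as it would appear in innerHTML.
const cellHtml = "Gladman (family)<br>Gladman, George, 1765-1821<br>" +
  "Gladman, George, 1800-1863<br>Gladman, Henry, 1834-1912";
const creators = splitOnBr(cellHtml);
console.log(creators.length); // 4
```

Because it takes a string, the same function should work whether the markup comes from getElementsByTagName or from a node found with XPath.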