Thread: XPath Expression To Deal With BR Element

  1. #1
    Join Date
    Jul 2010

    XPath Expression To Deal With BR Element

    I have a bunch of HTML documents, each containing a P element whose 'id' attribute is set to 'title'. Like so:

    HTML Code:
    <p id="title">
    Title of the document
    </p>

    In some cases, I have a title that has a forced line break:

    HTML Code:
    <p id="title">
    This Is A Title Of A Document<br>With A BR Element In It
    </p>

    I have created an UpdateAndSynchronize.php script that scans the tree where all my web documents are, loads each document (using DOMDocument::loadHTML()), sets up the XPath object, and extracts the info I want to put in the MySQL database.

    My XPath expression to get the document title is:

    PHP Code:
    $docTitle = $htmlXPath->query('.//p[@id="title"]')->item(0)->textContent;
    $docTitle = trim(str_replace(array("\n", "\r\n", "\r", "\t"), " ", $docTitle));

    $htmlXPath is a DOMXPath object.

    I had to add the second line to get rid of leading and trailing whitespace.
    My problem is that the str_replace() call does not work for the forced line break, probably because the <br> element is being converted (translated?) to some other character in the XPath query result.

    The question is:

    How should I set up my XPath->query() call so that <br> elements are converted into a single space character?

    Also, is there a good reference (a book or web pages) that shows how to set up XPath queries (evaluations?) with lots of examples?

  2. #2
    Join Date
    Jan 2004
    I found this thread; it may have your answer.
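
    In case it helps, here is a rough sketch of one way to do it, not necessarily the only or best one. A <br> is an empty element, so it contributes no characters at all to textContent, which would explain why str_replace() has nothing to match. The sketch swaps each <br> inside the title paragraph for a text node holding one space before reading textContent. The function name extractTitle() and the sample markup are made up for illustration; the XPath expression is the one from the first post.

    PHP Code:
    ```php
    <?php
    // Hypothetical helper: returns the text of <p id="title"> with every <br>
    // turned into a single space. Returns '' when no such paragraph exists.
    function extractTitle(string $html): string
    {
        $doc = new DOMDocument();
        $doc->loadHTML($html);
        $htmlXPath = new DOMXPath($doc);

        $titleNode = $htmlXPath->query('.//p[@id="title"]')->item(0);
        if ($titleNode === null) {
            return '';
        }

        // Copy the node list into an array before changing the tree,
        // then replace each <br> under the title with a one-space text node.
        foreach (iterator_to_array($htmlXPath->query('.//br', $titleNode)) as $br) {
            $br->parentNode->replaceChild($doc->createTextNode(' '), $br);
        }

        // Collapse every run of whitespace (newlines, tabs, spaces) and trim.
        return trim(preg_replace('/\s+/', ' ', $titleNode->textContent));
    }

    // Sample markup assumed for this example only.
    $sample = '<html><body><p id="title">
    This Is A Title Of A Document<br>With A BR Element In It
    </p></body></html>';

    echo extractTitle($sample), "\n";
    ```

    The preg_replace() call stands in for the str_replace() line, since it also collapses runs of whitespace down to one space. Note that the <br> replacement changes the loaded DOMDocument in memory only, not the file on disk.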

    Natdrip :P
    "water go down the hole" - plucky duck
