A WebDeveloper.com Feature
Understanding the DOM
Part 2
by Nate Zelnick
It Figures
The figure below shows the hierarchy of node objects and the types of node objects each can create. A definition of each node object is provided below the figure.
Document
Represents the XML or HTML document as a whole. It is considered the "root" of the tree, and every other node in the document is one of its descendants.
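To make the idea of the Document as the root concrete, here is a minimal sketch using Python's standard-library xml.dom.minidom module, which is just one DOM implementation among many. The PERSON markup is a hypothetical sample, not something from the spec.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical sample markup, used only for illustration.
doc = parseString("<PERSON><NAME>Joe</NAME></PERSON>")

# The object returned by the parser is the Document node itself.
is_document = doc.nodeType == Node.DOCUMENT_NODE

# The top-level element hangs off the Document; the Document has no parent.
root_name = doc.documentElement.tagName
print(is_document, root_name, doc.parentNode)
```

Everything else in the tree is reached by starting at this node and walking downward.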
DocumentFragment
A node that represents a portion of a larger tree removed from its context. Used when copying part of a larger structure into another document. DocumentFragments contain all of the descendants of their top-level element, which makes them a convenient way to copy a block of related structures at once.
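The copy-a-block-at-once behavior can be sketched roughly as follows, again assuming Python's xml.dom.minidom. The LIST, ITEM, and INBOX names are made up for the example, and importNode is a DOM Level 2 method that minidom happens to provide for moving nodes between documents.

```python
from xml.dom.minidom import parseString

# Two hypothetical documents: one to copy from, one to copy into.
source = parseString("<LIST><ITEM>one</ITEM><ITEM>two</ITEM></LIST>")
target = parseString("<INBOX/>")

# Collect deep copies of each ITEM (children included) in a fragment.
fragment = target.createDocumentFragment()
for item in source.getElementsByTagName("ITEM"):
    fragment.appendChild(target.importNode(item, True))

# Appending the fragment inserts its children, not the fragment itself.
target.documentElement.appendChild(fragment)
result = target.documentElement.toxml()
print(result)
```

After the final appendChild call the fragment is empty, because its children now live in the target document.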
DocumentType
Represents the location of a Document Type Definition (DTD) file that describes the entities and structure of the document abstractly. A document is not required to have one.
EntityReference
Used when the source for an entity (a symbolic representation of something) is in the document itself and not in an external DTD. Typical examples of entities are the &..; notations used for characters outside the basic ASCII range.
Element
This is where most authors will do most of their work. Every tag in an HTML or XML tree is an element. Elements can have Attributes (see below) that are always strings in HTML, but may be other data types in XML.
Attribute
Attributes are data about Elements, but are not considered children of Elements. What this means practically is that attributes are not really part of the hierarchical tree of a document. Getting attribute values requires a separate method call from a simple query for child or parent element values. Think of attributes as properties of an Element that live outside of the hierarchy itself but provide some useful context for the Element they describe.
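The separation between attributes and child nodes is easy to see in practice. The sketch below assumes Python's xml.dom.minidom and a hypothetical PERSON element with an id attribute.

```python
from xml.dom.minidom import parseString

# Hypothetical markup: one attribute (id) and one child element (NAME).
doc = parseString('<PERSON id="7"><NAME>Joe</NAME></PERSON>')
person = doc.documentElement

# Walking the children finds NAME only; the attribute is not in the list.
child_names = [child.nodeName for child in person.childNodes]

# Attribute values need their own method call.
id_value = person.getAttribute("id")

# The attribute node knows its element but has no parent in the tree.
id_node = person.getAttributeNode("id")
print(child_names, id_value, id_node.parentNode)
```

Here the attribute is reachable only through getAttribute or getAttributeNode, never through childNodes.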
ProcessingInstruction
Abbreviated as PI in most discussions. A PI is an escape that provides specific clues for a particular "processor" of the document. The idea of a processor is more generic than a chip. It refers to a particular application that may expose extra interfaces above and beyond the standard DOM. If the document is being used in a different context than that referred to by the PI, the PI is ignored. A PI is a leaf node.
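As a rough illustration, a PI exposes just two pieces of information: the target (the "processor" it is addressed to) and the data meant for that processor. The sketch assumes Python's xml.dom.minidom; the stylesheet PI and the style.css file name are hypothetical.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical document with a PI ahead of the root element.
doc = parseString(
    '<?xml-stylesheet type="text/css" href="style.css"?><PERSON/>'
)

pi = doc.firstChild
is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE

# target names the intended processor; data is the raw instruction text.
print(pi.target)
print(pi.data)

# A PI is a leaf node, so it never has children.
has_children = pi.hasChildNodes()
```

An application that does not recognize the target can simply skip the node.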
Comment
Exactly what it sounds like: a leaf node that contains the text of a comment.
Text
A node object that contains a string. The string in the Element Node NAME (below) is a Text node. This is a leaf node.
<NAME>Joe</NAME>
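The NAME example can be checked directly. This is a minimal sketch assuming Python's xml.dom.minidom; the point is that "Joe" is not a property of the NAME element but a separate Text node beneath it.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString("<NAME>Joe</NAME>")
name = doc.documentElement

# The string lives in a child node of the element, not in the element.
text = name.firstChild
is_text = text.nodeType == Node.TEXT_NODE
print(name.tagName, is_text, text.data)

# Text is a leaf node: it has no children of its own.
has_children = text.hasChildNodes()
```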
CDATASection
Used to escape some text that would otherwise be parsed as a set of other elements. For instance, if a Text node needed to contain content that represented HTML-structured text, it would have to be saved as a CDATASection node object type. This is a leaf node.
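The difference shows up when the tree is written back out. The sketch below assumes Python's xml.dom.minidom and uses hypothetical NOTE, PLAIN, and RAW element names: the same string is stored once as a Text node and once as a CDATASection.

```python
from xml.dom import Node
from xml.dom.minidom import getDOMImplementation

# An empty hypothetical document to create nodes from.
doc = getDOMImplementation().createDocument(None, "NOTE", None)
markup = "<b>bold</b> & more"

# Stored as ordinary text, the markup characters are escaped on output.
as_text = doc.createElement("PLAIN")
as_text.appendChild(doc.createTextNode(markup))

# Stored as a CDATA section, the same string is written out verbatim.
as_cdata = doc.createElement("RAW")
cdata = doc.createCDATASection(markup)
as_cdata.appendChild(cdata)

text_xml = as_text.toxml()
cdata_xml = as_cdata.toxml()
print(text_xml)
print(cdata_xml)
```

In both cases the node holds the identical string; only the serialized form differs.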
Entity
A symbolic representation of something. This is not the entity declaration, which is outside of the DOM’s purview because it requires some platform-specific knowledge. This is a leaf node.
Notation
Refers to some specified notation in a Document Type Definition. This is a leaf node.
I strongly recommend that you read through the spec itself to get a better view of node types and the DOM in general. I could probably devote a column to each idea presented here, since there are a lot of subtleties and abstractions that deserve discussion. But barring that, this high-level view of the terms and node types is the anchor for querying a document for its structure and walking its hierarchy with the DOM interface methods, the topic we’ll deal with next time.
Last modified: Friday, 25-Feb-2011 12:28:06 EST