Saturday, October 9, 2010

XPath tutorial for busy programmer


As the standard says, XPath is a language for addressing parts of an XML document. OK, that is simple enough. XPath is basically a means of traversing an XML document and searching it. The search can rely on the structure of the document (the semantics of the data), on the data itself, or on both. XPath is used in XML transformations (XSLT) and in SOA (the BPEL language). As far as I can see, jQuery uses similar logic for its selectors.

XML document

XPath can query any part of an XML document: any node at any level, since XML documents are treated as trees of nodes. The result of an XPath expression is a node-set (possibly empty), a string, a number, or a boolean, and the nodes in a node-set can themselves be queried further. XPath is used to navigate through the elements and attributes of an XML document.

We will use the following XML document:

  Sony PSP
  Nintendo Wii
  Sony PS2
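
For readers who want to try the expressions in code, here is a minimal sketch in Python of a document with the shape the later expressions assume (game-systems, system, name, type, emulator, usable). Only the three names come from this tutorial; the type values and the usable flags are invented placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names follow the paths used in this tutorial,
# but the <type> values and usable flags are made up for illustration.
SAMPLE_XML = """<game-systems>
  <system>
    <name>Sony PSP</name>
    <type>handheld</type>
    <emulator usable="true"/>
  </system>
  <system>
    <name>Nintendo Wii</name>
    <type>console</type>
    <emulator usable="false"/>
  </system>
  <system>
    <name>Sony PS2</name>
    <type>console</type>
    <emulator usable="true"/>
  </system>
</game-systems>"""

root = ET.fromstring(SAMPLE_XML)  # root is the <game-systems> element

# The document is a tree: the root has three <system> children,
# and each of them has <name>, <type> and <emulator> children.
child_tags = [[child.tag for child in system] for system in root]
names = [system.find("name").text for system in root]
print(root.tag, names)
```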

Basic node selection

To navigate through an XML document we use path expressions. The two most common ones are the slashes: single ("/") and double ("//").

A leading single slash starts the search at the root node, and each further slash steps one level down. In our example, the expression "/game-systems/system/type" returns the type element of every system (evaluated here in XMLSpy).

A double slash traverses the whole XML tree and finds all nodes that match the selection, no matter where they are in the document. So the selection "//type" produces the same result as the previous expression in our example.
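
The difference between the two slashes can be tried out with Python's standard xml.etree.ElementTree, which supports only a subset of XPath. This is a sketch under assumptions: the sample document is hypothetical (the type values are invented), and because ElementTree evaluates paths relative to an element, the absolute path "/game-systems/system/type" is written as "./system/type" starting from the root.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the <type> values are placeholders.
SAMPLE_XML = """<game-systems>
  <system><name>Sony PSP</name><type>handheld</type></system>
  <system><name>Nintendo Wii</name><type>console</type></system>
  <system><name>Sony PS2</name><type>console</type></system>
</game-systems>"""

root = ET.fromstring(SAMPLE_XML)  # the <game-systems> element

# Step by step from the root: system children, then their type children.
by_path = [e.text for e in root.findall("./system/type")]

# Double slash: every <type> element at any depth below the root.
anywhere = [e.text for e in root.findall(".//type")]

print(by_path == anywhere)
```

In this document both searches return the same elements, because type appears only directly under system.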

Other common path expressions for selecting XML nodes are "@", "." and "..".

"@" selects an attribute, as in "/game-systems/system/emulator/@usable", where we select the usable attribute of each emulator node.

"." selects the current node, and ".." selects its parent node. This works much like paths in a file system!

To select the parent of each emulator node (hint: system), we write "/game-systems/system/emulator/..".
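
Here is a small sketch of these three selectors in Python's xml.etree.ElementTree. The sample document and its usable flags are assumptions. ElementTree cannot return an attribute from a path such as ".../@usable", so the sketch selects the emulator elements and reads the attribute with get() instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the usable flags are placeholders.
SAMPLE_XML = """<game-systems>
  <system><name>Sony PSP</name><emulator usable="true"/></system>
  <system><name>Nintendo Wii</name><emulator usable="false"/></system>
  <system><name>Sony PS2</name><emulator usable="true"/></system>
</game-systems>"""

root = ET.fromstring(SAMPLE_XML)

# Stand-in for "/game-systems/system/emulator/@usable":
# select the elements, then read the attribute from each one.
usable = [e.get("usable") for e in root.findall("./system/emulator")]

# ".." steps up from each <emulator> to its parent <system>.
parent_tags = [p.tag for p in root.findall("./system/emulator/..")]

# "." is the current node, so searching for it returns the start element.
same_node = root.find(".") is root

print(usable, parent_tags, same_node)
```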

Finding specific node

To find specific nodes we use predicates. A predicate lets us search for nodes with a specific element or attribute value. It can also pick a specific item out of a node-set when the previous search returned more than one node. Predicates are always written in square brackets.

The first system/emulator value can be found with the following search:

To find the names of all systems with a usable emulator we write:
"/game-systems/system/emulator[@usable='true']/../name". Here ".." moves up from emulator to its parent element (system) in the XML tree.
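
Both kinds of predicate can be sketched with Python's xml.etree.ElementTree. The sample document and its usable flags are assumptions, and paths are written relative to the root element. The positional expression "./system[1]/emulator" is only one plausible way to ask for the first emulator, not necessarily the expression meant above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the usable flags are placeholders.
SAMPLE_XML = """<game-systems>
  <system><name>Sony PSP</name><emulator usable="true"/></system>
  <system><name>Nintendo Wii</name><emulator usable="false"/></system>
  <system><name>Sony PS2</name><emulator usable="true"/></system>
</game-systems>"""

root = ET.fromstring(SAMPLE_XML)

# Attribute predicate: keep emulators with usable='true',
# step up to the parent <system>, then down to its <name>.
usable_names = [
    e.text for e in root.findall("./system/emulator[@usable='true']/../name")
]

# Positional predicate (assumed form): the emulator of the first <system>.
first_emulator = root.find("./system[1]/emulator")

print(usable_names, first_emulator.get("usable"))
```

With the placeholder flags used here, the names found are Sony PSP and Sony PS2.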

Finding node-set relative to the current node

We can also use the XML tree structure (children, parents, ancestors and so on) to find specific nodes.

For example, we can write the previous example as follows: "/game-systems/system/emulator[@usable='true']/ancestor::system/name".

This is somewhat longer, but it does the same thing. Here we use the ancestor axis, which returns the ancestors of the current element (filtered to system in this case). There are also child, attribute, descendant and similar axes, although in most cases the same searches can be written with the basic node selectors and predicates.
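
Python's xml.etree.ElementTree does not implement the ancestor axis, so the sketch below emulates it with a parent map instead of evaluating "ancestor::system" directly. The sample document, the usable flags, and the helper name ancestors are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the usable flags are placeholders.
SAMPLE_XML = """<game-systems>
  <system><name>Sony PSP</name><emulator usable="true"/></system>
  <system><name>Nintendo Wii</name><emulator usable="false"/></system>
  <system><name>Sony PS2</name><emulator usable="true"/></system>
</game-systems>"""

root = ET.fromstring(SAMPLE_XML)

# Elements do not know their parent, so build a child -> parent lookup.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """Yield the parent, grandparent, ... of node, nearest first."""
    while node in parent_of:
        node = parent_of[node]
        yield node

# Emulation of emulator[@usable='true']/ancestor::system/name.
names = []
for emulator in root.findall("./system/emulator[@usable='true']"):
    for ancestor in ancestors(emulator):
        if ancestor.tag == "system":
            names.append(ancestor.find("name").text)

print(names)
```

A full XPath 1.0 engine evaluates the ancestor axis directly. The parent map is only a workaround for ElementTree's limited subset.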
