04-xml_XPath
Introduction
XPath stands for XML Path Language
XPath uses "path expression" syntax to
identify and navigate nodes in an XML
document
XPath provides a library of built-in functions (27 core functions in XPath 1.0; many more in XPath 2.0 and later)
XPath is a major element in the XSLT
standard
XPath is a W3C recommendation
XPath is used in other XML technologies:
◦ XML Schema (expression of uniqueness and key
constraints),
◦ XSLT transforms,
◦ XQuery,
◦ XLink,
◦ XPointer, etc.
XPath Terminology
Nodes
◦ In XPath, there are seven kinds of nodes:
element, attribute, text,
namespace, processing-instruction,
comment, and root node.
◦ XML documents are treated as trees of nodes.
◦ The topmost element of the tree is called the root
element.
Atomic values
◦ Atomic values are values with no children or parent, such as the strings "J K. Rowling" and "en".
Items
◦ Items are atomic values or nodes.
Examples of Items
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book>
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
Path Expression
A path expression is:
A traversal of the document tree:
◦ from a starting node
◦ to a set of target nodes
◦ the targets constitute the value of the path
A node sequence:
◦ T1.T2. ... .Tn
It returns one or more nodes Tn such that there are arcs:
◦ T1 → T2, ..., Tn-1 → Tn
[Figure: example tree rooted at db, with Book nodes that have Author and Title children (leaf values A1, T1, A2); the path db.Book.Author returns the Author values A1 and A2.]
Relationship of Nodes
Each element and attribute has one parent.
In the example below, the book element is the parent of the title,
author, year, and price elements.
<bookstore>
<book>
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
Element nodes may have zero, one, or more
children.
The title, author, year, and price elements are
all children of the book element.
Nodes that have the same parent are
siblings.
The title, author, year, and price elements are
all siblings.
A node's parent, its parent's parent, and so on are its
ancestors.
The ancestors of the title element are the
book element and the bookstore element.
A node's children, its children's children, and so on are its
descendants.
The descendants of the bookstore element are
the book, title, author, year, and price
elements.
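The node relationships described above (parent, children, siblings, ancestors, descendants) can be checked with a short script. This is only a minimal sketch: it assumes Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath, and all variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book>
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)            # the bookstore element
title = root.find("./book/title")

# Parent: the ".." step moves from title up to its parent element.
parent = root.find("./book/title/..")

# Children: iterating over an element yields its child elements.
children = [child.tag for child in parent]

# Siblings: the other children of the same parent.
siblings = [child.tag for child in parent if child is not title]

# Descendants: iter() walks the whole subtree, starting with the element itself.
descendants = [el.tag for el in root.iter() if el is not root]

# Ancestors: ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: p for p in root.iter() for child in p}
ancestors = []
node = title
while node in parent_of:
    node = parent_of[node]
    ancestors.append(node.tag)

print(parent.tag)     # book
print(children)       # ['title', 'author', 'year', 'price']
print(siblings)       # ['author', 'year', 'price']
print(ancestors)      # ['book', 'bookstore']
print(descendants)    # ['book', 'title', 'author', 'year', 'price']
```

The parent map is a workaround specific to ElementTree; a full XPath engine exposes these relationships directly through axes.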
Selecting Nodes
XPath uses path expressions to select nodes in an
XML document.
Nodes are selected by following a path, that is, a sequence of steps.
The most useful path expressions are listed below:
Expression   Description
nodename     Selects all nodes with the name "nodename"
/            Selects from the root node
//           Selects nodes in the document from the current node
             that match the selection, no matter where they are
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes
Some path expressions and the result
Path Expression    Result
//book             Selects all book elements, no matter where they are in the
                   document
bookstore//book    Selects all book elements that are descendants of the
                   bookstore element, no matter where they are under the
                   bookstore element
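The path expressions above can be tried out in code. The following is a sketch under stated assumptions: it uses Python's standard `xml.etree.ElementTree`, which evaluates paths relative to an element (so `//book` is written `.//book`) and cannot return attribute nodes. The shortened sample document and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <price>49.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the bookstore element

# //book: all book elements anywhere below the starting element.
all_books = root.findall(".//book")

# nodename: a bare name selects child elements with that name.
child_books = root.findall("book")

# ..: the parent of the current node (here, the parent of the first title).
parent_of_title = root.find(".//title/..")

# @lang: ElementTree can test for an attribute in a predicate and
# exposes its value through Element.get().
langs = [t.get("lang") for t in root.findall(".//title[@lang]")]

print(len(all_books), len(child_books), parent_of_title.tag, langs)
```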
Predicates
Predicates are used to find a specific node
or a node that contains a specific value.
Predicates are always embedded in square
brackets.
Some path expressions with predicates
and the result
Path Expression            Result
/bookstore/book[1]         Selects the first book element that is a child
                           of the bookstore element.
                           Note: In IE 5, 6, 7, 8, and 9 the first node is [0], but
                           according to the W3C it is [1]. To solve this
                           problem in IE, set the SelectionLanguage to
                           XPath:
                           In JavaScript:
                           xml.setProperty("SelectionLanguage","XPath");
/bookstore/book[last()]    Selects the last book element that is a child
                           of the bookstore element
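Positional and value predicates can be illustrated as follows. This is a minimal sketch assuming Python's `xml.etree.ElementTree`, which supports `[1]`, `[last()]`, and simple attribute tests; the attribute predicate on `category` is an added illustration and does not appear in the table above.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <price>49.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # the bookstore element

# /bookstore/book[1]: positions start at 1, as in the W3C specification.
first = root.find("book[1]")

# /bookstore/book[last()]: the last book child of bookstore.
last = root.find("book[last()]")

# A value predicate: books whose category attribute equals "web".
web = root.findall("book[@category='web']")

print(first.findtext("title"))                  # Everyday Italian
print(last.findtext("title"))                   # XQuery Kick Start
print([b.findtext("title") for b in web])       # ['XQuery Kick Start']
```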
Selecting Unknown Nodes
XPath wildcards can be used to select unknown XML nodes.
Wildcard    Description
*           Matches any element node
@*          Matches any attribute node
node()      Matches any node of any kind
Location Path Expression
A location path consists of one or more
steps, each separated by a slash
A location path can be absolute or relative.
◦ An absolute location path starts with a slash "/"
◦ A relative location path does not.
An absolute location path:
/step/step/...
A relative location path:
step/step/...
Step
Each step is evaluated against the nodes in
the current node-set.
A step consists of:
◦ an axis (defines the tree-relationship between the
selected nodes and the current node)
◦ a node-test (identifies a node within an axis)
◦ zero or more predicates (to further refine the
selected node-set)
The syntax for a location step is:
axisname::nodetest[predicate1]… [predicateN]
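To make the three parts of a step concrete, the sketch below splits a step written in the unabbreviated syntax into its axis, node test, and predicates. `parse_step` is a hypothetical helper written for this illustration, not part of any XPath library; it assumes predicates contain no nested brackets and does not handle abbreviations such as `@`, `.`, or `..`.

```python
import re

# axisname::nodetest[predicate1]...[predicateN], with the axis optional.
STEP_RE = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"      # optional axis name followed by ::
    r"(?P<test>[^\[\]]+)"             # node test (a name, *, text(), node(), ...)
    r"(?P<preds>(?:\[[^\]]*\])*)$"    # zero or more [predicate] blocks
)


def parse_step(step):
    """Return (axis, node_test, predicates) for one location step."""
    match = STEP_RE.match(step)
    if match is None:
        raise ValueError(f"not a recognizable location step: {step!r}")
    axis = match.group("axis") or "child"   # child is the default axis
    predicates = re.findall(r"\[([^\]]*)\]", match.group("preds"))
    return axis, match.group("test"), predicates


print(parse_step("child::book[price>35][1]"))  # ('child', 'book', ['price>35', '1'])
print(parse_step("book"))                      # ('child', 'book', [])
print(parse_step("ancestor-or-self::*"))       # ('ancestor-or-self', '*', [])
```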
XPath Axes
An axis represents a relationship to the context (current) node.
It is used to locate nodes relative to the context node in the tree.
The axis is optional in a step; the default axis is child.
AxisName            Result
ancestor            Selects all ancestors (parent, grandparent, etc.) of the
                    current node
ancestor-or-self    Selects all ancestors (parent, grandparent, etc.) of the
                    current node and the current node itself
attribute           Selects all attributes of the current node
child               Selects all children of the current node
descendant          Selects all descendants (children, grandchildren, etc.) of the
                    current node
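The axes listed above can be emulated over an element tree to see what each one selects. This is a sketch, assuming Python's `xml.etree.ElementTree`, which does not implement the `axisname::` syntax itself; the functions below are illustrative helpers named after the axes, not a library API.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <price>49.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)

# ElementTree keeps no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestor(node):
    """ancestor:: parent, grandparent, ... (ElementTree has no document
    node, so the chain stops at the root element)."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result


def ancestor_or_self(node):
    """ancestor-or-self:: the node itself plus all of its ancestors."""
    return [node] + ancestor(node)


def attribute(node):
    """attribute:: all attributes of the node, as a name -> value mapping."""
    return dict(node.attrib)


def child(node):
    """child:: the element children of the node."""
    return list(node)


def descendant(node):
    """descendant:: children, grandchildren, ... excluding the node itself."""
    return [el for el in node.iter() if el is not node]


title = root.find("book/title")
print([e.tag for e in ancestor(title)])          # ['book', 'bookstore']
print([e.tag for e in ancestor_or_self(title)])  # ['title', 'book', 'bookstore']
print(attribute(title))                          # {'lang': 'en'}
print([e.tag for e in child(root)])              # ['book']
print([e.tag for e in descendant(root)])         # ['book', 'title', 'price']
```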
Filters
Two ways to filter the nodes of an axis:
◦ By their name
  For nodes that have a name (element, attribute,
  processing-instruction)
  * : any name
◦ By their type
  text() : text nodes
  comment() : comment nodes
  processing-instruction() : processing-instruction nodes
  node() : all node types
Predicates
Predicates are optional.
They describe additional filtering conditions (combined by logical
operators) that the nodes must satisfy.
These conditions select nodes
from among those retained by the filter on the
axis.
XPath Standard Functions
XPath 1.0 defines 27 core functions; XPath 2.0 and later share a much larger function library with XQuery and XSLT.
There are many functions; here are some of the most important:
For nodes
◦ count(expr): number of nodes in the set produced by the
expression (expr)
◦ name(): name of the context node
◦ local-name(), namespace-uri(): the local part and the namespace URI of a namespaced name
For strings
◦ concat(s1, s2, …): concatenation
◦ contains(s1, s2): checks whether s1 contains s2
◦ substring(s, pos, len): extracts from s the substring of length len
starting at position pos (positions start at 1)
◦ string-length(s): length of the string
For Booleans
◦ true(), false(): the true and false values
◦ not(expr): negation of a logical expression
For numbers
◦ floor(n), ceiling(n), round(n): rounding functions
applied to a numeric value, such as the value of a node
◦ sum(expr): sum of the numeric values of the nodes in the
set produced by the expression (expr)
◦ avg(expr): average of those values (XPath 2.0 and later; not available in XPath 1.0)
Some functions take no parameters
but depend on the current (context) node:
position(): the position of the current node in
the list of nodes under consideration;
last(): the size of that list, that is, the position of its last
node.
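The behaviour of several of these functions can be approximated in plain code. The sketch below is an emulation, not an XPath engine: it assumes Python's `xml.etree.ElementTree` (which does not evaluate XPath functions), and `substring` and `xpath_round` are hypothetical helpers that cover only simple integer and finite-number arguments.

```python
import math
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
</bookstore>"""

root = ET.fromstring(XML)
books = root.findall("book")

# count(/bookstore/book): number of nodes in the selected set.
book_count = len(books)

# sum(/bookstore/book/price) and avg(/bookstore/book/price).
prices = [float(p.text) for p in root.findall("book/price")]
total = sum(prices)
average = total / len(prices)

# name(): name of the context node (here, the root element).
context_name = root.tag

# String functions applied to the second title.
title = books[1].findtext("title")
joined = "".join(["XPath", " ", "1.0"])   # concat("XPath", " ", "1.0")
has_query = "Query" in title               # contains(title, "Query")
length = len(title)                        # string-length(title)


def substring(s, pos, length):
    """XPath-style substring for integer arguments with pos >= 1."""
    return s[pos - 1 : pos - 1 + length]


def xpath_round(n):
    """XPath round(): ties go toward positive infinity, unlike Python's round()."""
    return math.floor(n + 0.5)


# position() and last(): 1-based position in the list, and the list size.
positions = [(position, len(books)) for position, _ in enumerate(books, start=1)]

print(book_count, context_name, joined, has_query, length)
print(substring(title, 1, 6), math.floor(49.99), math.ceil(49.99), xpath_round(2.5))
print(positions)
```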
XPath Operators
An XPath expression returns either a node-set, a string, a
Boolean, or a number.
Operator    Description                     Example
|           Union of two node-sets          //book | //cd
+           Addition                        6 + 4
-           Subtraction                     6 - 4
*           Multiplication                  6 * 4
div         Division                        8 div 4
=           Equal                           price=9.80
!=          Not equal                       price!=9.80
<           Less than                       price<9.80
<=          Less than or equal to           price<=9.80
>           Greater than                    price>9.80
>=          Greater than or equal to        price>=9.80
or          Logical or                      price=9.80 or price=9.70
and         Logical and                     price>9.00 and price<9.90
mod         Modulus (division remainder)    5 mod 2
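The operators above can be mirrored in ordinary code to see what the expressions compute. This is a sketch under stated assumptions: Python's `xml.etree.ElementTree` does not evaluate comparison or arithmetic operators inside paths, so the filtering is done in Python, and the `cd` element in the sample document is hypothetical, added only to illustrate the union operator.

```python
import math
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
  <cd><title>Hypothetical CD</title><price>9.80</price></cd>
</bookstore>"""

root = ET.fromstring(XML)


def price(item):
    """Numeric value of the item's price child."""
    return float(item.findtext("price"))


# book[price>35.00]
expensive = [b.findtext("title") for b in root.findall("book") if price(b) > 35.00]

# book[price>=30.00 and price<40.00]
in_range = [b.findtext("title") for b in root.findall("book")
            if price(b) >= 30.00 and price(b) < 40.00]

# //book | //cd: the two sets are disjoint here, so concatenation is enough;
# a real union also removes duplicates and returns nodes in document order.
union = root.findall(".//book") + root.findall(".//cd")

# 8 div 4 is a floating-point division; 5 mod 2 is the remainder of a
# truncating division, which math.fmod matches (Python's % differs for negatives).
quotient = 8 / 4
remainder = math.fmod(5, 2)

print(expensive, in_range, [e.tag for e in union], quotient, remainder)
```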
Source
https://www.w3schools.com/
XPath Examples
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
</bookstore>