04-xml_XPath

XPath, or XML Path Language, is used to navigate and identify nodes in XML documents using path expressions and contains over 200 built-in functions. It is integral to various XML technologies such as XSLT and XQuery, and allows for the selection of nodes through various expressions and predicates. XPath also supports axes to define relationships between nodes and provides a wide range of functions for manipulating strings, numbers, and boolean values.

Uploaded by

khiatfaten2

XML and the language XPath

Introduction
• XPath stands for XML Path Language.
• XPath uses "path expression" syntax to identify and navigate nodes in an XML document.
• Later XPath versions share a large function library (over 200 built-in functions by the source's count); XPath 1.0 defines fewer than 30.
• XPath is a major element of the XSLT standard.
• XPath is a W3C Recommendation.
Introduction
• XPath is used in other XML technologies:
◦ XML Schema (expression of uniqueness and key constraints),
◦ XSLT transformations,
◦ XQuery,
◦ XLink,
◦ XPointer, etc.
• Unlike XML Schema, XPath is not an XML language (it uses its own, non-XML syntax).
XPath Terminology
• Nodes
◦ In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and root (document) node.
◦ XML documents are treated as trees of nodes.
◦ The topmost element of the tree is called the root element.
• Atomic values
◦ Atomic values are simple values (such as a string or a number) with no children or parent; they are not nodes.
• Items
◦ Items are atomic values or nodes.
Examples of Items
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book>
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>

<bookstore> (root element node)

<author>J K. Rowling</author> (element node)

lang="en" (attribute node)

"J K. Rowling" and "en" are atomic values
XPath "Path Expressions"
• XPath uses path expressions to select nodes or node-sets in an XML document.
• These path expressions look very much like the paths used in traditional computer file systems.
Path Expression
A path expression is:
• A traversal of the document tree:
◦ from a starting node
◦ to a set of target nodes
◦ the targets constitute the value of the path
• A node sequence:
◦ T1.T2. ... .Tn
• It returns one or more nodes Tn such that there are arcs:
◦ T1 → T2, ..., Tn-1 → Tn

Example: db.Book.Author
[Figure: a tree rooted at db with Book children; the Book nodes have Author and Title children (A1, T1, A2). The path db.Book.Author selects the Author nodes A1 and A2.]
Relationship of Nodes
• Each element and attribute has one parent.
• The book element is the parent of the title, author, year, and price elements.

<bookstore>

<book>
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

</bookstore>
Relationship of Nodes
• Element nodes may have zero, one, or more children.
• The title, author, year, and price elements are all children of the book element.

<bookstore>

<book>
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

</bookstore>
Relationship of Nodes
• Nodes that have the same parent are siblings.
• The title, author, year, and price elements are all siblings.

<bookstore>

<book>
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

</bookstore>
Relationship of Nodes
• A node's parent, its parent's parent, etc. are its ancestors.
• The ancestors of the title element are the book element and the bookstore element.

<bookstore>

<book>
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

</bookstore>
Relationship of Nodes
• A node's children, its children's children, etc. are its descendants.
• The descendants of the bookstore element are the book, title, author, year, and price elements.

<bookstore>

<book>
<title>Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

</bookstore>
Selecting Nodes
• XPath uses path expressions to select nodes in an XML document.
• A node is selected by following a path, that is, a sequence of steps.
• The most useful path expressions are listed below:

Expression   Description
nodename     Selects all child elements of the current node named "nodename"
/            Selects from the root node
//           Selects nodes in the document from the current node that match the selection, no matter where they are
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes
Some path expressions and their results

Path Expression   Result
bookstore         Selects all child elements of the current node named "bookstore"
/bookstore        Selects the root element bookstore
                  Note: If the path starts with a slash ( / ), it always represents an absolute path to an element!
bookstore/book    Selects all book elements that are children of bookstore
//book            Selects all book elements, no matter where they are in the document
bookstore//book   Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
//@lang           Selects all attributes that are named lang
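
A minimal sketch of these path expressions, assuming Python's standard xml.etree.ElementTree module, which supports only a subset of XPath. Paths are evaluated relative to the element they are called on (here the bookstore root element), and ElementTree returns elements rather than attribute nodes, so //@lang is approximated by selecting the elements that carry the attribute. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element

# bookstore/book: relative to the root element this is simply "book"
books = root.findall("book")

# //book: ElementTree spells "anywhere below this element" as ".//book"
all_books = root.findall(".//book")

# //@lang: select the elements that have the attribute, then read its value
langs = [e.get("lang") for e in root.findall(".//*[@lang]")]

# ..: the parent of every <title> element
title_parents = root.findall(".//title/..")

print([b.tag for b in books], langs, [p.tag for p in title_parents])
```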

Predicates
• Predicates are used to find a specific node or a node that contains a specific value.
• Predicates are always enclosed in square brackets.
Some path expressions with predicates and their results

Path Expression                 Result
/bookstore/book[1]              Selects the first book element that is a child of the bookstore element.
                                Note: In IE 5, 6, 7, 8, and 9 the first node is [0], but according to the W3C it is [1]. To solve this problem in IE, set the SelectionLanguage to XPath. In JavaScript: xml.setProperty("SelectionLanguage","XPath");
/bookstore/book[last()]         Selects the last book element that is a child of the bookstore element
/bookstore/book[last()-1]       Selects the last but one book element that is a child of the bookstore element
/bookstore/book[position()<3]   Selects the first two book elements that are children of the bookstore element
Some path expressions with predicates and their results (2)

Path Expression                      Result
//title[@lang]                       Selects all the title elements that have an attribute named lang
//title[@lang='en']                  Selects all the title elements that have a "lang" attribute with a value of "en"
/bookstore/book[price>35.00]         Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title   Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
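
The predicates above can be tried with Python's standard xml.etree.ElementTree, as a rough sketch. ElementTree accepts positional predicates ([1], [last()], [last()-1]) and attribute tests, but not comparisons such as price>35.00 or position()<3, so those two are emulated in plain Python here. The four-book sample data and the helper name titles are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>"""
root = ET.fromstring(XML)


def titles(books):
    """Return the title text of each book element (hypothetical helper)."""
    return [b.findtext("title") for b in books]


first = titles(root.findall("book[1]"))                # /bookstore/book[1]
last = titles(root.findall("book[last()]"))            # /bookstore/book[last()]
last_but_one = titles(root.findall("book[last()-1]"))  # /bookstore/book[last()-1]
english = root.findall(".//title[@lang='en']")         # //title[@lang='en']

# /bookstore/book[position()<3]: emulated with a slice
first_two = titles(root.findall("book")[:2])

# /bookstore/book[price>35.00]: emulated with a Python filter
expensive = titles(
    b for b in root.findall("book") if float(b.findtext("price")) > 35.00
)

print(first, last, last_but_one, first_two, expensive)
```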

Selecting Unknown Nodes
• XPath wildcards can be used to select unknown XML nodes.

Wildcard   Description
*          Matches any element node
@*         Matches any attribute node
node()     Matches any node of any kind

• Some path expressions and their results:

Path Expression   Result
/bookstore/*      Selects all the child element nodes of the bookstore element
//*               Selects all elements in the document
//title[@*]       Selects all title elements which have at least one attribute of any kind
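
A small sketch of the wildcard expressions, again assuming Python's standard xml.etree.ElementTree. The * wildcard is supported directly; @* is not, so the last expression is approximated by checking each element's attribute dictionary. The sample document is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
</bookstore>"""
root = ET.fromstring(XML)

# /bookstore/*: all child elements of the root element
children = [e.tag for e in root.findall("*")]

# //*: every element in the document; iter() also yields the root itself
everything = [e.tag for e in root.iter()]

# //title[@*]: title elements with at least one attribute (emulated in Python)
with_attributes = [e.tag for e in root.iter("title") if e.attrib]

print(children, everything, with_attributes)
```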
Selecting Several Paths
• The | operator in an XPath expression lets you select several paths.

Path Expression                   Result
//book/title | //book/price       Selects all the title AND price elements of all book elements
//title | //price                 Selects all the title AND price elements in the document
/bookstore/book/title | //price   Selects all the title elements of the book elements of the bookstore element AND all the price elements in the document
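
Python's standard xml.etree.ElementTree does not implement the | operator, so the sketch below only emulates it: the hypothetical helper union evaluates each path separately and returns the combined elements once each, in document order, which is how a node-set union behaves. The sample data is assumed for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>Harry Potter</title><price>29.99</price></book>
</bookstore>"""
root = ET.fromstring(XML)


def union(context, *paths):
    """Emulate path1 | path2 | ...: no duplicates, document order."""
    selected = {id(e) for path in paths for e in context.findall(path)}
    return [e for e in context.iter() if id(e) in selected]


# //title | //price
both = union(root, ".//title", ".//price")
# Selecting the same nodes twice does not duplicate them
same = union(root, ".//title", ".//book/title")

print([e.tag for e in both], [e.text for e in same])
```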

Location Path Expression
• A location path consists of one or more steps, separated by slashes.
• A location path can be absolute or relative.
◦ An absolute location path starts with a slash "/".
◦ A relative location path does not.

An absolute location path:
/step/step/...

A relative location path:
step/step/...
Step
• Each step is evaluated against the nodes in the current node-set.
• A step consists of:
◦ an axis (defines the tree relationship between the selected nodes and the current node)
◦ a node-test (identifies a node within an axis)
◦ zero or more predicates (to further refine the selected node-set)
• The syntax for a location step is:
axisname::nodetest[predicate1]…[predicateN]
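
To make the step syntax concrete, here is a minimal sketch that splits one unabbreviated step into its axis, node-test, and predicates. It is an illustration only, not a full XPath parser: the function name parse_step is hypothetical, nested brackets inside predicates and abbreviations such as @ or .. are not handled, and a missing axis is taken to be child.

```python
import re

# axis (optional) :: node-test, followed by zero or more [predicate] groups
STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?P<preds>(?:\[[^\[\]]*\])*)$"
)


def parse_step(step):
    """Split a location step into (axis, node_test, [predicates])."""
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"not a simple location step: {step!r}")
    axis = match.group("axis") or "child"  # child is the default axis
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("preds"))
    return axis, match.group("test"), predicates


print(parse_step("child::book[price>35][1]"))
print(parse_step("ancestor-or-self::book"))
print(parse_step("title"))
```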

XPath Axes
• An axis represents a relationship to the context (current) node.
• It is used to locate nodes relative to the context node in the tree.
• The axis is optional in a step (the default is child).

AxisName             Result
ancestor             Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self     Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute            Selects all attributes of the current node
child                Selects all children of the current node
descendant           Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self   Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
XPath Axes (2)

AxisName            Result
following           Selects everything in the document after the closing tag of the current node
following-sibling   Selects all siblings after the current node
namespace           Selects all namespace nodes of the current node
parent              Selects the parent of the current node
preceding           Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes
preceding-sibling   Selects all siblings before the current node
self                Selects the current node
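
Python's standard xml.etree.ElementTree has no axis syntax, but several of the axes above can be emulated by walking the tree, which may help to see what each one selects. This is only a sketch for element nodes: the helper names and the parent_of map are assumptions, and results are returned in the order noted in the comments.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book>
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>"""
root = ET.fromstring(XML)

# Elements do not store their parent, so build a child -> parent map once
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """ancestor axis: parent, grandparent, ... (nearest first)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def descendants(node):
    """descendant axis: children, grandchildren, ... in document order."""
    return [d for d in node.iter() if d is not node]


def following_siblings(node):
    """following-sibling axis: siblings after the node."""
    parent = parent_of.get(node)
    siblings = list(parent) if parent is not None else []
    return siblings[siblings.index(node) + 1:] if siblings else []


def preceding_siblings(node):
    """preceding-sibling axis: siblings before the node, in document order."""
    parent = parent_of.get(node)
    siblings = list(parent) if parent is not None else []
    return siblings[:siblings.index(node)] if siblings else []


author = root.find("book/author")
print([n.tag for n in ancestors(author)])
print([n.tag for n in following_siblings(author)])
```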

Filters (node tests)
• Select the type of node of interest on the chosen axis (any node, any element or a specific element, comments, etc.).
• Mandatory: the filter describes the subset of nodes to keep from the selected axis.
Filters (node tests)
• Two ways to filter the nodes of an axis:
◦ By their name
▪ For nodes that have a name (Element, Attribute, ProcessingInstruction)
▪ * : any name
◦ By their type
▪ text() : text nodes
▪ comment() : comment nodes
▪ processing-instruction() : ProcessingInstruction nodes
▪ node() : all node types
Predicates
• Are optional.
• Describe additional filtering: conditions (combined by logical operators) that the nodes must satisfy.
• They select nodes among those retained by the filter on the axis.
Predicates
• A predicate is a Boolean expression consisting of tests connected by the logical operators and and or
◦ Negation: by the not() function
• Test: an elementary Boolean expression
◦ Comparison
◦ Boolean function call
◦ Path expression converted to Boolean
▪ Node-set: false if the set is empty, otherwise true
XPath examples with axes

Example                  Result
child::book              Selects all book nodes that are children of the current node
attribute::lang          Selects the lang attribute of the current node
child::*                 Selects all element children of the current node
attribute::*             Selects all attributes of the current node
child::text()            Selects all text node children of the current node
child::node()            Selects all children of the current node
descendant::book         Selects all book descendants of the current node
ancestor::book           Selects all book ancestors of the current node
ancestor-or-self::book   Selects all book ancestors of the current node, and the current node as well if it is a book node
child::*/child::price    Selects all price grandchildren of the current node
XPath Standard Functions
• Later XPath versions include over 200 built-in functions (by the source's count); XPath 1.0 defines fewer than 30.
• There are functions for string values, numeric values, booleans, date and time comparison, node manipulation, sequence manipulation, and much more.
• Today XPath expressions can also be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.
XPath Standard Functions
There are many functions; here are some of the most important:
• For nodes
◦ count(expr): number of nodes in the set produced by the expression expr
◦ name(): name of the context node
▪ local-name(), namespace-uri(): the components of a namespace-qualified name
• For strings
◦ concat(ch1, ch2, …): concatenation
◦ contains(ch1, ch2): checks whether ch1 contains ch2
◦ substring(ch, pos, l): extracts from ch the substring of length l starting at position pos (positions start at 1)
◦ string-length(ch): length of the string
XPath Standard Functions
• For Booleans
◦ true(), false(): the true/false values
◦ not(expr): negation of a logical expression
• For numerics
◦ floor(n), ceiling(n), round(n): rounding functions applied to a numeric value
◦ sum(expr), avg(expr): sum and average of the numerical values of the nodes in the set produced by the expression expr (avg() is available from XPath 2.0 onward)
XPath Standard Functions
Some functions take no parameters but depend on the current node:
• position(): the position (number) of the current node in the list of nodes under consideration;
• last(): the position of the last node in that list, that is, the number of nodes under consideration.
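
As a rough illustration of what the functions listed above compute, the sketch below reproduces a few of them in plain Python over a document parsed with the standard xml.etree.ElementTree (which cannot evaluate XPath function calls itself). The sample data and the helper xpath_substring are assumptions; the helper covers integer arguments only and follows the 1-based positions of substring().

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title>Everyday Italian</title><price>30.00</price></book>
  <book><title>Harry Potter</title><price>29.99</price></book>
  <book><title>XQuery Kick Start</title><price>49.99</price></book>
  <book><title>Learning XML</title><price>39.95</price></book>
</bookstore>"""
root = ET.fromstring(XML)
books = root.findall("book")

book_count = len(books)                                       # count(/bookstore/book)
price_total = sum(float(b.findtext("price")) for b in books)  # sum(/bookstore/book/price)
has_xml = ["XML" in b.findtext("title") for b in books]       # contains(title, 'XML')


def xpath_substring(s, pos, length):
    """substring(s, pos, length) for integer arguments; positions start at 1."""
    start = max(pos, 1)   # positions before 1 select nothing
    end = pos + length    # first position NOT included
    return s[start - 1:end - 1] if end > start else ""


# position() and last() for each book while iterating over /bookstore/book
positions = [(i, len(books)) for i, _ in enumerate(books, start=1)]

print(book_count, round(price_total, 2), has_xml)
print(xpath_substring("12345", 2, 3), positions)
```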

XPath Operators
• An XPath expression returns either a node-set, a string, a Boolean, or a number.

Operator   Description                    Example
|          Union of two node-sets         //book | //cd
+          Addition                       6 + 4
-          Subtraction                    6 - 4
*          Multiplication                 6 * 4
div        Division                       8 div 4
=          Equal                          price=9.80
!=         Not equal                      price!=9.80
<          Less than                      price<9.80
<=         Less than or equal to          price<=9.80
>          Greater than                   price>9.80
>=         Greater than or equal to       price>=9.80
or         Logical or                     price=9.80 or price=9.70
and        Logical and                    price>9.00 and price<9.90
mod        Modulus (division remainder)   5 mod 2
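
One detail worth a short sketch: the mod operator returns the remainder of a truncating division, so the result takes the sign of the left operand. Python's % operator behaves differently for negative numbers, so math.fmod is the closer match. The function name xpath_mod is hypothetical, and a zero divisor is not handled here.

```python
import math


def xpath_mod(a, b):
    """Remainder of truncating division; the sign follows the dividend."""
    return math.fmod(a, b)


print(xpath_mod(5, 2), xpath_mod(-5, 2))  # truncating remainder, as in XPath
print(-5 % 2)                             # Python's % rounds toward -infinity
```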

Source: https://www.w3schools.com/

XPath Examples
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>

<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>

<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>

<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>

</bookstore>

/bookstore/book/title selects all the title nodes:
<title lang="en">Everyday Italian</title>
<title lang="en">Harry Potter</title>
<title lang="en">XQuery Kick Start</title>
<title lang="en">Learning XML</title>

XPath Examples
Using the same bookstore document as in the previous example:

/bookstore/book[1]/title        <title lang="en">Everyday Italian</title>
/bookstore/book[2]/title        <title lang="en">Harry Potter</title>
/bookstore/book[last()]/title   <title lang="en">Learning XML</title>

XPath Examples
All the price nodes:
/bookstore/book/price

All the price nodes with a price higher than 35:
/bookstore/book[price>35]/price

All the book nodes with a price higher than 35:
/bookstore/book[price>35]

All the title nodes of books with a price higher than 35:
/bookstore/book[price>35]/title
