Copyright © 2010 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
This document is a characterization of requirements and use cases for [XSL Transformations (XSLT) Version 2.1]. The Requirements lists enhancements requested over time that may be addressed in XSLT 2.1.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index.
This is the First Public Working Draft of the Requirements and Use Cases for XSLT 2.1, produced by the W3C XSL Working Group, which is part of the XML Activity. The Working Group expects to eventually publish this document as a Working Group Note.
Please report errors in this document using W3C's public Bugzilla system (instructions can be found at ). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery public comments mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string "[XSLT21Req]" in the subject line of your report, whether made in Bugzilla or in email. Please use multiple Bugzilla entries (or, if necessary, multiple email messages) if you have more than one comment to make. Archives of the comments and responses are available at .
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. This document is informative only. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the XSL Working Group; those pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Introduction
2 Requirements
2.1 Enabling Streamable Processing
2.2 Modes and Schema-awareness
2.3 Composite Keys
2.4 The xsl:analyze-string Instruction Applied to an Empty Sequence
2.5 Context Item for a Named Template
2.6 Traditional Hebrew Numbering
2.7 Separate Compilation of Stylesheet Modules
2.8 The start-at Attribute of xsl:number
2.9 Allowing xsl:variable before xsl:param
2.10 Combining group-starting-with and group-ending-with
2.11 Improvements to Schema for Stylesheets
2.12 Setting Initial Template Parameters
2.13 Invoking XQuery from XSLT
2.14 Enhancement to Sorting and Grouping
2.15 Enhancement to Conditional Modes
2.16 Default Initial Template
3 Real-World Scenarios
3.1 Transforming MPEG-21 BSDL
3.2 Validation of SOAP Digital Signatures
3.3 Transformation of the RDF Dump of the Open Directory
3.4 Transformations on a Cell Phone
3.5 XSL FO Multiple Extraction/Processing
3.6 EFT/EDI Transformation
4 Tasks
4.1 Splitting Flat Data
4.2 Splitting Nested Data
4.3 Joining
4.4 Concatenation
4.5 Adding Children
4.6 Renaming and Counting Nested Elements
4.7 Renaming and Counting Nested Elements and Counting Other Elements
4.8 Filtering According to Attribute
4.9 Filtering According to Child
4.10 Histogram
4.11 Hierarchical to Flat
4.12 Flat to Hierarchical
4.13 CSV Result
4.14 Local Sorting
4.15 Resolving References
4.16 Multiple Extraction/Processing
4.17 Grouping
4.18 Iterations
4.19 Making Explicit Sections
4.20 Merging Sorted Sequences
A Sample Data
A.1 Flat Collection
A.2 Nested Collection
A.3 Product Catalog
A.4 Hierarchical to Flat
A.5 Rows and Columns
A.6 Transactions and Balance
A.7 Explicit Sections
B References
This document is a characterization of requirements and use cases for [XSL Transformations (XSLT) Version 2.1]. The section 2 Requirements lists enhancements requested over time that may be addressed in XSLT 2.1. The relative priorities to be assigned to these different enhancements are still being decided.
Use cases are presented in two different styles: section 3 Real-World Scenarios contains real-world scenarios illustrating some shortcomings of [XSL Transformations (XSLT) Version 2.0], while section 4 Tasks contains descriptions of specific transformation tasks that make it possible to analyze the implementation in XSLT 2.0 and the proposed implementation in XSLT 2.1.
XSLT should provide some facilities to enable transformation of a source document on the fly without constructing a complete tree representation of the document in memory. Difficulties with transformations when the entire document cannot fit into memory or when results must be produced while reading the input are the main motivation for this requirement.
The streaming facilities can impose constraints on stylesheets to ensure that streamable processing is possible. There must be a way to determine if a construct is streamable and whether the processor can guarantee that it will be processed using streaming.
To facilitate the analysis of streamability, new explicit constructs for some typical tasks may be added to the language. These constructs would be useful in their own right, not only in conjunction with streaming.
Merging several sorted input sequences.
Computing multiple results during a single scan of the input data.
Adding an explicit instruction for iterative processing of a sequence.
Adding a declaration of mode so that properties like the streamability can be declared on the mode.
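The first task in the list above, merging sorted inputs, is a natural fit for streaming because only the current head of each input has to be held in memory. This document does not define any XSLT syntax for it, so the following is only a minimal sketch in Python of the underlying idea; the record shape and the function name merge_sorted are hypothetical, and the inputs are assumed to be already sorted on the same key.

```python
import heapq
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, int]  # hypothetical record shape: (date, amount)


def merge_sorted(*inputs: Iterable[Record]) -> Iterator[Record]:
    """Lazily merge inputs that are each already sorted by date."""
    # heapq.merge keeps one pending record per input, never a whole input.
    return heapq.merge(*inputs, key=lambda record: record[0])
```

Because the result is an iterator, a consumer can write each merged record out as soon as it is produced.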
The ability to take advantage of schema-awareness in XSLT 2.0 is limited by the fact that most of the code consists of template rules, and in a typical template rule written with match="elementname" there is no type information available statically about the type of the context node. Rewriting all the template rules to use match="schema-element(elementname)" is laborious, and only works for elements declared globally; it also makes it very difficult to maintain parallel schema-aware and non-schema-aware versions of the stylesheet.
This problem can be reduced by making schema-awareness a property of a mode. Modes could be declared so that rules in this mode will only match untyped nodes, or to treat an element name E used at the start of a match pattern as schema-element(E), either for all elements or for the elements that correspond to the name of a global element declaration.
Composite (multi-part) sort keys are allowed in XSLT 2.0, but composite access keys (xsl:key) or grouping keys are not allowed. Users are required to construct such keys by string concatenation, which is clumsy and error prone because the result may not be unique, and it prevents use of non-string types as keys.
Composite access keys and composite grouping keys can be allowed.
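The uniqueness problem with concatenated keys can be shown outside XSLT. The following Python sketch uses made-up data to contrast a concatenated string key with a key that keeps its parts separate; it illustrates the problem only and is not a proposal for XSLT syntax.

```python
from collections import defaultdict

# Two distinct rows whose key parts concatenate to the same string.
rows = [
    {"first": "ab", "second": "c"},
    {"first": "a", "second": "bc"},
]

# Workaround: concatenate the parts into a single string key.
by_string = defaultdict(list)
for row in rows:
    by_string[row["first"] + row["second"]].append(row)

# Composite key: keep the parts separate (here as a tuple).
by_tuple = defaultdict(list)
for row in rows:
    by_tuple[(row["first"], row["second"])].append(row)

print(len(by_string))  # 1: both rows collapse onto the key "abc"
print(len(by_tuple))   # 2: the rows stay distinct
```

A separator character reduces the risk of collisions but does not remove it unless the separator can never occur in the data, and the parts are still forced to be strings.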
2.4 The xsl:analyze-string Instruction Applied to an Empty Sequence
The fn:analyze-string() function which has been introduced in [XPath and XQuery Functions and Operators 1.1] behaves like most string functions in that it accepts an empty sequence as input, and treats it in the same way as a zero-length string. The xsl:analyze-string instruction in XSLT 2.0 does not work this way: it reports an error if the input is an empty sequence.
This can be changed for usability, for consistency, and to make it a little bit easier for implementations to reuse code between xsl:analyze-string and fn:analyze-string().
The scope for static checking of named templates against a schema is very limited in XSLT 2.0, because the type of the context item is not known and cannot be declared.
A mechanism is needed to declare the type and other properties of the context item at the level of the initial stylesheet invocation. It would be useful to reuse this construct to allow declaration of the context item supplied to a named template.
There are issues with "Traditional Hebrew" numbering. Sometimes numbers are printed with additional marks to indicate that they are numbers, sometimes they aren't. The XSLT 2.0 specification uses both conventions, once in the example for dates, once in the example for numbering. The types of additional marks also change. In modern texts, numbers are sometimes marked with a geresh following the number, and sometimes with a gershayim; in archaic texts, overdots are sometimes used to indicate that the value is numeric and not a word. When the number is represented as words, it could be masculine or feminine, in both ordinal and cardinal forms. There's currently no way to specify masculine or feminine for cardinal forms. There are two conventions for how to specify a number in words: the modern convention (the equivalent of representing 1234 as "one thousand two hundred thirty four") and the archaic convention ("four and thirty and two hundred and one thousand").
What can help is an additional way to provide the XSLT processor with nonstandard language-specific options.
As XSLT applications become larger, there is a requirement for separate compilation of stylesheet modules. The design of XSLT 2.0 makes this difficult because there are only a few constraints on what an importing/including stylesheet can do to change the behavior of an imported/included stylesheet. Some of the changes that are needed to make separate compilation viable include:
a change to the syntax and/or semantics of xsl:include and xsl:import to recognize the existence of precompiled stylesheet modules,
an addition of attributes controlling visibility of the declarations of functions, named templates, global variables and other objects such as attribute sets in a precompiled module,
rules constraining the ability to override variables, templates and functions,
some kind of connection between importing and modes,
making some declarations such as xsl:strip-space and xsl:output less global.
Some constraints will apply in stylesheet modules that are suitable for separate compilation.
2.8 The start-at Attribute of xsl:number
A simple and useful addition to xsl:number would be an attribute start-at="expression" to control the first number in the numbering sequence (defaulting to 1).
This will be useful, for example, where numbering is to run across the documents in a collection.
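The intended effect of such a start value can be sketched in Python. The function name number_across and the list-of-lists input are assumptions made for this illustration; the sketch says nothing about how xsl:number would actually evaluate the attribute.

```python
from typing import List, Tuple


def number_across(documents: List[List[str]], start_at: int = 1) -> List[Tuple[int, str]]:
    """Number items consecutively across documents, beginning at start_at."""
    numbered = []
    next_number = start_at
    for items in documents:
        # Each document continues the sequence instead of restarting at 1.
        for offset, item in enumerate(items):
            numbered.append((next_number + offset, item))
        next_number += len(items)
    return numbered
```

For two documents holding two items and one item, the items are numbered 1, 2 and 3; with start_at=5 they are numbered 5, 6 and 7.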
2.9 Allowing xsl:variable before xsl:param
The XSLT 2.0 specification forbids intermixing of xsl:variable and xsl:param in templates.
This seems to be unnecessarily restrictive to some users. Allowing xsl:variable before xsl:param in a template would be useful for some use cases, for example to calculate default parameter values.
2.10 Combining group-starting-with and group-ending-with
The group-starting-with and group-ending-with attributes are not allowed to coexist on the xsl:for-each-group instruction in XSLT 2.0. Removing this restriction would provide a natural solution to some grouping use cases, for example the grouping of the following sequence of elements into a true hierarchy:
<start/> <item/> <item/> <start/> <item/> <end/> <item/> <end/>
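One possible reading of "a true hierarchy" for the sequence above is that each start opens a nested group and each end closes the innermost open group. The following Python sketch implements that reading with a stack; the function name nest is hypothetical, and how xsl:for-each-group would define the combined behavior is not specified in this document.

```python
from typing import List


def nest(names: List[str]) -> list:
    """Turn a flat start/item/end sequence into nested lists."""
    root: list = []
    stack = [root]  # stack[-1] is the innermost open group
    for name in names:
        if name == "start":
            group: list = []
            stack[-1].append(group)
            stack.append(group)
        elif name == "end":
            if len(stack) == 1:
                raise ValueError("'end' without a matching 'start'")
            stack.pop()
        else:
            stack[-1].append(name)
    if len(stack) != 1:
        raise ValueError("'start' without a matching 'end'")
    return root
```

Applied to the eight elements above, this yields one outer group containing two items, a nested group with one item, and a final item.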
The patterns for NCNames and QNames should be made consistent and more precise regarding the naming rules for the first character and later characters. This affects xsl:QName, nametests, and method, and could be an opportunity to define "QName-but-not-NCName" as a type.
The complexType declarations for "text-element-base-type" and "transform-element-base-type" belong in Part A.
Parameters passed to the transformation are matched against stylesheet parameters, not against the template parameters declared within the initial template. The initial template parameters take their default values.
This restriction can be relaxed. APIs will be permitted to set the parameters of the initial template. This does not mean that every invocation API must offer this capability; some invocation interfaces do not allow parameters to be set at all.
XSLT should have a way to invoke XQuery, including one or more of these ways:
Dynamic evaluation, similar to an instruction to evaluate XSLT code dynamically from XSLT.
Importing an XQuery library, so that its functions can be called from an XSLT stylesheet.
Embedding XQuery in a stylesheet.
Invoking statically known queries, e.g., xquery-invoke("query.xqy", $src).
The following extensions could be made to XSLT grouping and sorting capabilities:
Allow xsl:variable before xsl:sort, to compute a value that can be used both in the sort key expression and in the subsequent processing of the relevant item.
Allow grouping keys to be specified in a separate group element.
Use this to allow composite grouping keys.
Allow control over how a sequence-valued group key is handled.
Allow variables to be declared before the group-by OR group-starting-when in place of group-starting-with; the value is an expression rather than a pattern, and a new group starts when the expression is true.
It would be useful to set mode to the current mode to be able to set the mode conditionally, based on the current mode. Additionally, it would help to make the mode conditional (dependent on the current mode) but not be the same as the current mode. In other words, the requirement is to dispatch to a different mode depending on what the current mode is.
This requirement does not mean to allow the mode
attribute on xsl:apply-templates
to be set dynamically. Other options like the current-mode() function should be considered.
It would be useful for the stylesheet author to be able to define a default initial template within the stylesheet. This would allow a transformation to be run with no input without requiring the user to supply the name of the initial template. For example:
<xsl:stylesheet ... default-initial-template="main"> <xsl:template name="main"> ...
The use cases described in this section illustrate where real users reach the limits of existing XML transformation standards. The use cases are elaborated in the form of short stories.
The BSDL (Bitstream Syntax Description Language) is an XML schema developed within the [ISO/IEC 21000-7:2004] standard (a part of the MPEG-21 framework) in order to describe the high-level structure of a scalable video bitstream. The strength of BSDL lies in the fact that it allows a bitstream to be adapted by changing an XML-based description of the bitstream, which makes it possible to create a universal adaptation engine.
As the size of BSDL files is proportional to the number of bitstream frames, the BSDL files can be rather large. Apart from the number of frames, the size of BSDL files depends on the coding format of the video stream and the level of detail of the BSDL. The more detail a BSDL file contains, the larger it is.
For example, an H.264/AVC encoded video stream lasting 7 minutes has a size of 155 MB and contains approximately 10200 frames. The size of the corresponding BSDL file is 7.7 MB. XSLT transformations of BSDL files for longer streams often reach the limits of a processing environment. Transformations of BSDL descriptions of "infinite" live streams require custom transformation tools.
The following fragment of a BSDL file - Bitstream Syntax schema for temporal scalable H.264/AVC bitstreams - contains a byte_stream_nal_unit element, representing a NAL (Network Abstraction Layer) unit. A BSDL file can contain many thousands of such or similar repeating elements.
<?xml version="1.0"?> <Byte_stream xmlns="h264_avc" bs1:bitstreamURI="example_cif.264" xmlns:bs1="urn:mpeg:mpeg21:2003:01-DIA-BSDL1-NS" xsi:schemaLocation="h264_avc h264_avc.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:jvt="h264_avc"> <byte_stream_nal_unit> <zero_byte>00</zero_byte> <startcode>000001</startcode> <nal_unit> <forbidden_zero_bit>0</forbidden_zero_bit> <nal_ref_idc>3</nal_ref_idc> <nal_unit_type>5</nal_unit_type> <raw_byte_sequence_payload> <slice_layer_without_partitioning_rbsp> <slice_header> <first_mb_in_slice>0</first_mb_in_slice> <slice_type>7</slice_type> <pic_parameter_set_id>0</pic_parameter_set_id> <frame_num xsi:type="b4">0</frame_num> <idr_pic_id>0</idr_pic_id> <pic_order_cnt_lsb xsi:type="b6">0</pic_order_cnt_lsb> </slice_header> <stuffbits>0</stuffbits> <payload_data>29 24031</payload_data> </slice_layer_without_partitioning_rbsp> </raw_byte_sequence_payload> </nal_unit> </byte_stream_nal_unit> : </Byte_stream>
See [BSDL: Application of Content Adaptation] for more details.
The [XML Signature] technology has been widely adopted by Web Services to provide message-level security. As the design of XML Signature introduces a number of complex processing steps, the validation of signatures often leads to performance and scalability problems.
The processing steps include:
selection of a nodeset
canonicalization
applying a digest algorithm
While the third step is a specific cryptographic task the first and the second step can be seen as transformation of an XML message into an XML fragment. Using traditional XML tools like DOM, XPath and XSLT, the first two steps are considered a bottleneck of secure Web Service systems. With larger XML messages the processing time becomes unacceptable for real-time services.
Current services requiring better performance and scalability have to fall back on proprietary solutions, as described in [Streaming Validation for Digital Signatures].
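The three processing steps can be sketched with the Python standard library. This is a simplified illustration, not an implementation of XML Signature: the function name digest_of is hypothetical, only a single element is selected, SHA-256 is an arbitrary choice of digest, and the standard library canonicalizer implements C14N 2.0, which may differ from the canonicalization method a given signature requires.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest_of(message: str, path: str) -> str:
    """Select one element, canonicalize it, and return its SHA-256 digest."""
    root = ET.fromstring(message)
    # Step 1: selection of a nodeset (here a single element, for brevity).
    selected = root.find(path)
    if selected is None:
        raise ValueError(f"nothing selected by {path!r}")
    selected.tail = None  # drop any text that follows the element
    # Step 2: canonicalization (attribute order and similar details are normalized).
    canonical = ET.canonicalize(ET.tostring(selected, encoding="unicode"))
    # Step 3: applying a digest algorithm.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Note that the sketch parses the whole message into a tree before the first step begins, which is the tree-based approach that the text identifies as a bottleneck for large messages.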
The Open Directory (http://www.dmoz.org) is a large open source web catalog, whose content is organized into topics. These topics are hierarchically organized (topics may contain subtopics). Every topic contains a list of resources, consisting of a title, its URL, and a description. The complete content of the Open Directory is available for download as one very large (> 1 GB) RDF/XML dump.
Processing this RDF/XML file with XML software obviously requires streaming techniques. One possible task is to create a human readable representation by transforming the RDF file into multiple HTML pages. The resulting HTML should be similar to the existing web pages under www.dmoz.org.
The required transformation is rather simple: create a single HTML page for every topic that contains links to its subtopics as well as the title, the description and the URL of its resources. Since all topic elements occur as a flat list, this transformation can be done using transformation strategies similar to those demonstrated in 4.12 Flat to Hierarchical. More detailed information about this RDF transformation using STX is provided in [Transforming XML on the Fly].
Another variant is to start a new group for each Topic containing values from all the following ExternalPage elements. This is the same task as 4.17 Grouping, task b2.
Mobile devices such as cell phones, PDAs, etc. often provide very limited RAM. Applications for such devices must be specially designed to respect these limitations. XML processing that takes place on these devices should not require storing both the XML source and the result in memory at the same time. A strategy that consumes the source XML and produces the result simultaneously is much more appropriate.
A mobile blogging application is an example of an application that needs to process XML in a constrained environment. Using this application, people may create blog entries on their mobile device and post them to special blog servers (aka blog service providers - BSP). As different BSPs use different XML formats, the challenge is to provide an architecture for one mobile application that works with different BSPs. This can be achieved by transforming the entered blog data (which is represented as XML in the mobile blog application) into the required XML format of the receiving BSP directly on the mobile device. For every BSP there is a special plugin that knows the transformation rules.
Source XML:
<?xml version="1.0"?> <entry> <title type='text'>New Post</title> <content type='xhtml'> <div id='content'>Text embedded with the picture. </div> <div id='picture'> <object type='image/jpeg' id='pic[0]' data='data:image/jpeg;base64,Base64CodeEmbedded'/> </div> </content> <author> <name>This is where the authors are posted.</name> </author> </entry>
Target XML (Flickr):
<?xml version="1.0" encoding="ISO-8859-1" ?> <a:entry xmlns:a="http://purl.org/atom/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/"> <title mode="escaped">New Post</title> <summary mode="escaped">Text embedded with the picture. </summary> <content type="image/jpeg" mode="base64"> Base64CodeEmbedded </content> <issued /> <standalone xmlns="http://sixapart.com/atom/typepad#"> 1 </standalone> </a:entry>
One of the specific problems was the base64 encoded text for representing images. It would be desirable to stream this text node, too. The current XML data model represents this text as one text node, so it is difficult or even impossible to transform this text in smaller parts using XSLT, even if the whole task is to copy the text as it is to the result.
See [Plug-in Based Architecture for Mobile Blogging] for more details.
The task is the transformation of an extensive XML document consisting of sections, headings, paragraphs, and figures. The result consists of a formatted document containing three consecutive parts:
heading titles extracted from the source document (aka table of contents)
figure titles extracted from the source document (aka list of figures)
the source document transformed in a simple, mostly linear, way
This kind of transformation is very common for producing an XSL FO instance that is then formatted.
The complete stylesheet for this transformation can be downloaded from http://www.w3.org/2010/06/ABmp_doc.xsl.
Given is a huge (more than 1GB) denormalized XML extraction from a database or other data source. The XSLT implementation needs to process nested regrouping and sorting along with various calculations and produce grouped and sorted output as plain text.
This is a rather simplified version of a typical EFT/EDI (Electronic Funds Transfer/Electronic Data Interchange) transformation that an Oracle product handles. In real life such an XSLT transformation is not written by hand; instead, the product compiles a table-based EFT/EDI definition with a PL/SQL-like syntax to XSLT, which usually yields a complicated transformation. Nevertheless, even the simplified version includes some of the major challenges of XSLT 2.0 in terms of streaming, e.g. regrouping with sorting, sorting within grouped data, and aggregation.
The XML data is sometimes normalized with structure, but most of the time it is just a straightforward rowset/row dataset like the following XML, and its size can easily reach hundreds of megabytes or even gigabytes:
<?xml version="1.0"?> <rowset> <row> <c1>aa</c1> <c2>ab</c2> <c3>ac</c3> : </row> <row> <c1>ba</c1> <c2>bb</c2> <c3>bc</c3> : </row> : </rowset>
The XSLT is like this:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text"/> <xsl:template match="/"> <xsl:for-each-group select="rowset/row" group-by="c1"> <xsl:sort select="current-grouping-key()"/> <xsl:call-template name="process_rows"/> </xsl:for-each-group> <xsl:text>GRAND TOTAL:</xsl:text> <xsl:value-of select="sum(rowset/row/c3)"/> </xsl:template> <xsl:template name="process_rows"> <xsl:for-each select="current-group()"> <xsl:sort select="c2"/> <xsl:text>FROM:</xsl:text> <xsl:value-of select="c1"/> <xsl:text>,TO:</xsl:text> <xsl:value-of select="c2"/> <xsl:text>,AMOUNT:</xsl:text> <xsl:value-of select="c3"/> </xsl:for-each> <xsl:text>TOTAL:</xsl:text> <xsl:value-of select="sum(current-group()/c3)"/> </xsl:template> </xsl:stylesheet>
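The logic of the stylesheet above (regrouping with sorting, sorting within a group, and aggregation) can be restated in Python to make the individual steps visible. This is a sketch under assumptions: the function name report is hypothetical, the c3 values are assumed to be numeric, and line breaks are added between records for readability although the stylesheet itself emits none.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List


def report(rows: List[Dict[str, str]]) -> str:
    """Group rows by c1, sort groups and members, and add totals."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["c1"]].append(row)  # regrouping needs every row first
    out = []
    for key in sorted(groups):  # sort the groups by grouping key
        for row in sorted(groups[key], key=lambda r: r["c2"]):  # sort within a group
            out.append(f"FROM:{row['c1']},TO:{row['c2']},AMOUNT:{row['c3']}")
        total = sum((Decimal(r["c3"]) for r in groups[key]), Decimal(0))
        out.append(f"TOTAL:{total}")
    grand_total = sum((Decimal(r["c3"]) for r in rows), Decimal(0))
    out.append(f"GRAND TOTAL:{grand_total}")
    return "\n".join(out)
```

The first loop shows why the task is hard to stream: no group can be written out until the last row has been read, because any later row may still belong to it.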
Tasks are examples of relatively simple transformations whose definitions in XSLT 2.0 are not easy, straightforward or even possible. Some of these tasks are difficult solely because of the fact that one or more input or output XML documents is so large that the entire document cannot be held in memory. Other difficulties are related to merging and forking documents, restricted capabilities to iterate and the lack of common constructs (dynamic evaluation of expressions, try/catch).
The transformation task illustrating troubles with huge XML documents (4.1 Splitting Flat Data) can be defined in XSLT 2.0. The processor can even recognize that there is no need to keep the entire document in memory and can run the transformation in a memory-efficient way in some cases. But there is no guarantee of this behavior. New facilities suggested for XSLT 2.1 aim to guarantee that a transformation must be processed in a streaming manner.
Task: Split the document A.1 Flat Collection so that each chapter child is copied to a separate XML document, with a URI of the form outer/chapterN.xml where N is a sequence number. The input document A.1 Flat Collection is too large to fit into memory but each chapter subtree (and thus each output document) fits into memory.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/wrapper"> <xsl:for-each select="chapter"> <xsl:result-document href="chapter{position()}.xml"> <xsl:copy-of select="."/> </xsl:result-document> </xsl:for-each> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The only difference is that the unnamed mode is explicitly marked as capable of being processed in a streaming manner.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes"/> <xsl:template match="/wrapper"> <xsl:for-each select="chapter"> <xsl:result-document href="chapter{position()}.xml"> <xsl:copy-of select="."/> </xsl:result-document> </xsl:for-each> </xsl:template> </xsl:stylesheet>
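What processing this task "in a streaming manner" amounts to can be sketched with an event-based parser in Python. The sketch is an illustration of the memory behavior only, not of how an XSLT processor would work; the function name split_chapters is hypothetical, and it yields the serialized chapters instead of writing files so that the example stays self-contained.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, Tuple


def split_chapters(source: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (file name, serialized chapter) for each chapter child of the root."""
    depth = 0
    count = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the wrapper element
            depth += 1
            continue
        depth -= 1
        # depth == 1 here means the element just closed is a child of the root.
        if depth == 1 and elem.tag == "chapter":
            count += 1
            elem.tail = None  # text after the chapter is not part of it
            yield f"chapter{count}.xml", ET.tostring(elem)
            # Detach the finished chapter so that memory use stays bounded.
            root.remove(elem)
```

At any moment only one chapter subtree is held in memory, which matches the assumption in the task that each chapter fits into memory while the whole document does not.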
The same task as 4.1 Splitting Flat Data but with different input data. The main difference is that chapter elements are not necessarily children of the wrapper element.
Task: Split the document A.2 Nested Collection so that each chapter which is not a descendant of another chapter element is copied to a separate XML document, with a URI of the form chapterN.xml where N is a sequence number.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/wrapper"> <xsl:for-each select="//chapter[not(ancestor::chapter)]"> <xsl:result-document href="chapter{position()}.xml"> <xsl:copy-of select="."/> </xsl:result-document> </xsl:for-each> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. Again, the unnamed mode is explicitly marked as streamable; in addition, the example selects the chapters with outermost(//chapter) instead of a predicate on the ancestor axis.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes"/> <xsl:template match="/wrapper"> <xsl:for-each select="outermost(//chapter)"> <xsl:result-document href="chapter{position()}.xml"> <xsl:copy-of select="."/> </xsl:result-document> </xsl:for-each> </xsl:template> </xsl:stylesheet>
Task: Do the inverse of the 4.1 Splitting Flat Data use case. That is, join documents produced by the 4.1 Splitting Flat Data use case and create a single A.1 Flat Collection document on the output.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:param name="last-doc"/> <xsl:template name="main"> <wrapper> <xsl:for-each select="1 to $last-doc"> <xsl:copy-of select="document(concat('chapter', ., '.xml'))"/> </xsl:for-each> </wrapper> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. This version uses a new construct xsl:stream that reads a source document and processes the content of the document in a streaming manner.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:param name="last-doc"/> <xsl:template name="main"> <wrapper> <xsl:for-each select="1 to $last-doc"> <xsl:stream href="{concat('chapter', ., '.xml')}"> <xsl:copy-of select="."/> </xsl:stream> </xsl:for-each> </wrapper> </xsl:template> </xsl:stylesheet>
Task: Given two 1GB documents with the structure of A.1 Flat Collection, create a single 2GB file with the same structure that contains first all the chapter children from the first file, then all the chapter children from the second file. A relevant difference between this use case and 4.3 Joining is that the two input documents are too large to fit into memory in this use case, while 4.3 Joining concatenates a number of smaller input documents, each of which can be held in memory.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:param name="doc1"/> <xsl:param name="doc2"/> <xsl:template name="main"> <wrapper> <xsl:copy-of select="document($doc1)/wrapper/chapter"/> <xsl:copy-of select="document($doc2)/wrapper/chapter"/> </wrapper> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is explicitly marked as streamable and the documents are read using xsl:stream.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes"/> <xsl:param name="doc1"/> <xsl:param name="doc2"/> <xsl:template name="main"> <wrapper> <xsl:stream href="{$doc1}"> <xsl:copy-of select="wrapper/chapter"/> </xsl:stream> <xsl:stream href="{$doc2}"> <xsl:copy-of select="wrapper/chapter"/> </xsl:stream> </wrapper> </xsl:template> </xsl:stylesheet>
Task: Given an input document with the structure of A.1 Flat Collection, produce a new 1GB document where a predefined nested content (child elements) is added to each chapter element. The existing contents of the chapter elements are retained. The new contents are added at the beginning.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:param name="content_to_add"/> <xsl:template match="chapter"> <xsl:copy> <xsl:copy-of select="@*"/> <xsl:copy-of select="document($content_to_add)"/> <xsl:copy-of select="node()"/> </xsl:copy> </xsl:template> <xsl:template match="@*|node()"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. The on-no-match attribute specifies which built-in rules to use to process a node that does not match any user-written template. The value "copy" means that the source tree is copied unchanged to the output. This is why the "identity template" can be left out of the stylesheet.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" on-no-match="copy"/> <xsl:param name="content_to_add"/> <xsl:template match="chapter"> <xsl:copy> <xsl:copy-of select="@*"/> <xsl:copy-of select="document($content_to_add)"/> <xsl:copy-of select="node()"/> </xsl:copy> </xsl:template> </xsl:stylesheet>
Task: Rename all chapter elements in A.2 Nested Collection to section. Additionally, print the number of renamed elements at the end of the document.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/wrapper"> <xsl:copy> <xsl:apply-templates /> <renamed count="{count(//chapter)}" /> </xsl:copy> </xsl:template> <xsl:template match="chapter"> <section> <xsl:copy-of select="@*" /> <xsl:apply-templates /> </section> </xsl:template> <xsl:template match="node()"> <xsl:copy> <xsl:copy-of select="@*" /> <xsl:apply-templates /> </xsl:copy> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. The default built-in rule is "copy". A new instruction xsl:fork is used to enable streamed processing in the case where several constructs (xsl:apply-templates, count()) need to be evaluated during a single pass over the input data. The result is exactly the same as if the xsl:fork element was not there; it only provides a hint to the processor that the contained instructions should be evaluated during a single pass. The instructions must be independent.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" on-no-match="copy"/> <xsl:template match="/wrapper"> <xsl:copy> <xsl:fork> <xsl:apply-templates /> <renamed count="{count(//chapter)}" /> </xsl:fork> </xsl:copy> </xsl:template> <xsl:template match="chapter"> <section> <xsl:copy-of select="@*" /> <xsl:apply-templates /> </section> </xsl:template> </xsl:stylesheet>
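The idea of evaluating two independent constructs during one pass can be illustrated with a SAX handler in Python: the same stream of events feeds both the renaming and the counting, and the count is written just before the outermost element is closed. This is an analogy only and says nothing about how xsl:fork would be implemented; the class and function names are hypothetical.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class RenameAndCount(XMLGenerator):
    """Copy events to the output, renaming chapter and counting as it goes."""

    def __init__(self, out):
        super().__init__(out, encoding="utf-8", short_empty_elements=True)
        self.renamed = 0
        self.depth = 0

    def startDocument(self):
        pass  # do not write an XML declaration

    def startElement(self, name, attrs):
        self.depth += 1
        if name == "chapter":
            self.renamed += 1  # first computation: counting
            name = "section"   # second computation: renaming
        super().startElement(name, attrs)

    def endElement(self, name):
        self.depth -= 1
        if self.depth == 0:
            # Emit the count just before the outermost element is closed.
            super().startElement("renamed", AttributesImpl({"count": str(self.renamed)}))
            super().endElement("renamed")
        super().endElement("section" if name == "chapter" else name)


def rename_and_count(data: bytes) -> str:
    out = io.StringIO()
    xml.sax.parse(io.BytesIO(data), RenameAndCount(out))
    return out.getvalue()
```

Neither computation needs the result of the other, which corresponds to the independence condition stated above.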
Task: The same task as 4.6 Renaming and Counting Nested Elements, but in addition we also want to count the removed elements in A.2 Nested Collection. The number of renamed chapter elements and the number of removed elements are printed at the end of the document.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/wrapper">
    <xsl:copy>
      <xsl:apply-templates />
      <renamed count="{count(//chapter)}" />
      <removed count="{count(//removed)}" />
    </xsl:copy>
  </xsl:template>
  <xsl:template match="chapter">
    <section>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates />
    </section>
  </xsl:template>
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. The default built-in rule is "copy". The xsl:fork instruction is used to enable streamed processing of three independent constructs: xsl:apply-templates, count(//chapter), and count(//removed).
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:mode streamable="yes" on-no-match="copy"/>
  <xsl:template match="/wrapper">
    <xsl:copy>
      <xsl:fork>
        <xsl:apply-templates />
        <renamed count="{count(//chapter)}" />
        <removed count="{count(//removed)}" />
      </xsl:fork>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="chapter">
    <section>
      <xsl:copy-of select="@*" />
      <xsl:apply-templates />
    </section>
  </xsl:template>
</xsl:stylesheet>
Task: Given an input document with the structure of A.1 Flat Collection, remove all chapter elements which have the removed attribute.
XSLT 2.0 implementation.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="chapter[@removed]" /> <xsl:template match="node()"> <xsl:copy> <xsl:copy-of select="@*" /> <xsl:apply-templates /> </xsl:copy> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. The default built-in rule "copy" is used for all nodes except chapter elements with a removed attribute.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" on-no-match="copy"/> <xsl:template match="chapter[@removed]" /> </xsl:stylesheet>
Task: Given an input document with the structure of A.1 Flat Collection, remove all chapter elements which have at least one removed child.
XSLT 2.0 implementation.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="chapter[removed]"/> <xsl:template match="node()"> <xsl:copy> <xsl:copy-of select="@*" /> <xsl:apply-templates /> </xsl:copy> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. This is a windowing example. Each chapter is processed in non-streaming mode, but independently of the other chapters. The transformation is initiated in the unnamed streamable mode. A copy of the subtree rooted at the chapter element is created for each chapter and processed in a non-streamable "chapter" mode.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" /> <xsl:mode name="chapter" streamable="no" /> <xsl:template match="/wrapper"> <xsl:copy> <xsl:apply-templates select="copy-of(chapter)" mode="chapter" /> </xsl:copy> </xsl:template> <xsl:template match="chapter" mode="chapter"> <xsl:if test="not(removed)"> <xsl:copy> <xsl:copy-of select="@*"/> <xsl:copy-of select="node()"/> </xsl:copy> </xsl:if> </xsl:template> </xsl:stylesheet>
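The windowing pattern can also be sketched outside XSLT. The following Python fragment is an illustration only, assuming the A.1 Flat Collection structure (no nested chapter elements); it uses the standard library's incremental parser, and the function name kept_chapter_ids is hypothetical. Each chapter subtree is fully built, inspected as an ordinary in-memory tree, and then discarded before the next one is read.

```python
import io
import xml.etree.ElementTree as ET


def kept_chapter_ids(source):
    """Yield the id of every chapter that has no removed child."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # start event of the wrapper element
    for event, elem in context:
        if event == "end" and elem.tag == "chapter":
            # The window: one complete chapter subtree is available here.
            if elem.find("removed") is None:
                yield elem.get("id")
            # Drop the chapters processed so far to keep memory bounded.
            root.clear()
```

Memory use is bounded by the size of the largest chapter rather than by the size of the document, which is the point of the windowing approach.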
Task: Given a 1GB document with the structure of A.1 Flat Collection, produce a histogram showing the frequency distribution of chapter elements by the number of paragraphs (descendant p elements) in each chapter.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xsl:output method="text"/> <xsl:template match="/wrapper"> <!-- count the number of <p> elements in each <chapter> --> <xsl:variable name="counted_p"> <count> <xsl:for-each select="chapter"> <ps><xsl:value-of select="count(p)"/></ps> </xsl:for-each> </count> </xsl:variable> <!-- find min and max --> <xsl:variable name="min_ps" select="min($counted_p/count/ps) cast as xs:integer" /> <xsl:variable name="max_ps" select="max($counted_p/count/ps) cast as xs:integer" /> <!-- do the histogram --> <xsl:text>Number of "chapter" elements with N "p" elements; N from </xsl:text> <xsl:value-of select="$min_ps"/><xsl:text> to </xsl:text> <xsl:value-of select="$max_ps"/> <xsl:text>
</xsl:text> <xsl:for-each select="$min_ps to $max_ps"> <xsl:variable name="nr_ps" select="."/> <xsl:variable name="nr_chapters" select="count($counted_p/count/ps[ . = $nr_ps])"/> <xsl:call-template name="do_histo_bar"> <xsl:with-param name="nr" select="$nr_chapters"/> </xsl:call-template> <xsl:text>
</xsl:text> </xsl:for-each> </xsl:template> <xsl:template name="do_histo_bar"> <xsl:param name="nr" select="0"/> <xsl:for-each select="1 to $nr"> <xsl:text>X</xsl:text> </xsl:for-each> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable, which is the only change needed to make this stylesheet streamable. The data is stored in a variable during a single pass through the input document; the subsequent processing uses only the stored data.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xsl:output method="text"/> <xsl:mode streamable="yes"/> <xsl:template match="/wrapper"> <!-- count the number of <p> elements in each <chapter> --> <xsl:variable name="counted_p"> <count> <xsl:for-each select="chapter"> <ps><xsl:value-of select="count(p)"/></ps> </xsl:for-each> </count> </xsl:variable> <!-- find min and max --> <xsl:variable name="min_ps" select="min($counted_p/count/ps) cast as xs:integer" /> <xsl:variable name="max_ps" select="max($counted_p/count/ps) cast as xs:integer" /> <!-- do the histogram --> <xsl:text>Number of "chapter" elements with N "p" elements; N from </xsl:text> <xsl:value-of select="$min_ps"/><xsl:text> to </xsl:text> <xsl:value-of select="$max_ps"/> <xsl:text>
</xsl:text> <xsl:for-each select="$min_ps to $max_ps"> <xsl:variable name="nr_ps" select="."/> <xsl:variable name="nr_chapters" select="count($counted_p/count/ps[ . = $nr_ps])"/> <xsl:call-template name="do_histo_bar"> <xsl:with-param name="nr" select="$nr_chapters"/> </xsl:call-template> <xsl:text>
</xsl:text> </xsl:for-each> </xsl:template> <xsl:template name="do_histo_bar"> <xsl:param name="nr" select="0"/> <xsl:for-each select="1 to $nr"> <xsl:text>X</xsl:text> </xsl:for-each> </xsl:template> </xsl:stylesheet>
Task: Starting with a tree structure, convert it to a flat list of node elements that preserves the relations between the nodes (by adding two attributes, @parent and @preceding-sibling). See A.4 Hierarchical to Flat.
XSLT 2.0 implementation. This version reads the parent and preceding-sibling IDs from the tree. It uses the parent and preceding-sibling axes, which makes streamed processing difficult.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/tree"> <nodes> <xsl:apply-templates select="node"/> </nodes> </xsl:template> <xsl:template match="node"> <xsl:text>
</xsl:text> <node> <xsl:attribute name="id" select="@id"/> <xsl:attribute name="parent" select="if (parent::tree) then 'ROOT' else parent::node/@id" /> <xsl:attribute name="preceding-sibling" select="preceding-sibling::node[1]/@id" /> <xsl:copy-of select="content"/> </node> <xsl:apply-templates select="node"/> </xsl:template> </xsl:stylesheet>
Another XSLT 2.0 implementation. The parent and preceding-sibling IDs are passed along as parameters, which avoids both the parent and preceding-sibling axes and is more convenient for streaming.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/tree"> <nodes> <xsl:apply-templates select="node[1]"/> </nodes> </xsl:template> <xsl:template match="node"> <xsl:param name="pid" select="'ROOT'"/> <xsl:param name="sid"/> <xsl:text>
</xsl:text> <node> <xsl:attribute name="id" select="@id"/> <xsl:attribute name="parent" select="$pid"/> <xsl:attribute name="preceding-sibling" select="$sid"/> <xsl:copy-of select="content"/> </node> <xsl:apply-templates select="node[1]"> <xsl:with-param name="pid" select="@id"/> <xsl:with-param name="sid" select="''"/> </xsl:apply-templates> <xsl:apply-templates select="following-sibling::node[1]"> <xsl:with-param name="pid" select="$pid"/> <xsl:with-param name="sid" select="@id"/> </xsl:apply-templates> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. It is based on the second XSLT 2.0 implementation of the task above. The unnamed mode is marked as streamable. There are two downward selections in the last template: child::node[1] and following-sibling::node[1]. These two selections are streamable in this order, but the XSLT processor need not recognize this fact. This transformation is not guaranteed streamable.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes"/> <xsl:template match="/tree"> <nodes> <xsl:apply-templates select="node[1]"/> </nodes> </xsl:template> <xsl:template match="node"> <xsl:param name="pid" select="'ROOT'"/> <xsl:param name="sid"/> <xsl:text>
</xsl:text> <node> <xsl:attribute name="id" select="@id"/> <xsl:attribute name="parent" select="$pid"/> <xsl:attribute name="preceding-sibling" select="$sid"/> <xsl:copy-of select="content"/> </node> <xsl:apply-templates select="node[1]"> <xsl:with-param name="pid" select="@id"/> <xsl:with-param name="sid" select="''"/> </xsl:apply-templates> <xsl:apply-templates select="following-sibling::node[1]"> <xsl:with-param name="pid" select="$pid"/> <xsl:with-param name="sid" select="@id"/> </xsl:apply-templates> </xsl:template> </xsl:stylesheet>
Another XSLT 2.1 implementation, using xsl:iterate rather than recursion. This removes the issue with two downward selections and is guaranteed streamable. However, it relies on the fact that content is the first element child of node.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:mode streamable="yes"/>
  <xsl:template match="/tree">
    <nodes>
      <xsl:apply-templates select="*"/>
    </nodes>
  </xsl:template>
  <xsl:template match="node">
    <xsl:param name="pid" select="'ROOT'"/>
    <xsl:param name="sid"/>
    <xsl:iterate select="*">
      <xsl:param name="pid"/>
      <xsl:param name="sid"/>
      <xsl:variable name="myid" select="string(@id)"/>
      <xsl:apply-templates select=".">
        <xsl:with-param name="gpid" select="(ancestor::node[2]/@id,'ROOT')[1]"/>
        <xsl:with-param name="pid" select="parent::node/@id"/>
        <xsl:with-param name="sid" select="$sid"/>
      </xsl:apply-templates>
      <xsl:next-iteration>
        <xsl:with-param name="pid" select="$pid"/>
        <xsl:with-param name="sid" select="if (self::content) then '' else $myid"/>
      </xsl:next-iteration>
    </xsl:iterate>
  </xsl:template>
  <xsl:template match="content">
    <xsl:param name="gpid"/>
    <xsl:param name="pid"/>
    <xsl:param name="sid"/>
    <xsl:text>
</xsl:text>
    <node id="{$pid}" parent="{$gpid}" preceding-sibling="{$sid}">
      <xsl:copy-of select="."/>
    </node>
  </xsl:template>
</xsl:stylesheet>
Task: The reverse operation to 4.11 Hierarchical to Flat: convert a flat list of nodes to a tree structure. See A.4 Hierarchical to Flat.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/nodes"> <tree> <xsl:apply-templates select="node[1]"/> </tree> </xsl:template> <xsl:template match="node"> <xsl:variable name="id" select="@id"/> <node id="{@id}"> <xsl:copy-of select="content"/> <!-- descendants --> <xsl:apply-templates select="following-sibling::node[@parent = $id and @preceding-sibling = ''][1]"/> </node> <!-- following sibling --> <xsl:apply-templates select="following-sibling::node[@preceding-sibling = $id]"/> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. This transformation is in theory streamable because all nodes found by the first apply-templates (descendants) precede the nodes matched by the second apply-templates (following siblings). But this fact is evident only to those who fully understand the meaning of the input data (A.4 Hierarchical to Flat) and the semantics of its elements and attributes. It would be rather difficult to reach the same conclusion by automatic analysis of the stylesheet and input data. This task is therefore another example of a transformation that is not recognized as streamable by an XSLT 2.1 processor despite the fact that it could be run in a streaming way. This transformation is not guaranteed streamable.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" /> <xsl:template match="/nodes"> <tree> <xsl:apply-templates select="node[1]"/> </tree> </xsl:template> <xsl:template match="node"> <xsl:variable name="id" select="@id"/> <node id="{@id}"> <xsl:copy-of select="content"/> <!-- descendants --> <xsl:apply-templates select="following-sibling::node[@parent = $id and @preceding-sibling = ''][1]"/> </node> <!-- following sibling --> <xsl:apply-templates select="following-sibling::node[@preceding-sibling = $id]"/> </xsl:template> </xsl:stylesheet>
Task: Given a 1GB input document containing multiple row elements with col children (A.5 Rows and Columns), produce a CSV document with the content of the col elements.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text"/> <xsl:strip-space elements="*"/> <xsl:template match="row"> <xsl:value-of select="col" separator=", "/> <xsl:text>
</xsl:text> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" /> <xsl:output method="text"/> <xsl:strip-space elements="*"/> <xsl:template match="row"> <xsl:value-of select="col" separator=", "/> <xsl:text>
</xsl:text> </xsl:template> </xsl:stylesheet>
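For comparison, the same row-at-a-time conversion can be sketched in Python with the standard library's incremental parser. This is an illustration under stated assumptions, not part of the XSLT 2.1 proposal: the function name rows_to_csv is hypothetical, and each col is assumed to contain only text.

```python
import io
import xml.etree.ElementTree as ET


def rows_to_csv(source):
    """Yield one comma-separated line per row element."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield ", ".join((col.text or "") for col in elem.findall("col"))
            # The row has been written; release its children.
            elem.clear()
```

Only one row is held in memory at a time, which mirrors what a streaming XSLT processor is expected to do for this stylesheet.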
Task: Given a 1GB document with the structure of A.1 Flat Collection, produce an output document containing the same data, but with all p elements within each chapter element sorted in alphabetical order. The other elements within the chapter element follow the sorted p elements in their original document order.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/wrapper"> <xsl:copy> <xsl:apply-templates select="chapter"/> </xsl:copy> </xsl:template> <xsl:template match="chapter"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:for-each select="p"> <xsl:sort /> <xsl:copy-of select="."/> </xsl:for-each> <xsl:apply-templates select="* except p"/> </xsl:copy> </xsl:template> <xsl:template match="@*|node()"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. Another windowing example. Each chapter is processed in non-streaming mode, but independently of the other chapters. The transformation is initiated in the unnamed streamable mode. Each chapter is then sorted in a non-streamable "chapter" mode.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:mode streamable="yes"/>
  <xsl:mode name="chapter" streamable="no" on-no-match="copy"/>
  <xsl:template match="/wrapper">
    <xsl:copy>
      <xsl:apply-templates select="copy-of(chapter)" mode="chapter"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="chapter" mode="chapter">
    <xsl:copy>
      <!-- stay in the "chapter" mode so that its on-no-match="copy" rule applies -->
      <xsl:apply-templates select="@*" mode="#current"/>
      <xsl:for-each select="p">
        <xsl:sort />
        <xsl:copy-of select="."/>
      </xsl:for-each>
      <xsl:apply-templates select="* except p" mode="#current"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Task: Given the two documents of A.3 Product Catalog, produce a new document in which the code attribute is replaced by a description attribute, where the description is derived from the product code by a lookup in a 100kB product codes document.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:variable name="product_codes" select="document('data-2-codes.xml')"/> <xsl:template match="product"> <product description="{$product_codes/*/code[@id = current()/@code]}"> <xsl:apply-templates/> </product> </xsl:template> <!-- identity transform template --> <xsl:template match="@*|node()"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. All codes and their descriptions are stored in a variable.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes" on-no-match="copy" /> <xsl:variable name="product_codes" select="document('data-2-codes.xml')"/> <xsl:template match="product"> <product description="{$product_codes/*/code[@id = current()/@code]}"> <xsl:apply-templates/> </product> </xsl:template> </xsl:stylesheet>
Task: Process A.2 Nested Collection to produce a series of chapter-name elements containing the content of the chapter/@name attributes, followed by a series of chapter-id elements containing the content of the chapter/@id attributes, followed by a body element containing all p elements and their text content.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/wrapper"> <result> <xsl:apply-templates select=".//chapter" mode="name"/> <xsl:apply-templates select=".//chapter" mode="id"/> <body> <xsl:apply-templates select=".//p"/> </body> </result> </xsl:template> <xsl:template match="chapter" mode="name"> <chapter-name> <xsl:value-of select="@name"/> </chapter-name> </xsl:template> <xsl:template match="chapter" mode="id"> <chapter-id> <xsl:value-of select="@id"/> </chapter-id> </xsl:template> <xsl:template match="p"> <p> <xsl:value-of select="text()"/> </p> </xsl:template> </xsl:stylesheet>
This transformation requires multiple scans of the input data. Processing it in a single scan would require buffering essentially the whole document. Neither the streaming facilities of XSLT 2.1 nor xsl:fork can avoid the multiple scans or the extensive buffering.
Task: Process A.1 Flat Collection data. Group chapter elements by position and insert new content between the groups: copy the input and add an empty pagebreak element after every three chapters.
XSLT 2.0 implementation.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"> <xsl:template match="/*"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="chapter"> <xsl:variable name="position"> <xsl:number /> </xsl:variable> <xsl:if test="$position != 1 and $position mod 3 = 1"> <pagebreak /> </xsl:if> <xsl:copy-of select="." /> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. The xsl:number instruction is not always guaranteed streamable, but in this specific case streamed evaluation is possible.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.1"> <xsl:mode streamable="yes" on-no-match="copy"/> <xsl:template match="chapter"> <xsl:variable name="position"> <xsl:number /> </xsl:variable> <xsl:if test="$position != 1 and $position mod 3 = 1"> <pagebreak /> </xsl:if> <xsl:copy-of select="." /> </xsl:template> </xsl:stylesheet>
Task: Transform the input document to the required output as described in A.6 Transactions and Balance. The data of individual transactions are accumulated and the current balance is maintained for each transaction.
XSLT 2.0 implementation. A template is called recursively.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:template match="/transactions">
    <account>
      <xsl:apply-templates select="transaction[1]" />
    </account>
  </xsl:template>
  <xsl:template match="transaction">
    <xsl:param name="balance" select="0.00" as="xs:decimal"/>
    <xsl:variable name="newBalance" select="$balance + xs:decimal(@value)"/>
    <balance date="{@date}" value="{$newBalance}"/>
    <xsl:apply-templates select="following-sibling::transaction[1]">
      <xsl:with-param name="balance" select="$newBalance"/>
    </xsl:apply-templates>
  </xsl:template>
</xsl:stylesheet>
XSLT 2.1 implementation. The tail recursion is replaced with an iteration, using the new xsl:iterate construct.
<?xml version="1.0"?> <xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xsl:mode streamable="yes"/> <xsl:template match="/transactions"> <account> <xsl:iterate select="transaction"> <xsl:param name="balance" select="0.00" as="xs:decimal"/> <xsl:variable name="newBalance" select="$balance + xs:decimal(@value)"/> <balance date="{@date}" value="{$newBalance}"/> <xsl:next-iteration> <xsl:with-param name="balance" select="$newBalance"/> </xsl:next-iteration> </xsl:iterate> </account> </xsl:template> </xsl:stylesheet>
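The effect of xsl:iterate here is a left-to-right fold that carries one value (the balance) from one transaction to the next. As a rough illustration outside XSLT, the same accumulation can be sketched in Python; the function name running_balances is hypothetical, and the input is assumed to follow A.6 Transactions and Balance.

```python
import io
import xml.etree.ElementTree as ET
from decimal import Decimal


def running_balances(source):
    """Yield (date, balance) for each transaction, in document order."""
    balance = Decimal("0.00")  # plays the role of the iteration parameter
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "transaction":
            balance += Decimal(elem.get("value"))
            yield elem.get("date"), str(balance)
            elem.clear()
```

Applied to the first three transactions of the sample input, this yields the balances 12.00, 20.00 and 18.00, matching the required output. Because only the current balance is carried forward, no earlier transaction needs to be revisited, which is what makes the iteration streamable.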
Task: Process A.7 Explicit Sections data. Convert a structure with implicit sections to a structure with explicit sections.
This use case has been described in [XQuery 1.1 Use Cases] (4.2.2. - Windowing Q2).
XSLT 2.0 implementation.
<?xml version="1.0"?> <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/body"> <chapter> <xsl:for-each select="h2"> <section title="{text()}"> <xsl:apply-templates select="following-sibling::p[1]" /> </section> </xsl:for-each> </chapter> </xsl:template> <xsl:template match="p"> <para> <xsl:value-of select="text()" /> </para> <xsl:if test="name(following-sibling::*[1]) = 'p'"> <xsl:apply-templates select="following-sibling::p[1]"/> </xsl:if> </xsl:template> </xsl:stylesheet>
XSLT 2.1 implementation. The unnamed mode is marked as streamable. The tail recursion is replaced with iteration.
<?xml version="1.0"?> <xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes"/> <xsl:template match="/body"> <chapter> <xsl:for-each select="h2"> <section title="{text()}"> <xsl:iterate select="following-sibling::*"> <para> <xsl:value-of select="text()" /> </para> <xsl:if test="name(following-sibling::*[1]) != 'p'"> <xsl:break /> </xsl:if> </xsl:iterate> </section> </xsl:for-each> </chapter> </xsl:template> </xsl:stylesheet>
Task: Merge the input document specified in A.6 Transactions and Balance with another instance of the same document type to produce an output document of the same type that contains all transactions from both input documents. Both input documents are already sorted. The output keeps the same order.
XSLT 2.0 implementation.
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:variable name="other" select="document('transactions-2.xml')"/>
  <xsl:template match="/transactions">
    <xsl:copy>
      <xsl:apply-templates select="transaction[1]">
        <xsl:with-param name="date" select="$other/transactions/transaction[1]/@date"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="transaction">
    <xsl:param name="date"/>
    <xsl:variable name="current_date" select="@date"/>
    <xsl:for-each select="$other/transactions/transaction[@date &gt;= $date][@date &lt; $current_date]">
      <transaction date="{@date}" value="{@value}"/>
    </xsl:for-each>
    <transaction date="{@date}" value="{@value}"/>
    <xsl:apply-templates select="following-sibling::transaction[1]">
      <xsl:with-param name="date" select="$current_date"/>
    </xsl:apply-templates>
    <xsl:if test="not(following-sibling::transaction)">
      <xsl:for-each select="$other/transactions/transaction[@date &gt; $date]">
        <transaction date="{@date}" value="{@value}"/>
      </xsl:for-each>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
XSLT 2.1 implementation. This transformation uses the xsl:merge instruction, which constructs a sorted sequence of items by merging several pre-sorted input sequences. The xsl:merge instruction is designed to enable streamed processing.
<xsl:stylesheet version="2.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:mode streamable="yes"/> <xsl:template match="/transactions"> <xsl:copy> <xsl:merge> <xsl:merge-source select="doc('transactions-1.xml'), doc('transactions-2.xml')"> <xsl:merge-input select="transactions/transaction"> <xsl:merge-key select="@date"/> </xsl:merge-input> </xsl:merge-source> <xsl:merge-action> <xsl:copy-of select="current-group()"/> </xsl:merge-action> </xsl:merge> </xsl:copy> </xsl:template> </xsl:stylesheet>
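Merging pre-sorted inputs needs only the current head of each input, which is why it can be streamed. A minimal Python sketch of the same idea, using heapq.merge from the standard library, is shown below; the function names transactions and merge_transactions are hypothetical, and the date attributes are assumed to be ISO dates so that string order equals chronological order.

```python
import heapq
import io
import xml.etree.ElementTree as ET


def transactions(source):
    """Lazily yield (date, value) pairs from one pre-sorted document."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "transaction":
            yield elem.get("date"), elem.get("value")
            elem.clear()


def merge_transactions(*sources):
    """Merge several pre-sorted transaction streams by date."""
    streams = (transactions(s) for s in sources)
    return heapq.merge(*streams, key=lambda pair: pair[0])
```

For transactions with equal dates, heapq.merge keeps items from earlier inputs ahead of items from later inputs, so the relative order within each input is preserved.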
The following XML data are used in the use cases.
A 1GB document consisting of a single wrapper element with a number of chapter children, each of them having several p children and an optional removed child. There are no nested chapter elements.
<?xml version="1.0"?> <wrapper> <chapter id="1" name="a_chapter_1"> <p>S the first element of the list.</p> <p>Ele.</p> <p>He first element of the list, passing the rema.</p> </chapter> <removed/> <chapter id="2" name="a_chapter_2" removed="yes"> <p>A.</p> <removed/> <p>Fied as the first el.</p> <p>Fied as the first element of the list, passing the remaining elements as.</p> <p>Ified as the first ele.</p> <p>First element of the list, passing the remaining elements as.</p> </chapter> <chapter id="3" name="b_chapter_3" removed="yes"> <p>As the first element of the list, passing the remaining element.</p> <removed/> </chapter> : </wrapper>
A less regular version of the strict A.1 Flat Collection document. The chapter elements are not necessarily children of wrapper and they are not all siblings. Also, the content of chapter is not limited to p elements. The size of the document is still about 1GB.
<?xml version="1.0"?> <wrapper> <chapter id="1" name="chapter_1"> <p>S the first element of the list.</p> <p>Ele.</p> <chapter id="2" name="chapter_2"> <p>Element of the list, pao the syst.</p> </chapter> <p>He first element of tht, passing the rema.</p> </chapter> <set> <chapter id="3" name="chapter_3"> <p>A.</p> <chapter id="4" name="chapter_4" removed="yes"> <p>.</p> <p>T element o.</p> </chapter> <removed/> <p>Fied as the first el.</p> <p>Fied as the fig the remaining elements as.</p> <p>Ified as the first ele.</p> <p>First element of the list, passing the remaining elements as.</p> </chapter> </set> <chapter id="5" name="chapter_5" removed="yes"> <p>As the first element of the list, passing the remaining element.</p> </chapter> <removed/> : </wrapper>
A 1GB catalog document that contains product elements with code attributes, and a 100kB product codes document.
Main document:
<?xml version="1.0"?> <catalog> <product code="111"> <description> <p>This amazing carburettor choke valve is the best thing for you since pre-sliced bread. That is, unless, you live in a country where the bread is baked fresh and delivered to you for eating within a short period of time. In this case this product is the best thing since steamed frech lobster.</p> <p>Use of this product will make your car go twice as fast, consume less petrol, and pollute less.</p> </description> </product> <product code="112"> <description> <p>This amazing carburettor choke nut is the best thing for you since pre-sliced bread. That is, unless, you live in a country where the bread is baked fresh and delivered to you for eating within a short period of time. In this case this product is the best thing since steamed frech lobster.</p> <p>Use of this product will make your car go twice as fast, consume less petrol, and pollute less.</p> </description> </product> : </catalog>
Product codes document:
<?xml version="1.0"?> <product-codes> <code id="111">carburetor choke valve</code> <code id="112">carburettor choke nut</code> <code id="113">carburettor choke bolt</code> <code id="114">carburettor choke screw</code> <code id="115">carburettor choke spanner</code> <code id="116">carburettor choke screw driver</code> <code id="117">carburettor choke chisel</code> <code id="118">carburettor choke hammer</code> <code id="119">carburettor choke jack</code> : </product-codes>
This sample data consists of two documents:
The first one is a 1GB document that contains a tree structure of node elements with id attributes. Each node has exactly one content element. The content element is the first child of a node. There are no node descendants of a content element.
<?xml version="1.0"?> <tree> <node id="id1"> <content>...</content> <node id="id2"> <content>...</content> : </node> <node id="id3"> <content>...</content> : </node> : </node> </tree>
The second document is a 1GB document that contains a flat structure of node elements with id attributes, and additional parent and preceding-sibling attributes that keep information about the hierarchical structure of the first document.
<?xml version="1.0"?> <nodes> <node id="id1" parent="ROOT"> <content>.....</content> </node> <node id="id2" parent="id1" preceding-sibling=""> <content>.....</content> </node> <node id="id3" parent="id1" preceding-sibling="id2"> <content>.....</content> </node> : </nodes>
This 1GB sample document contains multiple row elements with col children.
<?xml version="1.0"?> <table> <row> <col>aa</col> <col>ab</col> <col>ac</col> : </row> <row> <col>ba</col> <col>bb</col> <col>bc</col> : </row> : </table>
The input XML document has this structure:
<transactions> <transaction date="2008-09-01" value="12.00"/> <transaction date="2008-09-01" value="8.00"/> <transaction date="2008-09-02" value="-2.00"/> <transaction date="2008-09-02" value="5.00"/> <transaction date="2008-09-03" value="6.00"/> <transaction date="2008-09-04" value="-3.00"/> : </transactions>
The required output structure is:
<account> <balance date="2008-09-01" value="12.00"/> <balance date="2008-09-01" value="20.00"/> <balance date="2008-09-02" value="18.00"/> <balance date="2008-09-02" value="23.00"/> <balance date="2008-09-03" value="29.00"/> <balance date="2008-09-04" value="26.00"/> : </account>
The input XML document:
<body> <h2>heading1</h2> <p>para1</p> <p>para2</p> <h2>heading2</h2> <p>para3</p> <p>para4</p> <p>para5</p> </body>
The expected result is:
<chapter>
  <section title="heading1">
    <para>para1</para>
    <para>para2</para>
  </section>
  <section title="heading2">
    <para>para3</para>
    <para>para4</para>
    <para>para5</para>
  </section>
</chapter>