XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (224 page)

The elements attribute of <xsl:preserve-space> (and likewise of <xsl:strip-space>) must contain a whitespace-separated list of NameTests. The form of a NameTest is defined in the XPath expression language; see Chapter 9, page 614. Each form of NameTest has an associated priority. The different forms of NameTest and their meanings are:

    NameTest    Meaning                                                          Priority
    QName       Matches elements with this name (namespace URI and local name)   0
    NCName:*    Matches all elements in the namespace bound to the prefix        –0.25
    *:NCName    Matches all elements with this local name, in any namespace      –0.25
    *           Matches all elements                                             –0.5

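The priority is used when conflicts arise. For example, if the stylesheet specifies <xsl:strip-space elements="*"/> together with an <xsl:preserve-space> declaration whose elements attribute lists specific element names, then whitespace-only text nodes appearing within any of the named elements will be preserved. Even though these elements match both the <xsl:strip-space> and the <xsl:preserve-space>, the NameTest in the latter has higher priority (0 as compared to –0.5).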
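
The priority rule can be sketched with a small stylesheet. This is only an illustration: the element names para and h1 are assumptions chosen for the example, and the identity template is included merely so that the effect shows up in the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- NameTest "*" has priority -0.5: strip whitespace-only text nodes everywhere... -->
  <xsl:strip-space elements="*"/>

  <!-- ...except within these elements. A QName has priority 0, so this
       declaration wins for them. The names para and h1 are hypothetical. -->
  <xsl:preserve-space elements="para h1"/>

  <!-- Identity template, so that the result of stripping is visible in the output -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The order of the two declarations does not matter here, because the conflict is settled by priority rather than by declaration order.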

If there is an <xsl:strip-space> element that matches the parent element, and also an <xsl:preserve-space> element that matches, then the decision depends on the import precedence and priority of the respective rules. Taking into consideration all the <xsl:strip-space> and <xsl:preserve-space> elements that match the parent element of the whitespace-only text node, the XSLT processor takes the one with highest import precedence (as defined in the rules for <xsl:import> on page 359). If there is more than one element with this import precedence, it takes the one with highest priority, as defined in the table above. If there is still more than one, and they are different (one preserve, one strip), the processor may either report an error, or choose the one that comes last in declaration order.
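
As a sketch of how import precedence settles a conflict, consider two modules. The file names base.xsl and main.xsl and the element name para are assumptions made for this example. First the imported module:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- base.xsl (hypothetical file name) -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Lower import precedence, because this module is the one being imported -->
  <xsl:preserve-space elements="para"/>

</xsl:stylesheet>
```

and then the module that imports it:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- main.xsl (hypothetical file name) -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="base.xsl"/>

  <!-- Same NameTest, hence the same priority (0), but higher import precedence:
       whitespace-only text nodes whose parent is para are stripped. -->
  <xsl:strip-space elements="para"/>

</xsl:stylesheet>
```

Priority is never consulted in this case, because only one matching declaration has the highest import precedence.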

In deciding whether to strip or preserve a whitespace-only text node, only its immediate parent element is considered in the above rules. The rules for its other ancestors make no difference. The element itself, of course, is never removed from the tree: the stripping process will only remove text nodes.

Regardless of the <xsl:strip-space> and <xsl:preserve-space> declarations, if an individual element has the XML-defined attribute xml:space="preserve", then all descendant text nodes are preserved, unless this is cancelled by xml:space="default". If an <xsl:strip-space> doesn't seem to be having any effect, one possible reason is that the element type in question is declared in the DTD to have an xml:space attribute with a default value of preserve. There is no way of overriding this in the stylesheet.
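
A source document along the following lines illustrates the scoping of xml:space. The element names (doc, verse, line, note, p) are invented for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <!-- Whitespace-only text nodes inside verse and its descendants are kept,
       even if the stylesheet says <xsl:strip-space elements="*"/> -->
  <verse xml:space="preserve">
    <line>first</line>
    <line>second</line>
    <!-- xml:space="default" cancels the inherited setting from here down,
         so the stylesheet's stripping rules apply again inside note -->
    <note xml:space="default">
      <p>some text</p>
    </note>
  </verse>
</doc>
```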

The <xsl:strip-space> and <xsl:preserve-space> declarations in the stylesheet are also ignored for a whitespace text node that forms the content of an element defined in the schema to have simple content. This is because changing the content of such an element could make it invalid. For example, if the schema defines the type as having a facet such as <xs:minLength value="1"/>, then a value consisting of a single space is valid, but the element becomes invalid if the space is removed.

Usage

For many categories of source document, especially those used to represent data structures, whitespace-only text nodes are never significant, so it is useful to specify:

    <xsl:strip-space elements="*"/>

which will remove them all from the tree. There are two main advantages in stripping these unwanted nodes:

  • When <xsl:apply-templates> is used with a default select attribute, all child nodes will be processed. If whitespace-only text nodes are not stripped, they too will be processed, probably leading to the whitespace being copied to the output destination.
  • When the position() function is used to determine the position of an element relative to its siblings, the whitespace-only text nodes are included in the count. This often leads to the significant nodes being numbered 2, 4, 6, 8, and so on.
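
Both effects can be observed with a small experiment. The following is a minimal sketch; the element names list and item are assumptions, and it presumes that the XML parser passes whitespace-only text nodes through to the XSLT processor. Take this indented source document:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<list>
  <item>a</item>
  <item>b</item>
  <item>c</item>
</list>
```

and this stylesheet:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Remove this declaration to see the whitespace-only text nodes
       being counted and copied to the output. -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="list">
    <!-- No select attribute: all child nodes are processed -->
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="item">
    <!-- position() is relative to the nodes selected by xsl:apply-templates -->
    <xsl:value-of select="position()"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the declaration in place the items should be numbered 1, 2, 3. Without it, the whitespace-only text nodes between the items take the odd positions, so the items are numbered 2, 4, 6, and the built-in template for text nodes copies the intervening whitespace to the output.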

Generally speaking, it is a good idea to strip whitespace-only text nodes belonging to elements that have element content, that is, elements declared in the DTD as containing child elements but no #PCDATA, or declared in a schema to have a complex type with mixed="false".

By contrast, stripping whitespace-only text nodes from elements with mixed content (elements declared in the DTD or schema to contain both child elements and #PCDATA) is often a bad idea. For example, consider the element below:

    He went to Balliol College Oxford to read Greats

The space between the element ending in College and the element beginning with Oxford is a whitespace-only text node, and it should be preserved, because otherwise, when the tags are removed by an application that is only interested in the text, the words College and Oxford will run together.
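
To make this concrete, here is one way such an element might be marked up. The tag names quote, edu, city, and course are purely hypothetical, chosen for illustration.

```xml
<!-- The only character between </edu> and <city> is a single space,
     which forms a whitespace-only text node whose parent is quote -->
<quote>He went to <edu>Balliol College</edu> <city>Oxford</city> to read <course>Greats</course></quote>
```

Under those assumptions, a stylesheet that strips whitespace generally could still protect this mixed-content element:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Strip whitespace-only text nodes by default -->
  <xsl:strip-space elements="*"/>

  <!-- Only the immediate parent matters, so it is the (hypothetical)
       quote element that has to be named here -->
  <xsl:preserve-space elements="quote"/>

</xsl:stylesheet>
```

Naming the parent rather than the neighbouring child elements follows from the rule that only the immediate parent of the whitespace-only text node is considered.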
