XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (620 page)

Unicode combining and non-spacing characters are counted individually, unless the implementation has normalized them. The implementation is allowed to turn strings into normalized form, but is not required to do so. In normalized form NFC, accents and diacriticals will frequently be merged with the letter that they modify into a single character. To assure yourself of consistent answers in such cases, the normalize-unicode() function should be called to force the string into normalized form.
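
The effect of normalization on the character count can be modelled outside XPath. The following is a minimal Python sketch, not XPath itself: it assumes that Python's len() on a str (which counts Unicode codepoints) stands in for string-length(), and that unicodedata.normalize("NFC", ...) stands in for normalize-unicode(). The variable names are illustrative only.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two codepoints, one visible letter.
decomposed = "e\u0301"

# NFC merges the base letter and the accent into the single character U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed))  # 2: the combining character is counted individually
print(len(composed))    # 1: after normalization the pair is one character
```

Whether an XPath processor reports 2 or 1 for the unnormalized string depends on whether it has normalized the input, which is why an explicit call to normalize-unicode() is the safer choice.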

Examples

These examples assume that the XPath expression is used in a host language that expands XML entity references and numeric character references; for example, XSLT or XQuery.

Expression                     Result
string-length("abc")           3
string-length("&lt;&gt;")      2
string-length("""")            1
string-length("")              0
string-length('&#xFFFD;')      1
string-length('&#x20000;')     1

Usage

The string-length() function can be useful when deciding how to allocate space on the output medium. For example, if a list is displayed in multiple columns then the number of columns may be determined by some algorithm based on the maximum length of the strings to be displayed.

It is not necessary to call string-length() to determine whether a string is zero-length, because converting the string to an xs:boolean, either explicitly using the boolean() function or implicitly by using it in a boolean context, returns true only if the string has a length of one or more. For the same reason, it is not usually necessary to call string-length() when processing the characters in a string using a recursive iteration, since the terminating condition when the string is empty can be tested by converting it to a boolean.
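
The recursive pattern described above can be sketched in Python, where an empty string is also false in a boolean context. This is only an analogy to a recursive XSLT template or function: reverse_chars is a hypothetical name chosen for illustration, and Python's truthiness rule for strings is assumed to stand in for the XPath conversion to xs:boolean.

```python
def reverse_chars(s: str) -> str:
    """Reverse a string by recursing over its characters."""
    # Terminating condition: the empty string is false in a boolean context,
    # so no explicit length test is needed.
    if not s:
        return ""
    # Process the first character, then recurse on the remainder.
    return reverse_chars(s[1:]) + s[0]

print(reverse_chars("abc"))  # cba
```

Note that a string consisting of a single space is not zero-length, so it still tests as true under both rules.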

See Also

normalize-unicode() on page 847

substring() on page 883

string-to-codepoints

The string-to-codepoints() function returns a sequence of integers representing the Unicode codepoints of the characters in a string. For example, string-to-codepoints("A") returns 65.

Signature

Argument   Type          Meaning
input      xs:string?    The input string
Result     xs:integer*   The codepoints of the characters in the input string

Effect

If an empty sequence or a zero-length string is supplied as the input, the result is an empty sequence.

In other cases, the result contains a sequence of integers, one for each character in the input string. Characters here are as defined in Unicode and XML: a character above xFFFF that is represented as a surrogate pair counts as one character, not two. The integers that are returned will therefore be in the range 1 to x10FFFF (decimal 1114111).
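
The point about surrogate pairs can be checked with a small Python model. This is a sketch under stated assumptions rather than the XPath function itself: string_to_codepoints is a hypothetical Python helper, None stands in for the empty sequence, and a Python list stands in for an XPath sequence of integers. Python 3 strings are sequences of codepoints, so iterating over them yields one item per character even above xFFFF.

```python
from typing import List, Optional

def string_to_codepoints(s: Optional[str]) -> List[int]:
    """Model of string-to-codepoints(): one integer per Unicode character."""
    if s is None:                 # empty sequence supplied as input
        return []
    return [ord(ch) for ch in s]  # a zero-length string also yields []

high = "\U000186A0"  # a single character above xFFFF

print(string_to_codepoints("ASCII"))    # [65, 83, 67, 73, 73]
print(string_to_codepoints(high))       # [100000]: one character, one integer
print(len(high.encode("utf-16-le")) // 2)  # 2: UTF-16 needs a surrogate pair
```

The last line shows why the distinction matters: the same character occupies two 16-bit code units in UTF-16, yet it contributes only one integer to the result.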

Examples

Expression                             Result
string-to-codepoints("ASCII")          65, 83, 67, 73, 73
string-to-codepoints("&#x186A0;")      100000
string-to-codepoints("")               ()

See Also

codepoints-to-string() on page 725

subsequence

The subsequence() function returns part of an input sequence, identified by the start position and length of the subsequence required.

For example, the expression subsequence(("a", "b", "c", "d"), 2, 2) returns ("b", "c").

Signature

Argument            Type        Meaning
sequence            item()*     The input sequence.
start               xs:double   The position of the first item to be included in the result.
length (optional)   xs:double   The number of items to be included in the result. If this argument is omitted, all items after the start position are included.
Result              item()*     The sequence of items starting at the start position

Effect

The two-argument version of the function is equivalent to:

 $sequence[position()>= round($start)]

The three-argument version is equivalent to:

 $sequence[position()>= round($start)
                and position() < (round($start) + round($length))]

A consequence of these rules is that there is no error if the start or length arguments are out of range. Another consequence is that if the start or length arguments are NaN, the result is an empty sequence.
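
These consequences can be traced with a Python model of the two equivalent expressions. It is a sketch, not the XPath function: xpath_round and subsequence are hypothetical Python helpers, a Python list stands in for an XPath sequence, and xpath_round assumes that round() rounds to the nearest integer with ties going toward positive infinity and leaves NaN and infinities unchanged.

```python
import math
from typing import Iterable, List, Optional

def xpath_round(x: float) -> float:
    """Model of round(): nearest integer, ties toward positive infinity."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x + 0.5))  # adequate for a sketch

def subsequence(seq: Iterable, start: float, length: Optional[float] = None) -> List:
    """Model of subsequence() as a filter on 1-based positions."""
    first = xpath_round(float(start))
    if length is None:
        # $sequence[position() >= round($start)]
        return [x for pos, x in enumerate(seq, start=1) if pos >= first]
    end = first + xpath_round(float(length))
    # $sequence[position() >= round($start)
    #           and position() < (round($start) + round($length))]
    return [x for pos, x in enumerate(seq, start=1) if first <= pos < end]

print(subsequence(range(1, 11), 2.3, 4.6))      # [2, 3, 4, 5, 6]
print(subsequence(range(1, 6), 10))             # []: start out of range, no error
print(subsequence(range(1, 6), float("nan")))   # []: comparisons with NaN are false
```

Because the function is defined purely as a filter on positions, an out-of-range or NaN argument simply makes the predicate false for every item instead of raising an error.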

The arguments are defined with type xs:double for symmetry with the substring() function, which itself uses xs:double arguments for backward compatibility with XPath 1.0, which did not support any numeric type other than double. If you supply an integer, it will automatically be converted to a double. The fact that they are doubles rather than integers is occasionally convenient because the result of a calculation involving untyped values is a double. For example:

subsequence($seq, 1, @limit + 1)

works even when the limit attribute is untyped, in which case the value of @limit+1 is an xs:double.

Examples

Expression                         Result
subsequence(3 to 10, 2)            4, 5, 6, 7, 8, 9, 10
subsequence(3 to 10, 5, 2)         7, 8
subsequence(1 to 5, 10)            ()
subsequence(1 to 10, 2.3, 4.6)     2, 3, 4, 5, 6
