XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (533 page)


This doesn't mean that each character must be considered in isolation. The collation can still consider characters in groups, as with the traditional rule in Spanish that ch collates as if it were a single character following c, and ll as a single letter after l. But where characters are grouped in this way, it is likely to affect the way substrings are matched, as we will see.
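To make the grouping concrete, here is a minimal Java sketch of a collation that treats ch and ll as single units. The rule string, the class name SpanishChDemo, and the helper names traditional and unitCount are all assumptions made for illustration; this is not the JDK's own Spanish collation and it covers only unaccented lower-case letters.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

class SpanishChDemo {

    // Hypothetical tailoring (lower-case letters only): "ch" is one unit
    // sorting after "c", and "ll" is one unit sorting after "l".
    static final String RULES =
        "< a < b < c < ch < d < e < f < g < h < i < j < k < l < ll < m < n"
        + " < o < p < q < r < s < t < u < v < w < x < y < z";

    // Builds the collator, turning the checked ParseException into an
    // unchecked one so callers need no try/catch.
    static RuleBasedCollator traditional() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("rule string did not parse", e);
        }
    }

    // Counts the collation units the collator produces for s.
    static int unitCount(RuleBasedCollator collator, String s) {
        CollationElementIterator it = collator.getCollationElementIterator(s);
        int count = 0;
        while (it.next() != CollationElementIterator.NULLORDER) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        RuleBasedCollator es = traditional();
        // "chico" starts with the unit "ch", which sorts after every word
        // starting with plain "c" and before every word starting with "d".
        System.out.println("chico after cosa:  " + (es.compare("chico", "cosa") > 0));
        System.out.println("chico before dedo: " + (es.compare("chico", "dedo") < 0));
        System.out.println("llama after luz:   " + (es.compare("llama", "luz") > 0));
        System.out.println("units in \"ch\": " + unitCount(es, "ch"));
    }
}
```

Because ch is a single unit under these rules, a unit-by-unit substring match would not find c on its own inside chico, which is the kind of effect on substring matching referred to above.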

The XPath specification isn't completely prescriptive about how substring matching using a collation should work, and there are several possible approaches that an implementation could use. I'll describe the way the Saxon processor does it, which makes heavy use of the collation support in Java: other Java-based processors are therefore quite likely to be similar.

Firstly, let's look at a case where Java treats one character as two collation units. With a primary strength collation for German, the string Straße generates a sequence of seven collation units, which are exactly the same as the collation units generated for the string strasse. This means that contains("Straße", $t) returns true when $t is any one of ß, [...], ße, ss, as [...]
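The idea of matching on collation units rather than on characters can be sketched in Java as follows. This is an illustrative approximation, not Saxon's actual code: the class name CollationContains and the helpers primary, units, and contains are assumptions, the sketch handles primary strength only, and it assumes that the JDK hands back a RuleBasedCollator for the requested locale. What the German collator reports for ß depends on the collation rules shipped with the JDK in use, so the program prints its findings rather than presuming them.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class CollationContains {

    // Builds a primary-strength collator for the locale. Assumes the JDK
    // returns a RuleBasedCollator, which is usual but not guaranteed.
    static RuleBasedCollator primary(Locale locale) {
        Collator c = Collator.getInstance(locale);
        if (!(c instanceof RuleBasedCollator)) {
            throw new IllegalStateException("no rule-based collator for " + locale);
        }
        c.setStrength(Collator.PRIMARY);
        return (RuleBasedCollator) c;
    }

    // Returns the primary-level collation units of s, skipping units that
    // are ignorable at primary strength (primary order 0).
    static List<Integer> units(RuleBasedCollator collator, String s) {
        List<Integer> result = new ArrayList<>();
        CollationElementIterator it = collator.getCollationElementIterator(s);
        int element;
        while ((element = it.next()) != CollationElementIterator.NULLORDER) {
            int p = CollationElementIterator.primaryOrder(element);
            if (p != 0) {
                result.add(p);
            }
        }
        return result;
    }

    // True if the units of t occur as a contiguous run within the units of s.
    // An empty t always matches, in line with XPath's contains().
    static boolean contains(RuleBasedCollator collator, String s, String t) {
        return Collections.indexOfSubList(units(collator, s), units(collator, t)) >= 0;
    }

    public static void main(String[] args) {
        RuleBasedCollator de = primary(Locale.GERMAN);
        String strasse = "Stra\u00dfe"; // "Straße", escaped to avoid encoding issues
        System.out.println("units in Straße:  " + units(de, strasse).size());
        System.out.println("units in strasse: " + units(de, "strasse").size());
        for (String t : new String[] {"\u00df", "ss", "\u00dfe", "as"}) {
            System.out.println("contains(Straße, " + t + ") = " + contains(de, strasse, t));
        }
    }
}
```

Note that a match found this way can begin or end in the middle of a character: if ß yields two units, a search string such as as matches the a plus only the first half of the ß.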
