XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (464 page)


Converting from untypedAtomic

The rules for conversion from an untypedAtomic value to any other type are exactly the same as the rules for converting from an equivalent string. See page 663.

Converting between Derived Types

The previous section listed all the permitted conversions between primitive atomic types. Now we need to consider what happens if the supplied value belongs to a derived type, or if the destination type is a derived type. Note that we are still only concerned with atomic types. The destination type of a cast cannot be a list or union type. It may however be a type that is derived by restriction. This includes both built-in derived types such as xs:integer, xs:short, and xs:Name, and also user-defined derived types, provided that they are named types in a schema that has been imported in the static context of the XPath expression.

The case where the supplied value belongs to a derived type is easy. As always, the principle of substitutability holds: a value of a subtype may always be used as input to an operation that accepts values belonging to its supertype. This means that conversion from a derived type to its base type is always successful. However, there is one minor caveat. In the tables in the previous section, conversion of a value to its own primitive type is always described with the rule “The value is returned unchanged.” However, if the source value belongs to a subtype of the primitive type (that is, a type derived by restriction from the primitive type), this rule should be amended to read “The value is returned unchanged, but with the type label set to the destination type”. For example, if you cast the value xs:short(2) to the type xs:decimal, the type label on the result will be xs:decimal. In fact, it is always a rule for casting operations that the type label on the result value is the type that you were casting to.

For the second case, casting to a derived type, there are a number of different rules that come into play, and we will consider them in the following sections.

  • If the supplied value is of type xs:string or xs:untypedAtomic, then casting is designed to follow the same rules as schema validation. This is described in the next section, “Casting from xs:string to a Derived Type”.
  • The general rule when casting from types other than xs:string or xs:untypedAtomic is the “up, across, down” rule. This is described in “Converting Non-string Values to a Derived Type” on page 666.
  • There are some special rules that apply when casting to one of the three built-in derived types xs:integer, xs:dayTimeDuration, and xs:yearMonthDuration. These are described in the sections “Casting to an xs:integer” on page 667, and “Casting to xs:yearMonthDuration and xs:dayTimeDuration” on page 668.

Certain derived schema types, notably xs:ID, xs:IDREF, xs:NOTATION, and xs:ENTITY, have associated constraints that a schema validator will check at the level of the document as a whole: for example, xs:ID values must be unique, xs:IDREF values must match an xs:ID somewhere in the document, and xs:NOTATION and xs:ENTITY values must refer to a notation or entity declared in the DTD. These rules are not enforced when casting, because there is no containing document to provide context.

Casting from xs:string to a Derived Type

This section describes what happens when the source value of a cast is an instance of xs:string or a type derived from xs:string, or when it is an instance of xs:untypedAtomic, and when the target type is a derived type. This includes the case where the target type is itself a subtype of xs:string. (Casting from xs:string to another primitive type was described on page 663.)

The design is intended to imitate what happens when a string making up the content of an element or attribute in raw lexical XML is put through schema validation, when the type defined for the element or attribute is the same as the atomic type used as the target of the cast operation.

The stages are as follows:

1. The supplied value is converted to an instance of xs:string. This always succeeds.

2. Whitespace normalization is applied, as defined by the whiteSpace facet for the target type. This takes one of the values preserve, replace, or collapse. If the value is preserve, the string is left unchanged. If the value is replace, then each occurrence of the characters tab (x09), newline (x0A), or carriage return (x0D) is replaced by a single space character. If the value is collapse, then whitespace is processed using the rules of the normalize-space() function described on page 845.

Most types, including most subtypes of xs:string, have a whiteSpace facet of collapse. The xs:string type itself uses the value preserve, and the built-in type xs:normalizedString (despite its name) uses replace.

3. The lexical value obtained after whitespace normalization is checked against the pattern facets of the target type (which include the pattern facets of its supertypes). The cast fails if the string does not match these regular expressions. Note that multiple patterns specified in the same simple type definition are alternatives (the string must match one of the patterns), while patterns on different levels are cumulative (the string must match them all).

4. The value is then converted to the primitive supertype of the target type, using the rules for converting a string to another primitive type given on page 663.

5. The resulting value is checked to ensure that it is in the value space of the derived target type, that is, to ensure that it conforms to all the other facets defined on that type.

6. Finally, the result is constructed by taking the value determined in step 4 and attaching the name of the target type as the type label.
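
The six stages above can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not how any XPath processor is implemented: the type my:percent, the dictionary layout, and the names normalize_whitespace, cast_string_to_derived, and CastError are all invented for illustration, Python's re syntax stands in for the XML Schema regular expression dialect, and Decimal stands in for xs:decimal.

```python
import re
from decimal import Decimal, InvalidOperation


class CastError(ValueError):
    """Raised when a cast fails (hypothetical name, for this sketch only)."""


def normalize_whitespace(text, mode):
    """Apply a whiteSpace facet value: 'preserve', 'replace', or 'collapse'."""
    if mode == "preserve":
        return text
    if mode == "replace":
        # Each tab, newline, or carriage return becomes one space.
        return re.sub(r"[\t\n\r]", " ", text)
    if mode == "collapse":
        # Like normalize-space(): trim both ends, squeeze runs to one space.
        return " ".join(t for t in re.split(r"[ \t\n\r]+", text) if t)
    raise ValueError("unknown whiteSpace value: " + mode)


# Hypothetical user-defined type: an xs:decimal restricted to whole
# numbers between 0 and 100.
PERCENT = {
    "name": "my:percent",
    "white_space": "collapse",
    # Outer list: one entry per derivation level (all levels must match).
    # Inner list: alternative patterns at that level (any one may match).
    "pattern_levels": [[r"[0-9]+"]],
    "min_inclusive": Decimal(0),
    "max_inclusive": Decimal(100),
}


def cast_string_to_derived(value, target):
    text = str(value)                                          # stage 1
    text = normalize_whitespace(text, target["white_space"])   # stage 2
    for alternatives in target["pattern_levels"]:              # stage 3
        if not any(re.fullmatch(p, text) for p in alternatives):
            raise CastError("pattern facet not satisfied: " + repr(text))
    try:
        # Stage 4. Decimal accepts more lexical forms than xs:decimal does;
        # in this sketch the pattern above keeps the input to plain digits.
        primitive = Decimal(text)
    except InvalidOperation:
        raise CastError("not a valid decimal: " + repr(text))
    if not (target["min_inclusive"] <= primitive <= target["max_inclusive"]):
        raise CastError("value out of range: " + str(primitive))  # stage 5
    return (target["name"], primitive)                         # stage 6
```

With these definitions, cast_string_to_derived("  42\n", PERCENT) returns the pair ("my:percent", Decimal("42")), whereas "4.5" fails at stage 3 and "150" fails at stage 5.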

Converting Non-string Values to a Derived Type

This section describes the rules for casting any value other than an xs:string or xs:untypedAtomic to a derived type, including a type derived from xs:string. There are some exceptions to these rules when the target type is xs:integer, xs:dayTimeDuration, or xs:yearMonthDuration; these are covered in subsequent sections.

The general rule is “go up, then across, then down”. For example, if you are converting from a subtype of xs:decimal to a subtype of xs:string, you first convert the supplied value up to an xs:decimal, then you convert the xs:decimal across to an xs:string, and then you convert the xs:string down to the final destination type. Of course, any of the three stages in this journey may be omitted where it isn't needed. See Figure 11-2, which shows casting from xs:long to xs:token.
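
As a rough illustration of the route taken by such a cast, the following Python sketch walks a small fragment of the built-in type hierarchy. The BASE_TYPE table covers only the types needed for the xs:long to xs:token example, and the function names primitive_of and cast_route are invented for this sketch; the special cases for xs:integer and the duration types are not modelled.

```python
# Base type of each type in a fragment of the built-in hierarchy.
# Primitive types map to None.
BASE_TYPE = {
    "xs:decimal": None,
    "xs:integer": "xs:decimal",
    "xs:long": "xs:integer",
    "xs:string": None,
    "xs:normalizedString": "xs:string",
    "xs:token": "xs:normalizedString",
}


def primitive_of(type_name):
    """Follow base-type links until a primitive type is reached."""
    while BASE_TYPE[type_name] is not None:
        type_name = BASE_TYPE[type_name]
    return type_name


def cast_route(source, target):
    """List the legs (up, across, down) needed to cast source to target."""
    source_primitive = primitive_of(source)
    target_primitive = primitive_of(target)
    route = []
    if source != source_primitive:
        route.append(("up", source, source_primitive))
    if source_primitive != target_primitive:
        route.append(("across", source_primitive, target_primitive))
    if target != target_primitive:
        route.append(("down", target_primitive, target))
    return route


for leg in cast_route("xs:long", "xs:token"):
    print(leg)
```

For xs:long to xs:token the route has all three legs: up to xs:decimal, across to xs:string, and down to xs:token. For xs:long to xs:decimal only the up leg remains, which matches the remark that any stage may be omitted when it isn't needed.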

The last leg of this journey, the down part, now needs to be explained.

The rule here is (in general) that the value is not changed, but it is validated against the restrictions that apply to the subtype. These restrictions are defined by facets in the schema definition of the type. If the value satisfies the facets, then the cast succeeds and the result has the same value as the source, but with a new type label. If the value does not satisfy the facets, then the cast fails. For example, the expression xs:positiveInteger(-5) will cause an error, because the value -5 does not satisfy the minInclusive facet for the type (which says that the lowest permitted value is 1).
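
The down leg can be pictured as a range check followed by relabelling. The Python sketch below is illustrative only: the FACETS table lists the minInclusive and maxInclusive facets that XML Schema defines for three built-in integer subtypes, while the names cast_down and CastError and the (label, value) pair used for the result are assumptions made for this example.

```python
class CastError(ValueError):
    """Raised when a value fails a facet check (hypothetical name)."""


# Range facets of a few built-in subtypes of xs:integer.
FACETS = {
    "xs:positiveInteger": {"min_inclusive": 1},
    "xs:nonNegativeInteger": {"min_inclusive": 0},
    "xs:short": {"min_inclusive": -32768, "max_inclusive": 32767},
}


def cast_down(value, target):
    """Validate an integer against the target's facets, then relabel it."""
    facets = FACETS[target]
    low = facets.get("min_inclusive")
    high = facets.get("max_inclusive")
    if low is not None and value < low:
        raise CastError(f"{value} is below minInclusive {low} of {target}")
    if high is not None and value > high:
        raise CastError(f"{value} is above maxInclusive {high} of {target}")
    # The value itself is unchanged; only the type label is new.
    return (target, value)
```

Here cast_down(5, "xs:positiveInteger") succeeds and returns ("xs:positiveInteger", 5), while cast_down(-5, "xs:positiveInteger") raises CastError, mirroring the failure of xs:positiveInteger(-5).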

There is a slight complication with the pattern facet. This facet defines a regular expression that the value must conform to. The pattern facet, unlike all the others, is applied to the lexical value rather than the internal value. To check whether a value conforms to the pattern facet, the system must first convert the value to a string. This is bad news if the pattern facet has been used to constrain input XML documents to use a form other than the canonical representation; for example, to constrain an xs:boolean attribute to the values 0 and 1. The conversion to a string will produce the value true or false and will therefore fail the pattern validation. Generally speaking, using the pattern facet with types other than string (or string-like types such as xs:anyURI) is best avoided.
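
The xs:boolean problem can be reproduced with a small Python model. It assumes a hypothetical subtype of xs:boolean whose pattern facet is [01]; the helper names canonical_boolean and satisfies_pattern are invented for this sketch, and Python's re module stands in for the schema regular expression dialect.

```python
import re

# Pattern facet of a hypothetical subtype of xs:boolean that only
# permits the lexical forms 0 and 1.
BIT_PATTERNS = [r"[01]"]


def canonical_boolean(value):
    """Convert a boolean to a string, which yields 'true' or 'false'."""
    return "true" if value else "false"


def satisfies_pattern(lexical, alternatives):
    """A lexical value passes if it fully matches any alternative pattern."""
    return any(re.fullmatch(p, lexical) for p in alternatives)


# Validating the text of a source document: the lexical form is "1".
from_document = satisfies_pattern("1", BIT_PATTERNS)

# Casting an existing xs:boolean value: it is first converted to a string.
from_cast = satisfies_pattern(canonical_boolean(True), BIT_PATTERNS)

print(from_document, from_cast)
```

The first check passes and the second fails, even though both represent the same boolean value: the pattern only ever sees the string form, and the string produced from a boolean is never 0 or 1.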
