XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (734 page)

Read XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition Online

Authors: Michael Kay

BOOK: XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

9.81Mb size Format: txt, pdf, ePub

The next thing we need to do is to adapt the stylesheet to run in the browser. To do this, we need to write an HTML page containing JavaScript to invoke the transformation.

This particular example runs in Internet Explorer 6 or later. It won't run on Firefox, because it uses the Microsoft API to invoke the XSLT transformations. If you want to write pages that run under several different browsers, consider using the Sarissa package (
http://sarissa.sourceforge.net
) which provides a cross-product API.

If the XML file is large (family trees produced by serious genealogists often run to several megabytes) then this approach means the user is going to have to wait rather longer to see the first page of data. But the advantage is that once it's downloaded, browsing around the file can be done offline: there is no need to go back to the server to follow each link from one individual to another. This gives the user a lightning-fast response to navigation requests, and reduces the processing load and the number of hits on the server. Another benefit, given that many genealogists only have access to the limited Web space provided by a commercial ISP, is that no special code needs to be installed on the server.

This time, the transformation is controlled from JavaScript code on an HTML page
famtree.html
. The page itself reads as follows. The

init();

The CSS style definitions have moved from the XSLT stylesheet to the HTML page, but they are otherwise unchanged.

The
init()
function on this page is called when the page is loaded. It creates two DOM objects, one for the source XML and one for the stylesheet, and loads these using the relative URLs
kennedy.xml
and
ms-person.xsl
. It then compiles the stylesheet into an object which is rather confusingly called an
XSLTemplate;
this corresponds directly with the JAXP
Templates
object. Finally, it calls the
refresh()
function to display the individual with identifier
I1
.

I've taken a bit of a short cut here. There's no guarantee that a GEDCOM file will contain an individual with this identifier. A more carefully constructed application would display the first individual in the file, or an index of people.

The
refresh()
function creates an executable instance of the stylesheet by calling the
createProcessor()
method on the
XSLTemplate
object. It then sets the value of the global
id
parameter in the stylesheet, and applies the stylesheet to the source document by calling the
transform()
method. The HTML constructed by processing the stylesheet is then written to the contents of the
id=“displayarea”>
element in the body of the HTML page.

We can use the same stylesheet as before, again with modifications to the form of the hyperlinks. This time we want a hyperlink to another individual,
I2
say, to take the form:

Jaqueline Lee Bouvier

When the user clicks on this hyperlink, the
refresh()
function is executed, which causes a new execution of the compiled stylesheet, against the same source document, but with a different value for the
id
parameter. The effect is that the contents of the page switches to display a different individual.

The
ms-person.xsl
stylesheet is written by importing the
person10.xsl
stylesheet, which is the XSLT 1.0 version of the
person.xsl
stylesheet presented earlier in this chapter, and then overriding the aspects we want to change. This time there are two changes: we want to change the form of the hyperlink, and we want to leave out the generation of the CSS style, because the necessary definitions are already present on the HTML page. Here is the stylesheet:

xmlns:xsl=“http://www.w3.org/1999/XSL/Transform”

version=“1.0”

‘

select=“concat(‘Javascript:refresh(‘, $apos, ../@Id, $apos, ‘)’)”/>

One slight infelicity in the resulting stylesheet is that it generates a full HTML page, complete with

,

, and

elements, and then inserts this as the content of a

element within an existing HTML page. Fortunately, Internet Explorer tolerates this abuse of the HTML specification rather well.

Summary

I hope this little excursion into the strange world of genealogical data models has given you some flavor of the power of XSLT as a manipulation and reporting tool for complex structured data. We've covered a lot of ground:

Using XSLT to transform structured data that wasn't originally in XML format
How to navigate your way around complex linked data within an XML document
Several different ways of generating an interactive view of a large XML data set:
- Generating lots of static HTML pages in one go at publication time
- Generating HTML pages dynamically using either a Java servlet or a Microsoft ASP.NET page
- Generating HTML incrementally within the browser

The worked example in the next chapter will venture into even stranger territory, using XSLT to solve a chess problem. While genealogy has demonstrated how XSLT can be used to process complex data, the chess example will show something of the computational power of the language.

Chapter 20

Case Study: Knight's Tour

This chapter contains the third (and last) of the XSLT case studies. It shows how XSLT can be used to calculate a knight's tour of the chessboard, in which the knight visits every square without ever landing on the same square twice.

New features in XSLT 2.0 make this kind of application much easier to write, which means that the stylesheet is almost a total rewrite of the XSLT 1.0 version.

Readers of previous editions of this book have reacted differently to this case study. Some have suggested that I should be less frivolous, and stick to examples that involve the processing of invoices and purchase orders, and the formatting of product catalogs. Others have welcomed the example as light relief from the comparatively boring programming tasks they are asked to do in their day job. A third group has told me that this example is absolutely typical of the challenges they face in building real Web sites. The Web, after all, does not exist only (or even primarily) to oil the wheels of big business. It is also there to provide entertainment.

Whatever your feelings about the choice of problem, I hope that by showing that it can be done I will convince you that XSLT has the computational power and flexibility to tackle any XML formatting and transformation challenge, and that as you study it, you will discover ideas that you can use a wide range of tasks that are more typical of your own programming assignments.

The Problem

The purpose of the stylesheet is to produce a knight's tour of the chessboard, in which each square is visited exactly once, as shown in the illustration overleaf. A knight can move to any square that is at the opposite corner of a 3 × 2 rectangle (see
Figure 20-1
).

Figure 20-1

The only input to the stylesheet is an indication of the starting square: in modern chess notation, the columns are denoted by the letters a–h starting from the left, and the rows by the numbers 1–8, starting at the bottom. We'll supply the starting square as a parameter to the stylesheet. The stylesheet doesn't need to get anything from the source document. In fact, with XSLT 2.0, there doesn't need to be a source document: the entry point to the stylesheet can be specified as a named template.

We'll build up the stylesheet piece by piece: you can find the complete stylesheet,
tour.xsl
, on the Wrox Web site.

The inspiration for this stylesheet came from Oren Ben-Kiki, who published a stylesheet for solving the eight-queens problem. The concept here is very similar, though the details are quite different.

The Algorithm

The strategy for getting the knight round the board is based on the observation that if a square hasn't been visited yet, and if it isn't the knight's final destination, then it had better have at least two unvisited squares that are a knight's move away from it, because there needs to be a way of getting in and another way of getting out. That means that if we can get to a square that's only got one exit left, we'd better go there now or we never will.

This suggests an approach where at each move, we look at all the squares we can jump to next, and choose the one that has fewest possible exits. It turns out that this strategy works, and always gets the knight round the board.

It's possible that this could lead the knight into a blind alley, especially in the case where two of the possible moves look equally good. In this case, the knight might need to retrace its steps and try a different route. In the version of the stylesheet that I published in the previous edition of this book, I included code to do this backtracking, but made the assertion that it was never actually used (though I couldn't prove why). Recently, one of my readers reported that if the knight starts on square f8, it does indeed take a wrong turning at move 58, and needs to retrace its steps. Moreover, this appears to be the only case where this happens.

The place I usually start design is with the data structures. Here the main data structure we need is the board itself. We need to know which squares the knight has visited, and so that we can print out the board at the end, we need to know the sequence in which they were visited. In XSLT 2.0 the obvious choice is to represent the board as a sequence of 64 integers: the value will be zero for a square that has not been visited, or a value in the range 1 to 64 representing the number of the move on which the knight arrived at this square.

In a conventional program this data structure would probably be held in a global variable and updated every time the knight moves. We can't do this in XSLT, because variables can't be updated. Instead, every time a function is called, it passes the current state of the board as a parameter, and when the knight moves, a new copy of the board is created, that differs from the previous one only in the details of one square.

It doesn't really matter which way the squares are numbered, but for the sake of convention we'll number them as shown in
Figure 20-2
.

Figure 20-2

So if we number the rows 0–7, and the columns 0–7, the square number is given as
row * 8 + column
, and then we add one if indexing into the sequence representing the board.

Having decided on the principal data structure, we can decide the broad structure of the program. There are three stages:

1.
Prepare the initial data structures (the empty board with a knight placed on it, somewhere).

2.
Calculate the tour.

3.
Display the final state of the board.

Calculating the tour involves 63 steps, each one taking the form:

1.
Find all the unvisited squares that the knight can move to from the current position.

2.
For each one of these, count the number of exits (that is, the number of unvisited squares that can be reached from there).

3.
Choose the square with the fewest exits, and move the knight there.

We're ready to start coding. The tricky bit, as you've probably already guessed, is that all the loops have to be coded using recursion. That takes a bit of getting used to at first, but it quickly becomes a habit.

The Initial Template

Let's start with the framework of top-level elements:

xmlns:xsl=“http://www.w3.org/1999/XSL/Transform”

xmlns:xs=“http://www.w3.org/2001/XMLSchema”

xmlns:tour=“http://www.wrox.com/5067/tour”

exclude-result-prefixes=“xs tour”

version=“2.0”

select=“number(translate(substring($start, 1, 1),

‘abcdefgh’, ‘01234567’))”/>

select=“8 - number(substring($start, 2, 1))”/>

…

All I'm doing here is declaring the global parameter,
start
, which defines the starting square, and deriving from it two global variables: a row number and column number.

Some observations:

The parameter
start
has the default value
a1
. As this is a string-value, it needs to be in quotes; these quotes are additional to the quotes that surround the XML attribute. If I had written
select=“a1”
, the default value would be the string-value of the

element child of the document root.
The simplest way of converting the alphabetic column identifier (a–h) into a number (0–7) is to use the
translate()
function, which is described in Chapter 13.
The row number is subtracted from 8 so that the lowest-numbered row is at the top, and so that row numbers start from zero. Numbering from zero makes it easier to convert between row and column numbers and a number for each square on the board in the range 0–63.
I haven't yet checked that the supplied start square is valid. I'll do that in the initial template.

Other books

Ruin Me by Cara McKenna

Charmed and Dangerous by Jane Ashford

Daddy Warlock by Jacqueline Diamond

Blockbuster by H. I. Larry

Wendigo by Bill Bridges

The Present by Nancy Springer

The Nassau Secret (The Lang Reilly Series Book 8) by Gregg Loomis

Legacy of the Clockwork Key by Kristin Bailey

Tell No Tales by Eva Dolan

Tales From Development Hell by David Hughes