HTML: The Definitive Guide
Authors: Chuck Musciano, Bill Kennedy
Up to this point, we've dealt with HTML documents as standalone entities, concentrating on the language elements you use to structure and format your work. The true power of HTML, however, lies in its ability to join collections of documents together into a full library of information, and to link documents with other collections around the world. Just as readers have considerable control over how the document looks onscreen, with hyperlinks they also have control over the order of presentation as they navigate through your information. It's the "HT" in HTML - hypertext - and it's the twist that spins the Web.
7.1 Hypertext Basics
A fundamental feature of hypertext is that you can hyperlink documents; you can point to another place inside the current document, inside another document in the local collection, or inside a document anywhere on the Internet. The documents thereby become an intricately woven web of information. Get the name analogy now? The target document is usually somehow related to and enriches the source; the linking element in the source should convey that relationship to the reader.
Hyperlinks can be used for all kinds of effects. They can be used inside tables of contents and lists of topics. With a click of the mouse on their browser screen, readers select and automatically jump to a topic of interest in the same document or to another document located in an entirely different collection somewhere around the world.
Hyperlinks also point readers to more information about a mentioned topic: "For more information, see 'Kumquats on Parade,'" for example. HTML authors use hyperlinks to reduce repetitive information. For instance, we recommend you sign your name to each of your documents. Rather than include full contact information in each document, a hyperlink connects your name to a single place that contains your address, phone number, and so forth.
A hyperlink, or anchor in HTML standard parlance, is marked by the <a> tag and comes in two flavors. As we describe in detail later, one type of anchor creates a hotspot in the document that, when selected by the user (usually with a mouse), causes the browser to link: it automatically loads and displays another portion of the same document or another document altogether, or triggers some Internet service-related action, such as sending email or downloading a special file. The other type of anchor creates a label, a place in an HTML document that can be referenced as a hyperlink.[1]
[1] Both types of HTML anchors use the same tag; perhaps that's why they have the same name. Nonetheless, we find it's easier if you differentiate them and think of the one type that provides the hotspot and address of a hyperlink as the "link," and the other type that marks the target portion of a document as the "anchor."
There also are some mouse-related events associated with hyperlinks, which, through the new JavaScript technologies, let you perform some new and exciting effects.
7.2 Referencing Documents: The URL
As we discussed earlier, every document on the World Wide Web has a unique address. (Imagine the chaos if they didn't.) The document's address is known as its uniform resource locator (URL).[2]
[2] "URL" usually is pronounced "you are ell," not "earl."
Several HTML tags include a URL attribute value, including hyperlinks, inline images, and forms. All use the same URL syntax to specify the location of a web resource, regardless of the type or content of that resource. That's why it's known as a uniform resource locator.
Since they can be used to represent almost any resource on the Internet, URLs come in a variety of flavors. All URLs, however, have the same top-level syntax:
scheme:scheme_specific_part
The scheme describes the kind of object the URL references; the scheme_specific_part is, well, the part that is peculiar to the specific scheme. The important thing to note is that the scheme is always separated from the scheme_specific_part by a colon with no intervening spaces.
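The top-level syntax can be illustrated with a few lines of Python. This is only a minimal sketch: the function name split_url is our own invention for illustration, and it does nothing more than cut the string at the first colon.

```python
def split_url(url: str) -> tuple[str, str]:
    """Split a URL at its first colon into (scheme, scheme_specific_part)."""
    scheme, colon, rest = url.partition(":")
    # Every URL needs a non-empty scheme followed by a colon.
    if not colon or not scheme:
        raise ValueError(f"no scheme found in {url!r}")
    return scheme, rest


print(split_url("http://www.widgets.com"))  # ('http', '//www.widgets.com')
```

Everything after the colon, including the two slashes, belongs to the scheme-specific part; how it is interpreted depends entirely on the scheme.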
7.2.1 Writing a URL
URLs are written using the displayable characters in the US-ASCII character set. For example, surely you have heard what has become annoyingly common on the radio for an announced business website, "h, t, t, p, colon, slash, slash, w, w, w, widgets, dot, com." That's a simple URL, written:
http://www.widgets.com
If you need to use a character in a URL that is not part of this character set, you must encode the character using a special notation. The encoding notation replaces the desired character with three characters: a percent sign and two hexadecimal digits whose value corresponds to the position of the character in the ASCII character set.
This is easier than it sounds. One of the most common encoded special characters is the space character (Macintosh owners, take special notice), whose position in the character set is 20 hexadecimal. To encode a space in a URL, replace it with %20:
http://www.kumquat.com/new%20pricing.html
This URL actually retrieves a document named new pricing.html from the server.
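The encoding rule can be tried out in Python. The sketch below is illustrative only: percent_encode_char is a hypothetical helper written for this example, while quote and unquote are the standard library's urllib.parse functions for encoding and decoding.

```python
from urllib.parse import quote, unquote


def percent_encode_char(ch: str) -> str:
    """Encode one ASCII character as a percent sign plus two hex digits."""
    code = ord(ch)  # position of the character in the character set
    if code > 0x7F:
        raise ValueError("this sketch only handles ASCII characters")
    return f"%{code:02X}"


print(percent_encode_char(" "))       # %20
print(quote("new pricing.html"))      # new%20pricing.html
print(unquote("new%20pricing.html"))  # new pricing.html
```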
7.2.1.1 Handling reserved and unsafe characters
In addition to the nonprinting characters, you'll need to encode reserved and unsafe characters in your URLs as well.
Reserved characters are those characters that have a specific meaning within the URL itself. For example, many URLs use the slash character to separate elements of a pathname within the URL. If you need to include a slash in a URL that is not intended to be an element separator, you'll need to encode it as %2f:
http://www.calculator.com/compute?3%2f4
This URL actually references the resource named compute on the www.calculator.com server and passes the string 3/4 to it, as delineated by the question mark (?). Presumably, the resource is actually a server-side program that performs some arithmetic function on the passed value and returns a result.
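As a rough illustration, the same URL can be built and taken apart with Python's urllib.parse module. The variable names are ours. Passing safe="" tells quote to encode the slash, which it would otherwise leave alone. Note that quote emits uppercase hexadecimal digits; %2F and %2f decode to the same character.

```python
from urllib.parse import quote, unquote, urlsplit

# Encode the slash so it is not mistaken for a path separator.
argument = quote("3/4", safe="")
url = "http://www.calculator.com/compute?" + argument
print(url)  # http://www.calculator.com/compute?3%2F4

# Split the URL back into its parts and decode the passed string.
parts = urlsplit(url)
print(parts.netloc)          # www.calculator.com
print(parts.path)            # /compute
print(unquote(parts.query))  # 3/4
```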
Unsafe characters are those that have no special meaning within the URL, but may have a special meaning in the context in which the URL is written. For example, the double quotation mark character (") is used to delimit URLs in many HTML tags. If you were to include a double quotation mark directly in a URL, you would probably confuse the HTML browser. Instead, encode the double quotation mark as %22 to avoid any possible conflict.
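The following sketch shows one way the problem might be avoided when a link is generated by a program. The helper make_link and the file name say "hi".html are hypothetical, chosen only to demonstrate the idea; quote and html.escape come from the Python standard library.

```python
from html import escape
from urllib.parse import quote


def make_link(target: str, label: str) -> str:
    """Build an <a> tag whose URL cannot break out of its quotes."""
    href = quote(target)  # a double quotation mark becomes %22
    return f'<a href="{href}">{escape(label)}</a>'


print(make_link('say "hi".html', "Greeting"))
# <a href="say%20%22hi%22.html">Greeting</a>
```

Because the quotation marks inside the URL are encoded, the only literal double quotes left in the tag are the two that delimit the attribute value.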
Other reserved and unsafe characters that should always be encoded are shown in Table 7.1.
Table 7.1: Reserved and Unsafe Characters and Their URL Encodings
Character  Description    Usage     Encoding
;          Semicolon      Reserved  %3B
/          Slash          Reserved  %2F
?          Question mark  Reserved  %3F
:          Colon          Reserved  %3A
@          At sign        Reserved  %40
=          Equal sign     Reserved  %3D
&          Ampersand      Reserved  %26
Less than sign
Unsafe