Required, if you do not already know HTML: Raggett, Getting Started with HTML. Read this tutorial before continuing with these notes. Note: Raggett's tutorial is for HTML 4.01, but it is mostly compatible with XHTML. See my remarks in these notes on the differences between HTML and XHTML.
The WWW is not the same as the Internet: it is just one part of the Internet. We have been studying Internet applications all along: sockets, RPC, etc., all are tools for building Internet applications. We are now studying, more specifically, web applications.
An intranet is a private network based on Internet protocols (TCP/IP, UDP, etc.). Whatever tools we have for building Internet applications, therefore, can also be used to build intranet applications.
Markup languages use markers called tags to annotate the text of a document, either to specify its structure and content (descriptive markup), or to specify its presentation, i.e., layout and appearance (procedural markup), or both.
The original HTML is a combined descriptive and procedural markup language. If you do not know HTML, or need a refresher, read the tutorial by Raggett before continuing.
With the experience gained from early versions of HTML, Tim Berners-Lee, the inventor of the WWW, has decided that it is better to separate the content from the presentation. This has led to the development of XML for content description and CSS for presentation style.
XML is a meta-language for creating markup languages. Its syntax resembles that of HTML, but the language designer is free to create new tags. The main goal of XML is to provide a basis for the semantic web, an evolution of the WWW in which documents label their content in such a way that it can be easily processed by programs.
Languages based on XML are called XML applications. Examples of XML applications include MathML (for describing mathematical formulas) and SVG (Scalable Vector Graphics). And yes—XML-RPC.
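To make "easily processed by programs" concrete, here is a minimal Python sketch that reads a document written with invented tags. It uses the standard library's xml.etree.ElementTree module. The recipe vocabulary (the recipe and ingredient elements and their attributes) is made up for this illustration and is not a real XML application.

```python
import xml.etree.ElementTree as ET

# A document in a hypothetical markup language for recipes (made up for this sketch).
RECIPE_XML = """
<recipe name="pancakes">
  <ingredient amount="2" unit="cup">flour</ingredient>
  <ingredient amount="1" unit="cup">milk</ingredient>
  <ingredient amount="2" unit="each">egg</ingredient>
</recipe>
"""

def list_ingredients(xml_text):
    """Return (name, amount, unit) tuples for every ingredient element."""
    root = ET.fromstring(xml_text)
    return [
        (item.text, float(item.get("amount")), item.get("unit"))
        for item in root.iter("ingredient")
    ]

if __name__ == "__main__":
    for name, amount, unit in list_ingredients(RECIPE_XML):
        print(f"{amount:g} {unit} {name}")
```

Because the tags label what each piece of text means, the program needs no guesswork about layout to find the ingredients.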
Stylistic markup is excluded from XML; instead, the XML document may link to a separate CSS style sheet.
CSS is a language for writing style rules. For example, one can write a CSS rule that says the background color of a table element should be red, or that a p (paragraph) element should use a certain font family and size. An XML document may include a link to a CSS file, which provides guidelines for its presentation.
XHTML is a reformulation of HTML 4 based on XML. Its goals are to separate style from content (the style is provided by CSS), and to regularize the syntax of the markup so that it can be processed by programs more easily and reliably.
Principal differences from HTML:
In XHTML, as in XML, all tags come in pairs: an element consists of a pair of tags and the content (if any) sandwiched between. For example, a paragraph (p) element must be like this in XHTML:
<p>I saw a rabbit.</p>
In (pre-X) HTML, the closing tag could be omitted:
<p>I saw a rabbit.
For empty document elements (elements with no content between the tags), there is a shortcut: for example, we may write
<br/> instead of <br></br>.
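One way to check the pairing rule is to hand a fragment to an XML parser, which rejects markup that is not well formed. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is my own. Note that this checks only XML well-formedness, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# The paragraph with and without its closing tag.
closed_ok = is_well_formed("<p>I saw a rabbit.</p>")
unclosed_ok = is_well_formed("<p>I saw a rabbit.")
print(closed_ok, unclosed_ok)

# The shortcut and the long form of an empty element parse to the same thing:
# same tag name, no text, no children.
short_form = ET.fromstring("<br/>")
long_form = ET.fromstring("<br></br>")
same = (short_form.tag, short_form.text, len(short_form)) == (
    long_form.tag, long_form.text, len(long_form))
print(same)
```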
Elements must be properly nested in XHTML: if element A contains the beginning tag of element B, then it must also contain the ending tag of element B. In HTML you might get away with something like this:
<p>Mary had a little lamb; its fleece was <em>white as snow.</p> <p>And everywhere</em> that Mary went, the lamb was sure to go.</p>
But in XHTML we would have to write:
<p>Mary had a little lamb; its fleece was <em>white as snow.</em></p> <p><em>And everywhere</em> that Mary went, the lamb was sure to go.</p>
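The same kind of check shows the nesting rule at work on the two versions of the rhyme. This is a sketch using Python's standard xml.etree.ElementTree. The function name nesting_error is hypothetical, and the surrounding body element is added here only because an XML parser needs a single root element.

```python
import xml.etree.ElementTree as ET

# The HTML-style version: the em element straddles two paragraphs.
BAD = ("<body><p>Mary had a little lamb; its fleece was <em>white as snow.</p> "
       "<p>And everywhere</em> that Mary went, the lamb was sure to go.</p></body>")

# The XHTML version: each em element opens and closes inside one paragraph.
GOOD = ("<body><p>Mary had a little lamb; its fleece was <em>white as snow.</em></p> "
        "<p><em>And everywhere</em> that Mary went, the lamb was sure to go.</p></body>")

def nesting_error(markup):
    """Return None if the markup parses as XML, else the parser's complaint."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as error:
        return str(error)
    return None

if __name__ == "__main__":
    print("HTML-style version: ", nesting_error(BAD))
    print("XHTML version:      ", nesting_error(GOOD))
```

The parser stops at the first closing tag that does not match the most recently opened element, which is exactly the situation the nesting rule forbids.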
All attribute values must be enclosed in quotation marks. Attributes are extra information that resides in the tag, rather than between tags. For example, we must write id="fox-story" rather than id=fox-story.
Purely stylistic markup has been dropped, for example i (italic letters) and b (boldface); and the table element no longer has a border attribute; these are now the concern of CSS. However, the tags em (emphasis) and strong (strong emphasis) remain, and are usually rendered the same way as i and b.
In most elements, the name attribute has been replaced by the id attribute. Any element with an id attribute now can be the target of a link into the document: instead of
<p>The hunter told a <a href="#fox-story">story about a fox</a>.</p> ... <a name="fox-story"><p>The clever, brown fox wagged its tail.</p></a>
we can now write simply
<p>The hunter told a <a href="#fox-story">story about a fox</a>.</p> ... <p id="fox-story">The clever, brown fox wagged its tail.</p>
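To illustrate how a program might follow such a link, the sketch below looks up the element whose id matches the fragment in an href. It uses Python's standard xml.etree.ElementTree. The function name resolve_fragment and the enclosing body element are assumptions made for the example, and a real browser does considerably more than this.

```python
import xml.etree.ElementTree as ET

PAGE = """
<body>
  <p>The hunter told a <a href="#fox-story">story about a fox</a>.</p>
  <p id="fox-story">The clever, brown fox wagged its tail.</p>
</body>
"""

def resolve_fragment(root, href):
    """Return the element whose id matches a '#fragment' link, or None."""
    if not href.startswith("#"):
        return None
    target_id = href[1:]
    for element in root.iter():
        if element.get("id") == target_id:
            return element
    return None

root = ET.fromstring(PAGE)
link = root.find(".//a")                       # the first a element in the page
target = resolve_fragment(root, link.get("href"))
print(target.tag, "->", target.text)
```

Since any element may carry an id, the target here is the paragraph itself; no wrapper a element is needed.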
In addition, XHTML documents should begin with an XML declaration, a DOCTYPE declaration, and an XML namespace (xmlns) attribute in the html tag, such as the following:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
instead of just
<html>
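The practical effect of the xmlns attribute shows up when a program parses the document: every element name becomes qualified by the namespace. The sketch below uses Python's standard xml.etree.ElementTree. The head and body content are invented for the example. The parser is not expected to download the DTD named in the DOCTYPE.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small XHTML document; the title and paragraph are made up for this sketch.
DOCUMENT = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head><title>Rabbits</title></head>
  <body><p>I saw a rabbit.</p></body>
</html>
"""

root = ET.fromstring(DOCUMENT)

# ElementTree writes a qualified name as {namespace}localname.
print(root.tag)

# Searches must name the namespace too; "x" is an arbitrary local prefix.
title = root.find("x:head/x:title", {"x": XHTML_NS})
print(title.text)
```

A search for plain "head/title" without the namespace would find nothing, which is a common surprise when moving from HTML habits to XML tools.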
The World Wide Web Consortium (W3C) is now developing HTML 5 as the successor of both HTML 4 and XHTML. Surprisingly, HTML 5 will allow but not require XML syntax; its parsing rules are designed to be forgiving of syntax errors.
HTML 5 requires no XML declaration, and only a very simple DOCTYPE declaration:
<!DOCTYPE html>
However, without the encoding from the XML declaration, an HTML 5 document needs to declare its own encoding in a <meta> element in the <head>, like this:
<head> <meta charset="utf-8" /> <title>Document Title</title> ... </head>
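As a rough sketch of how a program could read that declaration, the following uses Python's standard html.parser module to pick out the charset attribute. The class name CharsetFinder and the sample page are my own. Real browsers follow a more elaborate encoding-detection procedure, so treat this only as an illustration.

```python
from html.parser import HTMLParser

class CharsetFinder(HTMLParser):
    """Record the charset attribute of the first meta element that has one."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag == "meta" and self.charset is None:
            self.charset = dict(attrs).get("charset")

def declared_charset(html_text):
    """Return the declared charset of an HTML page, or None if absent."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset

# A sample page made up for this sketch.
PAGE = """<!DOCTYPE html>
<html>
<head> <meta charset="utf-8" /> <title>Document Title</title> </head>
<body><p>I saw a rabbit.</p></body>
</html>"""

print(declared_charset(PAGE))
```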
Mark Pilgrim et al.'s Dive into HTML5 is an excellent guide to the new features of HTML5 (for those who already know some earlier version of HTML).