Semantics test case

This is a test-case page. As you can see, the markup is very sparse, and no CSS has been applied – we're trying to keep the markup as simple as possible here, to help explain semantics.

The whole point of semantics is to allow computers to read and "understand" web pages. By "understand", we mean "able to extract useful information out of the page, with context": for example, extracting a contact address from a page and knowing that it is the webmaster's contact address.
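As a rough illustration of that kind of extraction, the sketch below uses Python's standard html.parser module to pull the text out of any <address> element. The sample markup, the example.com e-mail address, and the names AddressExtractor and extract_contact are made up for this example; they are not part of the original page.

```python
from html.parser import HTMLParser


class AddressExtractor(HTMLParser):
    """Collect the text inside every <address> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # greater than 0 while inside <address>
        self.addresses = []   # one string per <address> element
        self._parts = []      # text fragments of the current element

    def handle_starttag(self, tag, attrs):
        if tag == "address":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "address" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self._parts).split())
                self.addresses.append(text)
                self._parts = []

    def handle_data(self, data):
        if self.depth:
            self._parts.append(data)


def extract_contact(markup):
    """Return the text of each <address> element found in markup."""
    parser = AddressExtractor()
    parser.feed(markup)
    parser.close()
    return parser.addresses


# Hypothetical page fragment, not taken from the original page.
SAMPLE = """
<h1>Example page</h1>
<p>Some content.</p>
<address>Contact the webmaster at
  <a href="mailto:webmaster@example.com">webmaster@example.com</a>.</address>
"""

print(extract_contact(SAMPLE))
```

Because the contact details sit in an element whose meaning is "contact information for this document", the program needs no guesswork about which paragraph holds them.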

Semantic mistakes

Table 1: Commonly misused XHTML tags
| Tag(s) | Common misuses | Correct use |
| --- | --- | --- |
| <abbr>, <acronym> | Often confused with each other, or applied to text that is neither an abbreviation nor an acronym. | The <abbr> tag should be used to describe an abbreviation, such as Inc. or Ms., and the <acronym> tag should be used to describe an acronym, such as WWW or XML. |
| <address> | Almost always mistaken for a tag to surround a postal address. | "...to supply contact information for a document or a major part of a document such as a form." (http://www.w3.org/TR/html401/struct/global.html#h-7.5.6) |
| <blockquote> | Usually misused just to indent large volumes of text. | Should surround a block of quoted text, possibly coupled with a <cite> tag. |
| <cite> | When misused, it is usually used to quote text, instead of a <q> or <blockquote> tag. | Should be used to identify the source, such as the author, of the preceding quote. |
| <del> | Usually misused to strike text through. | Should be used to show that the page has been edited and some text has been "removed"; the strikethrough is how that removal is displayed, not the other way round. |
| <dl>, <dt>, <dd> | Sometimes incorrectly used to display, for example, chat logs (using <dt> to display the speaker's name, then <dd> to display their message). | Should be used to display terms (<dt>) and their definitions (<dd>) in a list (<dl>). |
| <em> | Mistakenly used to make text italic. | Should be used to emphasise text, as you would when reading the text aloud. Although italics and emphasis commonly go together, the common mistake is to get the order the wrong way round: you make text italic to show that it is emphasised; you do not emphasise text because it is italic. |
| <font> | Not permitted in XHTML 1.0 Strict (and deprecated in Transitional), as it provides no semantic meaning whatsoever. When used, it usually formats text in a way that is already provided by other, more semantically appropriate, tags. | N/A |
| <h1> to <h6> | Used innumerable times to style text as big, little, medium-sized, bold, italic, etc. | Should be used as headings to structure the page; if used correctly, web spiders can crawl the page and extract a table of contents just by looking at the heading elements. |
| <img> | Sometimes used in the XHTML to add layout images, such as backgrounds. | Should be used to add content-relevant images (inline images, if you will), not layout images, which should be added through CSS. |
| <pre> | Used to format and style text in the webpage. | Should be used to include pre-formatted text in a webpage without modification. |
| <span> | Used to format and style text in the webpage, sometimes in ways that are catered for by other, more semantically precise, tags. | Should be used to style elements of the page that no more meaningful tag describes, for example a bar in an article that states the publishing date. |
| <strong> | Used to make text bold. | Should be used if the text should be read out louder. The common mistake is to get the logic the wrong way round: you make text bold to show that it is to be read out loudly; you do not read text out loudly because it is bold. |
| <table> | Used ridiculously often to lay out pages and designs, position elements, and align things. This is not correct use of the <table> tag! | Should be used for tabular data, such as results of an experiment, or lists of incorrect uses of semantically-important XHTML elements. |
| <title> | Used to display text other than the page title, such as smiley faces (:-)), adverts, and the author's name. | Should be used to display the page title only. This can include the site title, however. |
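The claim in the heading row, that a program can build a table of contents from heading elements alone, can be sketched roughly as follows. This is a minimal example using Python's standard html.parser module, not a description of how any real web spider works. The heading levels in the sample markup are an assumption (the page's actual markup is not shown here), and OutlineExtractor and table_of_contents are hypothetical names.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineExtractor(HTMLParser):
    """Record (level, text) for every heading element, in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []     # list of (level, heading text) tuples
        self._level = None    # heading level while inside a heading
        self._parts = []      # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)


def table_of_contents(markup):
    """Return an indented outline, two spaces per heading level below h1."""
    parser = OutlineExtractor()
    parser.feed(markup)
    parser.close()
    return "\n".join(
        "  " * (level - 1) + text for level, text in parser.outline
    )


# Assumed markup: the heading texts come from this page, the levels are a guess.
SAMPLE = """
<h1>Semantics test case</h1>
<p>Introductory text.</p>
<h2>Semantic mistakes</h2>
<h2>Feelings on semantics</h2>
<h2>Glossary</h2>
"""

print(table_of_contents(SAMPLE))
```

If headings had been chosen for their size rather than their meaning, the same program would produce a jumbled outline, which is the practical cost of the misuse described in the table.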

Feelings on semantics

Tantek Çelik once said:

Glossary

Semantics
Making tags describe what they contain.
XHTML
A markup language for the web which is XML-compliant.
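A glossary like the one above is the kind of content a definition list suits. As a sketch of what that buys a program, the example below reads <dt>/<dd> pairs into a dictionary with Python's standard html.parser module. It assumes the glossary is marked up as a <dl>, which this text does not confirm; the sample entries are adapted from the glossary, and GlossaryExtractor and read_glossary are hypothetical names.

```python
from html.parser import HTMLParser


class GlossaryExtractor(HTMLParser):
    """Map each <dt> term to the list of <dd> definitions that follow it."""

    def __init__(self):
        super().__init__()
        self.entries = {}      # term -> list of definitions
        self._current = None   # "dt" or "dd" while inside one
        self._parts = []       # text fragments of the current element
        self._term = None      # most recently completed term

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._current = tag
            self._parts = []

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        text = " ".join("".join(self._parts).split())
        if tag == "dt":
            self._term = text
            self.entries.setdefault(text, [])
        elif self._term is not None:
            self.entries[self._term].append(text)
        self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._parts.append(data)


def read_glossary(markup):
    """Return a dict of term -> definitions from the <dl> content in markup."""
    parser = GlossaryExtractor()
    parser.feed(markup)
    parser.close()
    return parser.entries


# Assumed markup; the entries are adapted from the glossary on this page.
SAMPLE = """
<dl>
  <dt>Semantics</dt>
  <dd>Making tags describe what they contain.</dd>
  <dt>XHTML</dt>
  <dd>A markup language for the web which is XML-compliant.</dd>
</dl>
"""

for term, definitions in read_glossary(SAMPLE).items():
    print(term, "->", definitions)
```

The same parser applied to a chat log marked up with <dt> and <dd> would report speakers as "terms" and messages as "definitions", which shows how a misused element misleads the software reading it.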

Last updated 28/01/06: Updating links due to moved domain.

Please contact the webmaster at drbob at tecnocode dot co dot uk if you have any problems with this page.