What is HTML?

According to the World Wide Web Consortium (W3C) (Raggett, Jacobs, and Ishikawa, 1997):

HTML is the lingua franca for publishing hypertext on the World Wide Web. It is a non-proprietary format based upon SGML, and can be created and processed by a wide range of tools, from simple plain text editors - you type it in from scratch- to sophisticated WYSIWYG authoring tools. HTML uses tags such as <h1> and </h1> to structure text into headings, paragraphs, lists, hypertext links etc.
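
As a rough illustration of the tags this description mentions, here is a minimal sketch of an HTML 4.01 document. The page title, the list items, and the wording of the text are placeholders made up for this example; only the link address comes from the sources listed on this page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Example page</title>
</head>
<body>
  <!-- A heading, opened with <h1> and closed with </h1> -->
  <h1>What is HTML?</h1>

  <!-- A paragraph of text -->
  <p>HTML structures text into headings, paragraphs, lists, and links.</p>

  <!-- An unordered list with two items -->
  <ul>
    <li>First item</li>
    <li>Second item</li>
  </ul>

  <!-- A hypertext link to the W3C page cited in the sources -->
  <p><a href="http://www.w3.org/MarkUp/">W3C HTML home page</a></p>
</body>
</html>
```

A file like this can be typed into any plain text editor, saved with an .html extension, and opened in a browser.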

HTML as an SGML Application

The various versions of HTML are SGML applications conforming to International Standard ISO 8879 -- Standard Generalized Markup Language. As an SGML application, the syntax of conforming HTML documents is defined by the combination of the SGML declaration and the document type definition (DTD).

This specification defines the intended interpretation of HTML elements, and places further constraints on the permitted syntax which are otherwise inexpressible in the DTD.

HTML as a universally understood publishing language

Part of the attractiveness of HTML is that it can be processed by all web browsers on almost any computer. It is a platform-independent publishing language, although each browser and computer may process it in a different way. This means that designers and HTML authors have limited control over the way in which the user sees their work.

As a publishing language, HTML is an open standard. This means that it is based on "non-proprietary specifications" and is "free for anyone to use" (Raggett, Lam, and Alexander, 1996, p. 12). People and organizations all over the world have contributed to its development, although the bulk of the work has been done by a small group (primarily under the auspices of the W3C).

If you were to visit the W3C pages, you would find rich information about HTML and the web. You would also find pages where your input is solicited for comments on draft versions of new HTML specifications.

What HTML is and is not

HTML is a set of tags that can be used to mark up documents so that they can be viewed on the WWW. One strength of this language is that it works across computing platforms, operating systems, and browser software. HTML also allows you to embed images, audio, video and other multimedia into web pages.

The language is composed of tags that have two main functions:

  1. They can be used to lay out text and images on a "page" (actually a browser window on a computer screen);
  2. They can be used to create hyperlinks which connect text documents to images, multimedia, software, and databases all over the Internet.
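
The two functions can be sketched in a short fragment. All of the file names here (campus.gif, syllabus.html, welcome.wav) are hypothetical and stand in for whatever resources a real page would use.

```html
<!-- Function 1: laying out text and an image on the "page" -->
<h2>Campus photo</h2>
<p>
  <img src="campus.gif" alt="Photo of the campus" width="200" height="150">
  This text is displayed alongside the image.
</p>

<!-- Function 2: hyperlinks that connect this document to other resources -->
<p>
  <a href="syllabus.html">Another text document</a> |
  <a href="campus.gif">An image</a> |
  <a href="welcome.wav">An audio file</a>
</p>
```

The same <a> tag is used whatever the target is; the browser decides how to handle each resource when the link is followed.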

HTML is not a programming language. It is not a word processing language, and it isn't much use when the goal is to produce desktop publishing documents. If you keep in mind that the main purposes of HTML are to allow you to structure digital documents for the web and to link to a variety of other digital resources, you'll do fine as you move through these pages learning how to mark up your work for the web.

As you immerse yourself in HTML, you'll quickly discover that there are often several ways to get the same display effect; for example, both the <address> and the <i> tags will cause most browsers to display text in italics. Part of learning how to mark up documents is understanding when and when not to use certain markup.
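
A small sketch of that difference follows. The journal title is a made-up placeholder; the contact details are the ones given at the bottom of this page.

```html
<!-- Presentational markup: says only "display this in italics" -->
<p>The article appeared in <i>Example Journal</i>.</p>

<!-- Structural markup: identifies contact information for the page.
     Most browsers happen to display it in italics as well. -->
<address>
  Page by Howard Rosenbaum, hrosenba@indiana.edu
</address>
```

Both pieces of text usually look the same on screen, but only <address> tells a program reading the page what the text is for, which is a reason to prefer it for contact information.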

HTML is continuing to develop

What is HTML 4.01?

HTML 4.01 is the current recommendation and is specified in three "flavors": Strict, Transitional, and Frameset. You specify the variant you are using with a <!DOCTYPE> declaration as the first line of your document. This allows a validator to match the validation to the version of HTML you are using.

Each variant has its own DTD (taken from the W3C HTML page):
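
For reference, the three document type declarations defined in the W3C HTML 4.01 recommendation are listed below. This is a listing rather than a single document: a real page uses exactly one of them, as its first line.

```html
<!-- Strict: leaves out deprecated presentational elements and frames -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- Transitional: Strict plus the deprecated elements and attributes -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">

<!-- Frameset: Transitional plus the elements used to build frames -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
    "http://www.w3.org/TR/html4/frameset.dtd">
```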

HTML 4.01 is an advance over HTML 3.2 because it gives the web designer greater control over forms, frames, and tables, along with the benefits of scripts, style sheets, and objects.

One innovative advance is Cascading Style Sheets (CSS)

CSS is a tool that allows the presentation of multiple web pages to be controlled from a single style sheet or template. This template contains the rules that set presentation features such as fonts, colors, and margins.

One advantage of CSS is that it minimizes the work involved in maintaining large web sites, since presentation changes can be handled from the template.
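
A minimal sketch of how a page is tied to a shared style sheet is shown below. The file name site.css and the particular rules are assumptions made for illustration, not part of any real site.

```html
<head>
  <title>Page one</title>

  <!-- Every page on the site points at the same style sheet file.
       "site.css" is a hypothetical file name. -->
  <link rel="stylesheet" type="text/css" href="site.css">

  <!-- For illustration only: rules of the kind site.css might contain,
       shown inline here. Normally they would live only in the shared file. -->
  <style type="text/css">
    body { font-family: Arial, sans-serif; color: #333333; }
    h1   { color: #990000; }
    p    { margin-left: 2em; }
  </style>
</head>
```

If the rules live only in the shared file, changing the h1 color there changes it on every page that links to the file, with no edits to the pages themselves.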

What is XHTML?

XHTML is the latest version of the web markup language. No further versions of HTML are planned; instead, HTML has been rewritten as an XML application, which means that it follows a specific and stricter set of rules that brings greater standardization to the language.

As an XML application, there are a number of requirements that must be followed to generate valid web files.
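
The best-known of these requirements, as set out in the W3C XHTML 1.0 recommendation, are that element and attribute names are written in lowercase, every element is closed, elements are properly nested, and attribute values are quoted. The sketch below shows them in a small XHTML 1.0 Strict document; the title, text, and image file name are placeholders.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Example XHTML page</title>
</head>
<body>
  <!-- Element and attribute names are lowercase -->
  <h1>What is XHTML?</h1>

  <!-- Every element is closed, and elements are properly nested -->
  <p>Nested elements close in reverse order:
     <em>like <strong>this</strong></em>.</p>

  <!-- Empty elements end with " />" and attribute values are quoted -->
  <p><img src="logo.gif" alt="Logo" /><br /></p>
</body>
</html>
```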

What is DHTML?

DHTML makes use of advanced web standards to allow designers to develop web applications instead of static pages. According to Richmond (1998),

'Dynamic HTML' is typically used to describe the combination of HTML, style sheets and scripts that allows documents to be animated. Dynamic HTML allows a web page to change after it's loaded into the browser --there doesn't have to be any communication with the web server for an update.

You can think of it as 'animated' HTML. For example, a piece of text can change from one size or color to another, or a graphic can move from one location to another, in response to some kind of user action, such as clicking a button.

DHTML makes use of features of HTML 4.0, particularly Cascading Style Sheets, and of programming/scripting languages such as JavaScript.

The elements of a web page are objects that can be manipulated at any time on the client side. Typically this occurs in response to a user's actions and is caused by a script.
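
A minimal sketch of this idea follows: clicking the button changes the size and color of a paragraph without contacting the server. The id "message" and the function name highlight are made up for this example, and the script assumes a browser that supports the W3C Document Object Model method document.getElementById, which not every browser of this period does.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>DHTML sketch</title>
  <script type="text/javascript">
    // Change the size and color of the element whose id is "message".
    // This runs entirely in the browser; nothing is sent to the server.
    function highlight() {
      var el = document.getElementById("message");
      el.style.fontSize = "150%";
      el.style.color = "red";
    }
  </script>
</head>
<body>
  <p id="message">Click the button to change this text.</p>
  <!-- The onclick attribute runs the script in response to the user's action -->
  <p><button type="button" onclick="highlight()">Change</button></p>
</body>
</html>
```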

The problem with DHTML at the moment is that Microsoft and Netscape implement it differently. Cross-platform DHTML is a real challenge!

What is XML?

XML is the "Extensible Markup Language", a standard proposed by the W3C in December 1997. XML is an open markup language, which means that the specific tags to be used in documents are not determined in advance by the standard. It is a "data format for structured document interchange on the Web" (W3C, 1997).

In this sense, XML is a "meta-language" which contains a set of rules that can be used to develop different types of markup languages for different purposes. According to Flynn (1998):

A regular markup language defines a way to describe information in a certain class of documents (eg HTML). XML lets you define your own customized markup languages for many classes of document. It can do this because it's written in SGML, the international standard metalanguage for markup languages.

XML is like HTML because both are implementations of SGML. It is different because it is a broader and more powerful implementation that makes use of many more of the features of SGML than does HTML.

XML is not replacing HTML - both will have their place, but XML will allow web designers much more flexibility, especially as they begin to develop their own domain specific markup languages.

Sources:

Flynn, P. (1998). Frequently Asked Questions about the Extensible Markup Language. The XML FAQ V 1.3

http://www.ucc.ie/xml/

Raggett, D., Jacobs, I., and Ishikawa, M. (1997). HyperText Markup Language Home Page

http://www.w3.org/MarkUp/

Raggett, D., Le Hors, A., Jacobs, I. (1998). HTML 4.0 Specification W3C Recommendation, revised on 24-Apr-1998

http://www.w3.org/tr/REC-html40/

Raggett, D., Lam, J., and Alexander, I. (1996). HTML 3: Electronic Publishing on the World Wide Web. Essex, UK: Addison-Wesley Longman.

Richmond, A. (1998). Dynamic HTML. Web Developer's Virtual Library.

http://wdvl.internet.com/Authoring/dhtml/

World Wide Web Consortium. (1997). Extensible Markup Language (XML)

http://www.w3.org/XML/


Page by Howard Rosenbaum
Find me at hrosenba@indiana.edu http://www.slis.indiana.edu/hrosenba/www/Demo/HTML.html