L595 XML@IU

Summer 2007

Room: BH 308
Time: 9:00-4:00 PM F, S, Su
Instructor: Howard Rosenbaum
Email: hrosenba@indiana.edu
Office: 005B@SLIS
Telephone: 812 855 3250
Office Hours: 11:00-1:00 PM T, Th

Introduction

First there was Standard Generalized Markup Language (SGML). This markup language has been used in the publishing industry for many years and became an International Organization for Standardization (ISO) standard (ISO 8879) in 1986. SGML is a tremendously complex language that provides great flexibility for those who can use it to prepare structured text documents. The specification is over 500 pages long. According to the Cover Pages [1]:

Conceived notionally in the 1960s - 1970s, the Standard Generalized Markup Language (SGML, ISO 8879:1986) gave birth to a profile/subset called the Extensible Markup Language (XML), published as a W3C Recommendation in 1998. Depending upon your perspective and requirements, the differences between SGML and XML are inconsequential or immense. SGML is more customizable (thus flexible and more "powerful") at the expense of being (much) more expensive to implement. ... For an overview of differences, see James Clark's document "Comparison of SGML and XML"; for other treatments, see references in XML and/versus SGML. As of 2002-07, relatively few enterprise-level projects are started as SGML applications, but many SGML applications implemented before 1999 are still running productively. In some cases, peculiar business requirements favor the use of SGML for certain features that have been eliminated in XML.

SGML is a cross-platform language that is used to structure information in ways that allow easy exchange of documents without the need for proprietary hardware or software. It is important because, according to Charles Goldfarb [2], one of its creators:

SGML is designed to make your information last longer than the systems that created it. Such longevity also implies immunity to short-term changes -- such as a change from one application program to another -- so SGML is also inherently designed for re-purposing and portability...

The SGML standard defines the requirements for "conforming SGML documents." These requirements are remarkably flexible. In fact, SGML isn't so much a standard for "what you have to do" as a standard for "describing what you've done and why you chose to do it".

HTML is an application of SGML. It is a fairly rigid document type definition (DTD) of SGML that greatly simplifies the language. As a consequence, it has become a publishing language that virtually any computer on the web can interpret. In spite of its ease of use as a tool for web-based information design, however, HTML has its limitations, in no small part because it is a language designed to describe a document's structure and not its appearance. In addition, the designer has little control over how a given page is displayed on a person's machine.

One solution to these problems has been the introduction of cascading style sheets (CSS) by the World Wide Web Consortium. CSS separates structure from presentation and provides designers with the ability to control elements of HTML markup on many content pages from an external template. CSS is rule-based and uses a syntax to specify how a particular HTML element (affecting text, font selection, images, spacing, white space, color, etc.) will appear. CSS rules can be applied to groups of HTML elements, which can be defined in nonstandard terms, to nested HTML elements, and even to discrete blocks of text.

However, style sheets still work within the DTD for HTML and, although they extend the designer's control, they share the limitations of standard HTML.

XHTML is the latest version of HTML. It is a "reformulation of HTML 4 in XML 1.0" that became a recommendation in 2000 [3]. This means that XHTML is written as an application of XML and therefore follows all of the rules of XML. It is a cleaner and less forgiving version of HTML and, as such, will be compatible with XML applications.
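To illustrate what "less forgiving" means in practice, here is a minimal sketch in Python that runs two fragments through a standard XML parser. The helper name `is_well_formed` and both sample fragments are invented for this illustration. The check covers only XML well-formedness (closed and properly nested tags), not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as well-formed XML, else False.

    Hypothetical helper for illustration; it does not validate against a DTD.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Markup that many browsers tolerate as HTML: <br> is never closed and
# the <b> and <i> tags are closed in the wrong order.
LOOSE_HTML = "<p>First line<br>Second line <b><i>bold italic</i></b></p>".replace(
    "</i></b>", "</b></i>"
)

# The same content following XML rules: the empty element is closed
# and the inline tags are nested properly.
STRICT_XHTML = "<p>First line<br />Second line <b><i>bold italic</i></b></p>"

print(is_well_formed(LOOSE_HTML))
print(is_well_formed(STRICT_XHTML))
```

The first fragment is rejected by the XML parser and the second is accepted, which is the sense in which XHTML markup can be handed directly to XML tools.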

What is XML?

Extensible Markup Language (XML) is a subset of SGML. It has been designed to incorporate only those elements of SGML that are needed to prepare and deliver documents across the web or other communications infrastructure, such as an intranet. It is a language that is used to describe documents, not render them. XML is very powerful because it is a metalanguage that allows users to define their own tags and attributes, which can be easily processed and displayed across platforms. An XML document is "self-describing", meaning that it contains all of the rules and tags necessary for it to be displayed. XML extends the power of markup beyond HTML because it incorporates new ways of handling styles (Extensible Stylesheet Language Transformations, XSLT), links (XML Linking Language, XLink), paths into a document (XML Path Language, XPath), formatting (XSL Formatting Objects, XSL-FO), and even querying (XQuery).
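As a small illustration of user-defined tags and attributes, the sketch below defines a tiny XML document and reads it with Python's standard library. The vocabulary (`catalog`, `book`, `title`, `year`, and the `format` attribute) and the function `list_titles` are invented for this example and are not part of any standard. Note that the standard library module supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every tag and attribute name here was made up
# by the "author" of the document, which is what a metalanguage allows.
CATALOG_XML = """<?xml version="1.0"?>
<catalog>
  <book id="b1" format="print">
    <title>Introduction to Markup</title>
    <year>1999</year>
  </book>
  <book id="b2" format="ebook">
    <title>Structured Documents</title>
    <year>2005</year>
  </book>
</catalog>"""


def list_titles(xml_text, book_format=None):
    """Return book titles, optionally filtered by the format attribute."""
    root = ET.fromstring(xml_text)
    if book_format is None:
        path = "./book"
    else:
        # XPath-style predicate selecting on an attribute value.
        path = f"./book[@format='{book_format}']"
    return [book.findtext("title") for book in root.findall(path)]


print(list_titles(CATALOG_XML))
print(list_titles(CATALOG_XML, "ebook"))
```

The markup describes what each piece of data is (a title, a year) without saying how it should look; presentation would be handled separately, for example by a style sheet.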

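XML was developed by the XML Working Group (originally known as the SGML Editorial Review Board), formed under the auspices of the W3C in 1996. According to the W3C [4], the design goals for XML are:

XML shall be straightforwardly usable over the Internet.
XML shall support a wide variety of applications.
XML shall be compatible with SGML.
It shall be easy to write programs which process XML documents.
The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
XML documents should be human-legible and reasonably clear.
The XML design should be prepared quickly.
The design of XML shall be formal and concise.
XML documents shall be easy to create.
Terseness in XML markup is of minimal importance.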

This 1.5-credit workshop will provide you with an intensive, hands-on introduction to the use of XML to mark up and publish documents on the WWW (or on your internal intranet). You will also gain a conceptual understanding of the structure, strengths, and weaknesses of XML, which will allow you to use this language effectively and efficiently.

XML is beginning to have an impact across the information professions. This workshop gives you the opportunity to see what the buzz is all about.

Prerequisites

There are prerequisites for this workshop.

SLIS students must have completed L 571 Information Architecture for the Web. Students outside of SLIS and people outside of the University must have the permission of the instructor to enroll.

References

1. Cover, R. (2006). SGML and XML as (Meta-) Markup Languages. Cover Pages

http://xml.coverpages.org/sgml.html

2. Goldfarb, C. (1997). Charles F. Goldfarb's SGML Source: InFrequently Asked Questions (InFAQs)

http://www.sgmlsource.com/infaqs.htm

3. World Wide Web Consortium. (2002). XHTML™ 1.0 The Extensible HyperText Markup Language (Second Edition)

http://www.w3.org/TR/xhtml1/

4. World Wide Web Consortium. (1998). Extensible Markup Language (XML) 1.0

http://www.w3.org/tr/1998/REC-xml-19980210#sec-origin-goals


Page by Howard Rosenbaum
Find me at hrosenba@indiana.edu http://www.slis.indiana.edu/hrosenba/www/Workshops/XML/syll/intro.html