Standard Generalized Markup Language



(SGML) A generic markup language for representing documents. SGML is an International Standard that describes the relationship between a document's content and its structure. SGML allows document-based information to be shared and re-used across applications and computer platforms in an open, vendor-neutral format. SGML is sometimes compared to SQL, in that it enables companies to structure information in documents in an open fashion, so that it can be accessed or re-used by any SGML-aware application across multiple platforms.

SGML is defined in "ISO 8879:1986 Information processing -- Text and office systems -- Standard Generalized Markup Language (SGML)", an ISO standard produced by JTC 1/SC 18 and amended by "Amendment 1:1988".

Unlike other common document file formats that represent both content and presentation, SGML represents a document's content data and structure (interrelationships among the data). Removing the presentation from content establishes a neutral format. SGML documents and the information in them can easily be re-used by publishing and non-publishing applications.

SGML identifies document elements such as titles, paragraphs, tables, and chapters as distinct objects, allowing users to define the relationships between the objects for structuring data in documents. The relationships between document elements are defined in a Document Type Definition (DTD). This is roughly analogous to a collection of field definitions in a database. Once a document is converted into SGML and the information has been 'tagged', it becomes a database-like document. It can be searched, printed or even programmatically manipulated by SGML-aware applications.
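The database analogy above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `DTD` dictionary is a hypothetical, much-simplified stand-in for a real Document Type Definition (a real DTD also fixes the order and number of child elements), and the element names, `validate` and `find` are invented for the example rather than taken from any SGML tool.

```python
# Hypothetical, much-simplified stand-in for a DTD: each element name maps
# to the set of child element names it may contain. "#PCDATA" marks text.
DTD = {
    "chapter": {"title", "para"},
    "title": {"#PCDATA"},
    "para": {"#PCDATA"},
}

# A tagged document as nested (name, children) tuples; plain strings are text.
document = ("chapter", [
    ("title", ["Flight Surfaces"]),
    ("para", ["Adjust the ailerons as follows."]),
])


def validate(element, dtd):
    """Return a list of structural errors found in element (empty if none)."""
    name, children = element
    if name not in dtd:
        return [f"undeclared element: {name}"]
    errors = []
    for child in children:
        if isinstance(child, str):
            if "#PCDATA" not in dtd[name]:
                errors.append(f"text not allowed in {name}")
        else:
            if child[0] not in dtd[name]:
                errors.append(f"{child[0]} not allowed in {name}")
            errors.extend(validate(child, dtd))
    return errors


def find(element, name):
    """Yield every element called name, much like querying a database field."""
    if element[0] == name:
        yield element
    for child in element[1]:
        if not isinstance(child, str):
            yield from find(child, name)
```

Calling `validate(document, DTD)` reports structural problems, and `list(find(document, "title"))` retrieves every title, which is the kind of search that tagging makes possible.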

Companies are moving their documents into SGML for several reasons:

Reuse - separation of content from presentation facilitates multiple delivery formats like CD-ROM and electronic publishing.

Portability - SGML is an international, platform-independent standard based on ASCII text, so companies can safely store their documents in SGML without being tied to any one vendor.

Interchange - SGML is a core data standard that enables SGML-aware applications to inter-operate and share data seamlessly.

A central SGML document store can feed multiple processes in a company, so managing and updating information is greatly simplified. For example, when an aeroplane is delivered to a customer, it comes with thousands of pages of documentation. Distributing these on paper is expensive, so companies are investigating publishing on CD-ROM. If a maintenance person needs a guide for adjusting a plane's flight surfaces, a viewing tool automatically assembles the relevant information from the document repository as a complete document. SGML can also be used to attach attributes, such as security levels, to information stored in documents.
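A minimal Python sketch of the repository idea follows, assuming a made-up in-memory store. The `Fragment` class, its `topic` and `security` fields, the sample texts and the `assemble` function are all hypothetical. They only illustrate how a viewer might select tagged fragments by subject and by a security-level attribute, and do not reflect any particular SGML product.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    topic: str      # hypothetical subject tag carried by the fragment
    security: int   # hypothetical attribute: 0 = public, higher = restricted
    text: str


# Stand-in for a central document repository.
REPOSITORY = [
    Fragment("flight-surfaces", 0, "Inspect the aileron hinges."),
    Fragment("flight-surfaces", 2, "Override the actuator limits."),
    Fragment("landing-gear", 0, "Check tyre pressure."),
]


def assemble(repository, topic, clearance):
    """Build one document from all fragments on topic the reader may see."""
    selected = [
        f.text
        for f in repository
        if f.topic == topic and f.security <= clearance
    ]
    return "\n".join(selected)
```

A reader with clearance 0 asking for "flight-surfaces" would get only the public fragment, while a reader with clearance 2 would get both.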

There are few clear leaders in the SGML industry which, in 1993, was estimated to be worth US $520 million and was projected to grow to over US $1.46 billion by 1998.

A wide variety of tools can be used to create SGML systems. The SGML industry can be separated into the following categories:

Mainstream Authoring consists of the key word processing vendors like Lotus, WordPerfect and Microsoft.

SGML Editing and Publishing includes traditional SGML authoring tools like ArborText, Interleaf, FrameBuilder and SoftQuad Author/Editor.

SGML Conversions is one of the largest sectors in the market today because many companies are converting legacy data from mainframes, or documents created with mainstream word processors, into SGML.

Electronic Delivery is widely regarded as the most compelling reason companies are moving to SGML. Electronic delivery enables users to retrieve information on-line using an intelligent document viewer.

Document Management may one day drive a major part of the overall SGML industry.

SGML Document Repositories is one of the cornerstone technologies that will affect the progress of SGML as a data standard.

Since 1998, almost all development in SGML has been focussed on XML - a simple (and therefore easier to understand and implement) subset of SGML.

"ISO 8879:1986//ENTITIES Added Latin 1//EN" defines some characters. [How are these related to ISO 8859-1?].

ISO catalogue entry.

SGML parsers are available from VU (NL), FSU, and UIO (Norway).

See also sgmls.

Usenet newsgroup: news:comp.text.sgml.

["The SGML Handbook", Charles F. Goldfarb, Clarendon Press, 1991, ISBN 0198537379. (Full text of the ISO standard plus extensive commentary and cross-referencing. Somewhat cheaper than the ISO document)].

["SGML - The User's Guide to ISO 8879", J.M. Smith et al, Ellis Horwood, 1988].

[Example of some SGML?]