For several months now I have been conducting an experiment known as the "SGML Purity Test". Its objectives were
The results of the experiment were mixed. While the first objective was well received, there were several problems with the second. The methodology proved to be unworkable in practice and, what is worse, was perceived to be "anti-vendor".
That alone is sufficient reason to abandon the experiment. SGML users and vendors have always worked together for the furtherance of the technology. For example, although I'm the Honorary Technical Adviser to the SGML Users' Group, I'm also an active member of the vendor consortium SGML Open and served on the Board of Industry Advisors that helped it get started. And both vendors and users participate in the standardization activities that I lead.
The very name "purity test" was also a source of difficulty. The term "purity" is undeniably meaningful in a technical context, where the term "information purity" neatly describes SGML's unique ability to distinguish the "true information" content of a document from the "style information" that is used to render it. However, in the context of product categorization, the term proved to be inappropriate and confusing.
Another problem, related to the educational objective, was that the site put all its emphasis on just one vital but poorly understood aspect of SGML. There are several others, equally deserving of public attention.
So this site is in the process of being transformed. It will now be devoted to important but InFrequently Asked Questions (InFAQs) about SGML. The purity test has been terminated. The youths, maidens, and unicorns have returned to the Forest Primeval where they can forever wander the paths, dance round the maypole, mix metaphors, and bask in the sunshine. There never was a place for them in the real world.
SGML is designed to make your information last longer than the systems that created it. Such longevity also implies immunity to short-term changes -- such as a change from one application program to another -- so SGML is also inherently designed for re-purposing and portability. And the same technical characteristics of SGML that make these long-term benefits possible also provide near-term benefits in document production: shorter lead times, lower costs, more flexible processing, and better control.
But the real key to SGML's success -- both politically and technically -- is the fact that SGML is a bona fide International Standard, not the creation of a dominant vendor or a consortium. I say "politically" because large users feel they can safely invest millions to convert to SGML because the SGML specification is stable and is maintained by a neutral organization. I say "technically" because the concept of conformance to a standard is what makes SGML work.
Here's how conformance works. The SGML standard defines the requirements for "conforming SGML documents". These requirements are remarkably flexible. In fact, SGML isn't so much a standard for "what you have to do" as a standard for "describing what you've done and why you chose to do it". (So SGML conformance doesn't force you to be a conformist!)
The standard also sets requirements for "conforming SGML systems" -- but these are defined principally in terms of their ability to process conforming SGML documents. The objective is for the user to have a library of conforming SGML documents and be able to use any conforming SGML system to process those documents in a multitude of ways -- regardless of how many previous processes have taken place.
Such a demanding set of objectives for SGML has necessarily resulted in a non-trivial language design. SGML has some subtle details, and the implications of failing to address them properly in products are not as widely understood as they should be. So, just as it is vital to the effectiveness of SGML that conformance be defined rigorously, it is equally important that conforming products be identifiable unambiguously. For this reason, the SGML standard requires a conforming product to be identified prominently as "An SGML System Conforming to International Standard ISO 8879 -- Standard Generalized Markup Language".
The standard calls such a product a "conforming SGML system". It is required to have a description of its SGML capabilities, including its ability to support optional features, in a standardized format called a "system declaration". A conforming product isn't forced to support any of the optional facilities of SGML, but if it does, it must support them according to the requirements of the standard.
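To make the role of the system declaration concrete, here is a minimal sketch in Python of how a user might compare the optional features a document relies on with those a product declares. It is an illustration under stated assumptions, not the format defined by ISO 8879: the `SystemDeclaration` class, the `unsupported_features` helper, and both products are hypothetical, and a real system declaration covers more than a list of features. Only the feature names (OMITTAG, SHORTTAG, SUBDOC, FORMAL) are taken from SGML itself.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SystemDeclaration:
    """Hypothetical, much-simplified stand-in for a product's system declaration."""
    product: str
    supported_features: FrozenSet[str]


def unsupported_features(required: FrozenSet[str],
                         system: SystemDeclaration) -> FrozenSet[str]:
    """Return the optional features a document needs that the product lacks."""
    return required - system.supported_features


# Optional features that one (invented) document makes use of.
document_features = frozenset({"OMITTAG", "SHORTTAG", "SUBDOC"})

# Two invented products with different levels of optional-feature support.
editor = SystemDeclaration(
    "Hypothetical Editor", frozenset({"OMITTAG", "SHORTTAG"}))
formatter = SystemDeclaration(
    "Hypothetical Formatter",
    frozenset({"OMITTAG", "SHORTTAG", "SUBDOC", "FORMAL"}))

print(sorted(unsupported_features(document_features, editor)))     # ['SUBDOC']
print(sorted(unsupported_features(document_features, formatter)))  # []
```

In this sketch both products could be conforming, since support for optional features is not required. The declaration simply tells you in advance that the first one cannot handle this particular document.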
A conforming product's documentation must also meet certain requirements, designed to minimize user retraining when starting to use additional SGML products. These requirements involve consistent use of standardized terminology, accurately distinguishing features of SGML from features of the product, and so on.
Note: Just claiming conformance doesn't prove conformance. Validating a product's conformance claims is the role of "SGML conformance testing", a rigorous process governed by an International Standard of its own. If you believe that a conforming SGML product doesn't actually conform, that is a bug you can report to the vendor.
To the standard, all other products are "non-conforming". You may see them described with terms like "SGML compliant", "standards-based", "SGML aware", etc., but these terms are not defined in ISO 8879. There is an ever-increasing number of such products, with varying degrees of SGML support. Many products, though non-conforming, can process a large variety of conforming documents. Some may even have the necessary functionality for conformance, but don't formally claim it.
Certainly.
Many popular SGML products are non-conforming, and have proven highly useful -- even essential -- in SGML environments. Not all SGML users need to achieve all of the objectives that SGML is designed for. They are willing and able to trade off the benefits of conformance in favor of cost savings or other product functionality.
Only you can decide whether the reasons for a product's non-conformance are relevant to your own expected use of the product. These reasons could include lack of support for a facility of SGML that the standard considers to be mandatory, inability to recognize some detailed aspects of the markup, or documentation that doesn't meet the requirements of the standard.
A conforming product isn't necessarily the best product for your purposes. Many factors govern an intelligent choice of products for an SGML system. SGML conformance is only one of these, and it can never be the only one. You need to consider functionality, cost, performance, service, vendor reputation, and so on. And no product review or third-party recommendation can substitute for your own careful assessment of a product's applicability to your enterprise's unique requirements.
The major benefit of SGML conformance, to both users and vendors, has nothing to do with technicalities. It is that SGML is defined by a bona fide de jure International Standard -- maintained by a strong standards organization that is recognized by governments and whose standards have the force of law in many countries. Every web user has seen what happens to a standard that does not have such stability and authority -- even one of high quality and major importance. Dominant vendors try to run away with it, competitors are reduced to playing catch-up instead of competing equally, and users have to cope with multiple conflicting variations instead of a true standard. The procedures of the ISO and the national standards bodies that belong to it have protected SGML from that sort of chaos.
Well, if you can't wait to see the answer, I'll tell you, but it will be a while yet before I can provide more of an explanation. Here it is, in brief:
"Information purity" -- that trait of an information representation (data format, notation, etc.) that allows its users to distinguish what they consider to be the "true information" content of a document from the "style information" that is used to render it, and that allows tools to enforce and preserve that distinction. Information purity is what allows SGML documents to be reusable and portable.
I'd like to thank Sarah Tourville and the rest of her team at SAGRELTO Enterprises, Inc. for building the site, with a special thank you to chief programmer Ron Picker. Peter Newcomb, Derek Denny-Brown, and the other folks at TechnoTeacher, Inc. provide the disk space, the high-speed net connection, and site administration. Andrew Goldfarb, of Eye-Tech Graphics, provides artwork on demand. Sarah, Peter, and Derek donate their work because they believe in SGML. Andrew has no choice because he's my son. A-link Network Services provides my personal Internet access; they get paid.
Copyright (C)1996 Charles F. Goldfarb. All rights reserved. "SGML Source", "Infrequently Asked Questions" and "InFAQs" are service marks of Charles F. Goldfarb. I take no responsibility for the accuracy of the contents of this site. I've collected and disseminated this information in an attempt to be helpful, but you use it at your own risk. If you're smart, you won't use it at all without verifying it for yourself. For these reasons, and out of respect for intellectual property (including my own), information on this site cannot be used or cited for any commercial purpose. Any questions, comments, or suggestions? Send me mail.