N1896Rev

ISO/IEC JTC1/SC18/WG8

Document Processing and Related Communication:

Document Description and Processing Languages

TITLE: TC for Extended Naming Rules for SGML
SOURCE: WG8
PROJECT: JTC1.18.15.1
PROJECT EDITOR: Charles F. Goldfarb
STATUS: ISO Approved Final Text
ACTION: For information
SUMMARY OF MAJOR POINTS: This Technical Corrigendum adds a brief annex to ISO 8879 to meet an urgent need for extended naming rules for non-Latin scripts in support of the following statements in clause 0.2:
  1. There must be no national language bias.

    The characters used for names can be augmented by any special national characters.

This TC does not affect existing SGML documents or products. It affects only those SGML documents and products that choose to support the extended naming rules option.

DATE: 3 Dec 1996
DISTRIBUTION: WG8 and Liaisons
REFER TO: ISO 8879
REPLY TO: Dr. James D. Mason
(ISO/IEC JTC1/SC18/WG8 Convenor)
Oak Ridge National Laboratory
Information Management Services
Bldg. 2506, M.S. 6302, P.O. Box 2008
Oak Ridge, TN 37831-6302 U.S.A.
Telephone: +1 423 574-6973
Facsimile: +1 423 574-6983
Network: masonjd@ornl.gov
http://www.ornl.gov/sgml/wg8/wg8home.htm
ftp://ftp.ornl.gov/pub/sgml/wg8/

TC for Extended Naming Rules

Add the following normative annex to ISO 8879.

Annex J (normative)
Extended Naming Rules

This annex describes an optional extension of SGML known as the "Extended Naming Rules". The extension should be used only in SGML documents for which the normal naming rules are unsuitable (usually because of the size of the natural language character set). An SGML system need not support these Extended Naming Rules in order to be a conforming SGML system.

This annex is phrased in terms of revisions to be made to the body of this International Standard. However, these revisions are applicable only when the Extended Naming Rules are in use.

To distinguish SGML declarations that use this extension from those that do not, the minimum literal in productions [171] and [200] of ISO 8879:1986 must be modified to read "ISO 8879:1986 (ENR)". To accomplish this, add the following sentence to the paragraph immediately following production [171] and to the second paragraph following production [200]:

However, when extended naming rules are used, the minimum data must be "ISO 8879:1986 (ENR)".
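
As an illustration only, a tool could check which form of the minimum data a declaration carries before deciding whether to apply the extended rules. The sketch below is not part of this corrigendum: the function name `declares_extended_naming` is hypothetical, it assumes no comments appear between `<!SGML` and the literal, and it normalizes the literal simply by collapsing runs of white space, which is a simplification of how ISO 8879 interprets a minimum literal.

```python
import re

# Minimum data required by this annex when the extended naming rules are used.
ENR_MINIMUM_DATA = "ISO 8879:1986 (ENR)"

# First quoted literal after the "<!SGML" opener; either quote style is accepted.
_FIRST_LITERAL = re.compile(r"<!SGML\s+(?:\"([^\"]*)\"|'([^']*)')")


def declares_extended_naming(sgml_declaration: str) -> bool:
    """Return True if the declaration's minimum literal is the ENR form."""
    match = _FIRST_LITERAL.search(sgml_declaration)
    if match is None:
        return False
    literal = match.group(1) if match.group(1) is not None else match.group(2)
    # Simplified normalization (an assumption of this sketch):
    # trim the ends and collapse each run of white space to one space.
    minimum_data = " ".join(literal.split())
    return minimum_data == ENR_MINIMUM_DATA


if __name__ == "__main__":
    print(declares_extended_naming('<!SGML "ISO 8879:1986 (ENR)" CHARSET'))
    print(declares_extended_naming('<!SGML "ISO 8879:1986" CHARSET'))
```

A declaration carrying the unmodified literal "ISO 8879:1986" is reported as not using the extension, which matches the statement that existing documents are unaffected.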

The extended naming rules are as follows.

For many languages the distinction made in production [189] between uppercase and lowercase is not relevant. It is, therefore, necessary to modify clause 13.4.5 to allow both for an extended character set and for the use of character sets that do not have different cases. The changes required, in the order of their occurrence in 13.4.5, are:

  1. Replace production [189] with:
    [189] naming rules =
      "NAMING", ps+,
      "LCNMSTRT", (ps+, extended naming value)+, ps+,
      "UCNMSTRT", (ps+, extended naming value)+, ps+,
      ("NAMESTRT", (ps+, extended naming value)+, ps+)?,
      "LCNMCHAR", (ps+, extended naming value)+, ps+,
      "UCNMCHAR", (ps+, extended naming value)+, ps+,
      ("NAMECHAR", (ps+, extended naming value)+, ps+)?,
      "NAMECASE", ps+,
      "GENERAL", ps+, ("NO" | "YES"), ps+,
      "ENTITY", ps+, ("NO" | "YES")
  2. In the "where" list, change each occurrence of the phrase "in the literals (if any)" to "identified by the extended naming value (if any)".
  3. Add two new keywords to the "where" list:
    NAMESTRT
    means that each character identified by the extended naming value (if any) is assigned both to LCNMSTRT and as the associated upper-case form in UCNMSTRT.
    NAMECHAR
    means that each character identified by the extended naming value (if any) is assigned both to LCNMCHAR and as the associated upper-case form in UCNMCHAR.
  4. At the end of the clause, add:

    [189.1] extended naming value = parameter literal | character number | character range

    A character number may be used to specify a character that is defined in the syntax-reference character set but is not permitted in an SGML declaration.

    [189.2] character range = character number, ps*, minus, ps*, character number

    Specifying a character range is equivalent to specifying every character number from (and including) the character number that starts the range to (and including) the character number that ends the range.
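
The changes above can be illustrated with a short, non-normative sketch that expands a run of extended naming values into character numbers and then applies the NAMESTRT rule. Everything named here (`expand_naming_values`, `build_name_start_tables`) is hypothetical. The sketch assumes that the values contain no comments, that parameter literals contain no character references, that character numbers coincide with Python code points, and that a range whose end precedes its start is an error (the text above does not say how to treat that case).

```python
import re

# One token: a quoted literal, a character number, or the minus delimiter.
_TOKEN = re.compile(r"\"([^\"]*)\"|'([^']*)'|(\d+)|(-)")


def _tokenize(text):
    """Split a run of extended naming values into (kind, value) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():  # white space stands in for ps
            pos += 1
            continue
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.group(3) is not None:
            tokens.append(("num", int(m.group(3))))
        elif m.group(4) is not None:
            tokens.append(("minus", None))
        else:
            literal = m.group(1) if m.group(1) is not None else m.group(2)
            tokens.append(("lit", literal))
        pos = m.end()
    return tokens


def expand_naming_values(text):
    """Return the character numbers identified by the given values, in order."""
    tokens, numbers, i = _tokenize(text), [], 0
    while i < len(tokens):
        kind, value = tokens[i]
        if kind == "lit":
            # Assumption: each literal character maps to its code point.
            numbers.extend(ord(ch) for ch in value)
            i += 1
        elif kind == "num":
            is_range = i + 1 < len(tokens) and tokens[i + 1][0] == "minus"
            if not is_range:
                numbers.append(value)
                i += 1
                continue
            if i + 2 >= len(tokens) or tokens[i + 2][0] != "num":
                raise ValueError("character range lacks an ending number")
            end = tokens[i + 2][1]
            if end < value:  # assumption: reversed ranges are rejected
                raise ValueError("character range ends before it starts")
            # [189.2]: both end points are included.
            numbers.extend(range(value, end + 1))
            i += 3
        else:
            raise ValueError("minus without a starting character number")
    return numbers


def build_name_start_tables(lcnmstrt, ucnmstrt, namestrt=""):
    """Return (LCNMSTRT, UCNMSTRT) after applying the NAMESTRT rule."""
    lower = expand_naming_values(lcnmstrt)
    upper = expand_naming_values(ucnmstrt)
    if len(lower) != len(upper):
        raise ValueError("LCNMSTRT and UCNMSTRT must pair up one to one")
    caseless = expand_naming_values(namestrt)
    # NAMESTRT: each character joins LCNMSTRT and serves as its own
    # associated upper-case form in UCNMSTRT.
    return lower + caseless, upper + caseless


if __name__ == "__main__":
    print(expand_naming_values('"-." 183 192 - 195'))
    print(build_name_start_tables('""', '""', "58 95"))
```

The same expansion would apply to NAMECHAR with LCNMCHAR and UCNMCHAR. Keeping range expansion separate from table building mirrors the split between productions [189.1] and [189.2] and the "where" list entries.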