| | <?xml version='1.0' encoding='ISO-8859-1' standalone='no'?> |
| | <!DOCTYPE spec SYSTEM "dtds/spec.dtd" [ |
| | |
| | <!-- LAST TOUCHED BY: Tim Bray, 8 February 1997 --> |
| | |
| | <!-- The words 'FINAL EDIT' in comments mark places where changes |
| | need to be made after approval of the document by the ERB, before |
| | publication. --> |
| | |
| | <!ENTITY XML.version "1.0"> |
| | <!ENTITY doc.date "10 February 1998"> |
| | <!ENTITY iso6.doc.date "19980210"> |
| | <!ENTITY w3c.doc.date "02-Feb-1998"> |
| | <!ENTITY draft.day '10'> |
| | <!ENTITY draft.month 'February'> |
| | <!ENTITY draft.year '1998'> |
| | |
| | <!ENTITY WebSGML |
| | 'WebSGML Adaptations Annex to ISO 8879'> |
| | |
| | <!ENTITY lt "&#38;#60;"> |
| | <!ENTITY gt ">"> |
| | <!ENTITY xmlpio "'<?xml'"> |
| | <!ENTITY pic "'?>'"> |
| | <!ENTITY br "\n"> |
| | <!ENTITY cellback '#c0d9c0'> |
| | <!ENTITY mdash "--"> <!-- —, but nsgmls doesn't grok hex --> |
| | <!ENTITY com "--"> |
| | <!ENTITY como "--"> |
| | <!ENTITY comc "--"> |
| | <!ENTITY hcro "&#38;#x"> |
| | <!-- <!ENTITY nbsp "�"> --> |
| | <!ENTITY nbsp "&#160;"> |
| | <!ENTITY magicents "<code>amp</code>, |
| | <code>lt</code>, |
| | <code>gt</code>, |
| | <code>apos</code>, |
| | <code>quot</code>"> |
| | |
| | <!-- audience and distribution status: for use at publication time --> |
| | <!ENTITY doc.audience "public review and discussion"> |
| | <!ENTITY doc.distribution "may be distributed freely, as long as |
| | all text and legal notices remain intact"> |
| | |
| | ]> |
| |
|
| | |
| | <?VERBATIM "eg" ?> |
| |
|
| | <spec> |
| | <header> |
| | <title>Extensible Markup Language (XML) 1.0</title> |
| | <version></version> |
| | <w3c-designation>REC-xml-&iso6.doc.date;</w3c-designation> |
| | <w3c-doctype>W3C Recommendation</w3c-doctype> |
| | <pubdate><day>&draft.day;</day><month>&draft.month;</month><year>&draft.year;</year></pubdate> |
| |
|
| | <publoc> |
| | <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;"> |
| | http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;</loc> |
| | <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.xml"> |
| | http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.xml</loc> |
| | <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.html"> |
| | http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.html</loc> |
| | <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.pdf"> |
| | http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.pdf</loc> |
| | <loc href="http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.ps"> |
| | http://www.w3.org/TR/1998/REC-xml-&iso6.doc.date;.ps</loc> |
| | </publoc> |
| | <latestloc> |
| | <loc href="http://www.w3.org/TR/REC-xml"> |
| | http://www.w3.org/TR/REC-xml</loc> |
| | </latestloc> |
| | <prevlocs> |
| | <loc href="http://www.w3.org/TR/PR-xml-971208"> |
| | http://www.w3.org/TR/PR-xml-971208</loc> |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | </prevlocs> |
| | <authlist> |
| | <author><name>Tim Bray</name> |
| | <affiliation>Textuality and Netscape</affiliation> |
| | <email |
| | href="mailto:tbray@textuality.com">tbray@textuality.com</email></author> |
| | <author><name>Jean Paoli</name> |
| | <affiliation>Microsoft</affiliation> |
| | <email href="mailto:jeanpa@microsoft.com">jeanpa@microsoft.com</email></author> |
| | <author><name>C. M. Sperberg-McQueen</name> |
| | <affiliation>University of Illinois at Chicago</affiliation> |
| | <email href="mailto:cmsmcq@uic.edu">cmsmcq@uic.edu</email></author> |
| | </authlist> |
| | <abstract> |
| | <p>The Extensible Markup Language (XML) is a subset of |
| | SGML that is completely described in this document. Its goal is to |
| | enable generic SGML to be served, received, and processed on the Web |
| | in the way that is now possible with HTML. XML has been designed for |
| | ease of implementation and for interoperability with both SGML and |
| | HTML.</p> |
| | </abstract> |
| | <status> |
| | <p>This document has been reviewed by W3C Members and |
| | other interested parties and has been endorsed by the |
| | Director as a W3C Recommendation. It is a stable |
| | document and may be used as reference material or cited |
| | as a normative reference from another document. W3C's |
| | role in making the Recommendation is to draw attention |
| | to the specification and to promote its widespread |
| | deployment. This enhances the functionality and |
| | interoperability of the Web.</p> |
| | <p> |
| | This document specifies a syntax created by subsetting an existing, |
| | widely used international text processing standard (Standard |
| | Generalized Markup Language, ISO 8879:1986(E) as amended and |
| | corrected) for use on the World Wide Web. It is a product of the W3C |
| | XML Activity, details of which can be found at <loc |
| | href='http://www.w3.org/XML'>http://www.w3.org/XML</loc>. A list of |
| | current W3C Recommendations and other technical documents can be found |
| | at <loc href='http://www.w3.org/TR'>http://www.w3.org/TR</loc>. |
| | </p> |
| | <p>This specification uses the term URI, which is defined by <bibref |
| | ref="Berners-Lee"/>, a work in progress expected to update <bibref |
| | ref="RFC1738"/> and <bibref ref="RFC1808"/>. |
| | </p> |
| | <p>The list of known errors in this specification is |
| | available at |
| | <loc href='http://www.w3.org/XML/xml-19980210-errata'>http://www.w3.org/XML/xml-19980210-errata</loc>.</p> |
| | <p>Please report errors in this document to |
| | <loc href='mailto:xml-editor@w3.org'>xml-editor@w3.org</loc>. |
| | </p> |
| | </status> |
| |
|
| |
|
| | <pubstmt> |
| | <p>Chicago, Vancouver, Mountain View, et al.: |
| | World-Wide Web Consortium, XML Working Group, 1996, 1997.</p> |
| | </pubstmt> |
| | <sourcedesc> |
| | <p>Created in electronic form.</p> |
| | </sourcedesc> |
| | <langusage> |
| | <language id='EN'>English</language> |
| | <language id='ebnf'>Extended Backus-Naur Form (formal grammar)</language> |
| | </langusage> |
| | <revisiondesc> |
| | <slist> |
| | <sitem>1997-12-03 : CMSMcQ : yet further changes</sitem> |
| | <sitem>1997-12-02 : TB : further changes (see TB to XML WG, |
| | 2 December 1997)</sitem> |
| | <sitem>1997-12-02 : CMSMcQ : deal with as many corrections and |
| | comments from the proofreaders as possible: |
| | entify hard-coded document date in pubdate element, |
| | change expansion of entity WebSGML, |
| | update status description as per Dan Connolly (am not sure |
| | about reference to Berners-Lee et al.), |
| | add 'The' to abstract as per WG decision, |
| | move Relationship to Existing Standards to back matter and |
| | combine with References, |
| | re-order back matter so normative appendices come first, |
| | re-tag back matter so informative appendices are tagged informdiv1, |
| | remove XXX XXX from list of 'normative' specs in prose, |
| | move some references from Other References to Normative References, |
| | add RFC 1738, 1808, and 2141 to Other References (they are not |
| | normative since we do not require the processor to enforce any |
| | rules based on them), |
| | add reference to 'Fielding draft' (Berners-Lee et al.), |
| | move notation section to end of body, |
| | drop URIchar non-terminal and use SkipLit instead, |
| | lose stray reference to defunct nonterminal 'markupdecls', |
| | move reference to Aho et al. into appendix (Tim's right), |
| | add prose note saying that hash marks and fragment identifiers are |
| | NOT part of the URI formally speaking, and are NOT legal in |
| | system identifiers (processor 'may' signal an error). |
| | Work through: |
| | Tim Bray reacting to James Clark, |
| | Tim Bray on his own, |
| | Eve Maler, |
| |
|
| | NOT DONE YET: |
| | change binary / text to unparsed / parsed. |
| | handle James's suggestion about < in attribute values |
| | uppercase hex characters, |
| | namechar list, |
| | </sitem> |
| | <sitem>1997-12-01 : JB : add some column-width parameters</sitem> |
| | <sitem>1997-12-01 : CMSMcQ : begin round of changes to incorporate |
| | recent WG decisions and other corrections: |
| | binding sources of character encoding info (27 Aug / 3 Sept), |
| | correct wording of Faust quotation (restore dropped line), |
| | drop SDD from EncodingDecl, |
| | change text at version number 1.0, |
| | drop misleading (wrong!) sentence about ignorables and extenders, |
| | modify defin [...] examples with Byte Order Mark. |
| | Add content model as a term and clarify that it applies to both |
| | mixed and element content. |
| | </sitem> |
| | <sitem>1997-06-30 : CMSMcQ : change date, some cosmetic changes, |
| | changes to productions for choice, seq, Mixed, NotationType, |
| | Enumeration. Follow James Clark's suggestion and prohibit |
| | conditional sections in internal subset. TO DO: simplify |
| | production for ignored sections as a result, since we don't |
| | need to worry about parsers which don't expand PErefs finding |
| | a conditional section.</sitem> |
| | <sitem>1997-06-29 : TB : various edits</sitem> |
| | <sitem>1997-06-29 : CMSMcQ : further changes: |
| | Suppress old FINAL EDIT comments and some dead material. |
| | Revise occurrences of % in grammar to exploit Henry Thompson's pun, |
| | especially markupdecl and attdef. |
| | Remove RMD requirement relating to element content (?). |
| | </sitem> |
| | <sitem>1997-06-28 : CMSMcQ : Various changes for 1 July draft: |
| | Add text for draconian error handling (introduce |
| | the term Fatal Error). |
| | RE deleta est (changing wording from |
| | original announcement to restrict the requirement to validating |
| | parsers). |
| | Tag definition of valida [...] it meant 'may or may not'.</sitem> |
| | <sitem>1997-03-21 : TB : massive changes on plane flight from Chicago |
| | to Vancouver</sitem> |
| | <sitem>1997-03-21 : CMSMcQ : correct as many reported errors as possible. |
| | </sitem> |
| | <sitem>1997-03-20 : CMSMcQ : correct typos listed in CMSMcQ hand copy of spec.</sitem> |
| | <sitem>1997 James Clark: |
| | Define the set of characters from which [^abc] subtracts. |
| | Charref should use just [0-9] not Digit. |
| | Location info needs cleaner treatment: remove? (ERB |
| | question). |
| | One example of a PI has wrong pic. |
| | Clarify discussion of encoding names. |
| | Encoding failure should lead to unspecified results; don't |
| | prescribe error recovery. |
| | Don't require exposure of entity boundaries. |
| | Ignore white space in element content. |
| | Reserve entity names of the form u-NNNN. |
| | Clarify relative URLs. |
| | And some of my own: |
| | Correct productions for content model: model cannot |
| | consist of a name, so "elements ::= cp" is no good. |
| | </sitem> |
| | <sitem>1996-11-11 : CMSMcQ : revise for style. |
| | Add new rhs to entity declaration, for parameter entities.</sitem> |
| | <sitem>1996-11-10 : CMSMcQ : revise for style. |
| | Fix / complete section on names, characters. |
| | Add sections on parameter entities, conditional sections. |
| | Still to do: Add compatibility note on deterministic content models. |
| | Finish stylistic revision.</sitem> |
| | <sitem>1996-10-31 : TB : Add Entity Handling section</sitem> |
| | <sitem>1996-10-30 : TB : Clean up term &amp; termdef. Slip in |
| | ERB decision re EMPTY.</sitem> |
| | <sitem>1996-10-28 : TB : Change DTD. Implement some of Michael's |
| | suggestions. Change comments back to //. Introduce language for |
| | XML namespace reservation. Add section on white-space handling. |
| | Lots more cleanup.</sitem> |
| | <sitem>1996-10-24 : CMSMcQ : quick tweaks, implement some ERB |
| | decisions. Characters are not integers. Comments are /* */ not //. |
| | Add bibliographic refs to 10646, HyTime, Unicode. |
| | Rename old Cdata as MsData since it's <emph>only</emph> seen |
| | in marked sections. Call them attribute-value pairs not |
| | name-value pairs, except once. Internal subset is optional, needs |
| | '?'. Implied attributes should be signaled to the app, not |
| | have values supplied by processor.</sitem> |
| | <sitem>1996-10-16 : TB : track down &amp; excise all DSD references; |
| | introduce some EBNF for entity declarations.</sitem> |
| | <sitem>1996-10-?? : consistency check, fix up scraps so |
| | they all parse, get formatter working, correct a few productions.</sitem> |
| | <sitem>1996-10-10/11 : CMSMcQ : various maintenance, stylistic, and |
| | organizational changes: |
| | Replace a few literals with xmlpio and |
| | pic entities, to make them consistent and ensure we can change pic |
| | reliably when the ERB votes. |
| | Drop paragraph on recognizers from notation section. |
| | Add match, exact match to terminology. |
| | Move old 2.2 XML Processors and Apps into intro. |
| | Mention comments, PIs, and marked sections in discussion of |
| | delimiter escaping. |
| | Streamline discussion of doctype decl syntax. |
| | Drop old section of 'PI syntax' for doctype decl, and add |
| | section on partial-DTD summary PIs to end of Logical Structures |
| | section. |
| | Revise DSD syntax section to use Tim's subset-in-a-PI |
| | mechanism.</sitem> |
| | <sitem>1996-10-10 : TB : eliminate name recognizers (and more?)</sitem> |
| | <sitem>1996-10-09 : CMSMcQ : revise for style, consistency through 2.3 |
| | (Characters)</sitem> |
| | <sitem>1996-10-09 : CMSMcQ : re-unite everything for convenience, |
| | at least temporarily, and revise quickly</sitem> |
| | <sitem>1996-10-08 : TB : first major homogenization pass</sitem> |
| | <sitem>1996-10-08 : TB : turn "current" attribute on div type into |
| | CDATA</sitem> |
| | <sitem>1996-10-02 : TB : remould into skeleton + entities</sitem> |
| | <sitem>1996-09-30 : CMSMcQ : add a few more sections prior to exchange |
| | with Tim.</sitem> |
| | <sitem>1996-09-20 : CMSMcQ : finish transcribing notes.</sitem> |
| | <sitem>1996-09-19 : CMSMcQ : begin transcribing notes for draft.</sitem> |
| | <sitem>1996-09-13 : CMSMcQ : made outline from notes of 09-06, |
| | do some housekeeping</sitem> |
| | </slist> |
| | </revisiondesc> |
| | </header> |
| | <body> |
| | <div1 id='sec-intro'> |
| | <head>Introduction</head> |
| | <p><termdef id="dt-xml-proc" term="XML Processor">A software module |
| | called an <term>XML processor</term> is used to read XML documents |
| | and provide access to their content and structure.</termdef> <termdef |
| | id="dt-app" term="Application">It is assumed that an XML processor is |
| | doing its work on behalf of another module, called the |
| | <term>application</term>.</termdef> This specification describes the |
| | required behavior of an XML processor in terms of how it must read XML |
| | data and the information it must provide to the application.</p> |
| | |
| | <div2 id='sec-origin-goals'> |
| | <head>Origin and Goals</head> |
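| | <p>XML was developed by an XML Working Group (originally known as the |
| | SGML Editorial Review Board) formed under the auspices of the World |
| | Wide Web Consortium (W3C) in 1996.</p> |
| | <p>The design goals for XML are:<olist> |
| | <item><p>XML shall be straightforwardly usable over the |
| | Internet.</p></item> |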
| | <item><p>XML shall support a wide variety of applications.</p></item> |
| | <item><p>XML shall be compatible with SGML.</p></item> |
| | <item><p>It shall be easy to write programs which process XML |
| | documents.</p></item> |
| | <item><p>The number of optional features in XML is to be kept to the |
| | absolute minimum, ideally zero.</p></item> |
| | <item><p>XML documents shou |