I am chewing my way through the latest XML 1.0 specification, where an XML document is defined as follows:
[1] document ::= prolog element Misc*
...
[22] prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
...
[28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'
The spec states that
[Definition: An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it.]
and well-formed if "It meets all the well-formedness constraints given in this specification." (see definition).
The definition of the document type declaration has two well-formedness constraints and one validity constraint, so if it is omitted the XML document cannot be considered valid.
There is a minimal XML document example in there,
<?xml version="1.0"?>
<greeting>Hello, world!</greeting>
and I understand why it is well-formed but not valid, but it still doesn't explain how the DTD can be optional if it is required for an XML document to be valid.
Background for this question
I started reading the XML spec because I wanted a better understanding before getting into DocBook 5, but its manual states that "DocBook V5.0 is thus defined using a powerful schema language called RELAX NG" so it "does not depend on DTDs anymore", and the example shown omits the DTD completely too.
CodePudding user response:
The W3C XML Recommendation defines only one XML schema language: DTD. Others exist, including XSD, RELAX NG, and Schematron. In fact, DTDs are rarely used to define modern XML schemas because of their limited expressiveness.
The concept of validity has been extended to apply to all XML schemas: An XML document is said to be valid against an XML schema if it adheres to the grammar and content constraints defined by the schema.
- A DTD can be omitted for the same reason that an XML document need not be associated with any XML schema: Adherence to the rules of well-formedness is often sufficient for applications.
- An XML declaration can be omitted because its default values are sufficient to support the well-formedness rules throughout the rest of the Recommendation.
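The two points above can be checked with a small sketch. It assumes Python's standard-library xml.etree.ElementTree, which only checks well-formedness and does not validate against a DTD; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Hypothetical helper: True if a non-validating parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The minimal example from the spec: XML declaration, no DTD.
with_decl = '<?xml version="1.0"?>\n<greeting>Hello, world!</greeting>'
# The same document with the XML declaration omitted as well.
without_decl = '<greeting>Hello, world!</greeting>'
# Mismatched end tag: this breaks a well-formedness constraint.
broken = '<greeting>Hello, world!</greting>'

print(is_well_formed(with_decl))     # True
print(is_well_formed(without_decl))  # True
print(is_well_formed(broken))        # False
```

Neither of the first two documents has a DTD, so neither is valid in the spec's sense, yet both are accepted because only well-formedness is being checked.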
CodePudding user response:
"explain how the DTD can be optional if it is required for an XML document to be valid."
Well, validity is optional, therefore the DTD is optional.
I think you're reading too much into the word "valid". Let's suppose that instead of calling it "validity", they had called it "cuteness". A document is cute if it has a DTD and matches the rules defined in that DTD. Not all documents are cute; cuteness is optional, therefore the DTD is optional.
As for your final paragraph about DocBook and RELAX NG: validity as defined in the XML spec means DTD-based validity. The wider, extended concept of validity allows the document structure to be defined in a constraint language other than DTD, for example XSD or RELAX NG. A document with no DOCTYPE/DTD cannot be valid in the narrow sense of the XML spec, but it can be valid in the wider sense that allows for alternative schema languages.
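To make "validity is optional" concrete, here is a minimal sketch, again assuming Python's standard-library xml.etree.ElementTree as a stand-in for any non-validating parser. The document below is an invented example: it carries a DTD that allows only text inside greeting, then breaks that rule with a b child element. That makes it invalid in the spec's narrow sense, but it is still well-formed, so the parser accepts it.

```python
import xml.etree.ElementTree as ET

# Illustrative document: the internal DTD subset says <greeting> holds only
# character data, but the content contains an undeclared <b> element.
doc = """<?xml version="1.0"?>
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
<greeting><b>Hello</b>, world!</greeting>"""

# A non-validating parser reads the DOCTYPE but does not enforce it.
root = ET.fromstring(doc)

print(root.tag)                        # greeting
print([child.tag for child in root])   # ['b']
```

Checking the document against the DTD, or against a RELAX NG or XSD schema, is a separate step that needs a validating tool; the standard-library parser used here does not do it.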