XSD Schema: Boundaries / data-type for xs:maxLength for simple data-type restrictions

Time:12-18

I am currently working with an XSD schema that I did not author and have no control over. The schema defines a data type as a restricted string:

<xs:simpleType>
  <xs:restriction base="xs:string">
    <xs:minLength value="1" />
    <xs:maxLength value="9999999999999999" />
  </xs:restriction>
</xs:simpleType>

I get a run-time error in .NET when working with this, caused by the maxLength value exceeding the range of a 32-bit integer. So now I am trying to figure out whether this is actually a valid XSD schema.
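To make the overflow concrete, here is a minimal sketch of the arithmetic. Python is used only for illustration (the actual error occurs in .NET), and the names `INT32_MAX`, `fits_in_int32` and `fits_in_int64` are made up for this example.

```python
# Upper bound of a signed 32-bit integer (the range of System.Int32 in .NET).
INT32_MAX = 2**31 - 1  # 2147483647

# The xs:maxLength value taken from the schema above.
max_length = int("9999999999999999")

# The value is far outside the 32-bit range, but would fit in 64 bits.
fits_in_int32 = max_length <= INT32_MAX
fits_in_int64 = max_length <= 2**63 - 1

print(fits_in_int32)  # False
print(fits_in_int64)  # True
```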

I have tried searching for documentation regarding the data-type allowed for the xs:maxLength attribute, but have come up short.

CodePudding user response:

The official spec at https://www.w3.org/TR/xmlschema-2/#rf-maxLength says:

[Definition:] maxLength is the maximum number of units of length, where units of length varies depending on the type that is being ·derived· from. The value of maxLength ·must· be a nonNegativeInteger.

And for nonNegativeInteger it says in https://www.w3.org/TR/xmlschema-2/#nonNegativeInteger:

[Definition:] nonNegativeInteger is ·derived· from integer by setting the value of ·minInclusive· to be 0. This results in the standard mathematical concept of the non-negative integers. The ·value space· of nonNegativeInteger is the infinite set {0,1,2,...}. The ·base type· of nonNegativeInteger is integer.

3.3.20.1 Lexical representation

nonNegativeInteger has a lexical representation consisting of an optional sign followed by a finite-length sequence of decimal digits (#x30-#x39). If the sign is omitted, the positive sign ("+") is assumed. If the sign is present, it must be "+" except for lexical forms denoting zero, which may be preceded by a positive ("+") or a negative ("-") sign. For example: 1, 0, 12678967543233, +100000.

Based on that, I don't see any restriction on the value space: as an infinite set, it allows any non-negative integer.

Therefore I think the schema itself is valid.
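As a rough illustration of the lexical rule quoted above, the following sketch checks whether a string is an acceptable nonNegativeInteger literal. The function name `is_non_negative_integer` is invented for this example, and the sketch assumes the value has already been whitespace-normalized; it is not taken from any real validator.

```python
import re

# Optional sign followed by one or more decimal digits (#x30-#x39 only).
_LEXICAL = re.compile(r"[+-]?[0-9]+")

def is_non_negative_integer(lexical: str) -> bool:
    """Return True if `lexical` is a valid nonNegativeInteger literal."""
    if _LEXICAL.fullmatch(lexical) is None:
        return False
    # A minus sign is only allowed on forms denoting zero, e.g. "-0".
    if lexical.startswith("-"):
        return int(lexical) == 0
    return True

# There is no upper bound: the literal from the schema is accepted.
print(is_non_negative_integer("9999999999999999"))
```

Note that nothing in the check depends on how many digits the literal has, which matches the reading that the value space is unbounded.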

That obviously doesn't help with the constraints any schema validator implementation will ultimately impose by choosing a data type in a programming language or platform to represent the value.

CodePudding user response:

XSD 1.1 specifies (in part 2, §5.4): All ·minimally conforming· processors must support decimal values whose absolute value can be expressed as i / 10^k, where i and k are nonnegative integers such that i < 10^16 and k ≤ 16 (i.e., those expressible with sixteen total digits).

So this 16-digit integer is valid.
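A quick way to check a value against the minimum limit quoted from §5.4 is sketched below. The helper name `within_minimum_decimal_support` is hypothetical, and it only covers the nonnegative i and k used in the quoted rule.

```python
def within_minimum_decimal_support(i: int, k: int = 0) -> bool:
    """True if i / 10^k falls within the minimum range quoted from XSD 1.1 part 2, §5.4."""
    return 0 <= i < 10**16 and 0 <= k <= 16

# The maxLength value from the schema is an integer, so k = 0.
print(within_minimum_decimal_support(9999999999999999))  # True
# One more and the value would need seventeen digits.
print(within_minimum_decimal_support(10**16))            # False
```

So 9999999999999999 is the largest integer that sits inside the guaranteed range.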

It also says: For other infinite types such as string, hexBinary, and base64Binary, no minimum implementation limits are specified.

My interpretation of this is that the processor is allowed to impose a limit on the length of strings, and the spec does not prescribe any constraints on that limit. In practice I suspect most implementations (especially those written 20 years ago) are unlikely to allow strings longer than 2^31 characters.

I would have thought it sensible that if the schema specifies a maxLength that is greater than the implementation-defined limit, it will be ignored; but there's nothing in the spec to explicitly say so.
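The behaviour I have in mind could be sketched roughly as follows. This is purely an assumption about what a sensible processor might do, not what the spec requires or what any particular processor does; `IMPLEMENTATION_MAX_LENGTH`, `effective_max_length` and `satisfies_max_length` are hypothetical names.

```python
from typing import Optional

# Assumed implementation limit on string length (hypothetical; no spec mandates it).
IMPLEMENTATION_MAX_LENGTH = 2**31 - 1

def effective_max_length(declared: int) -> Optional[int]:
    """Return the maxLength to enforce, or None if the facet can be ignored."""
    if declared < 0:
        raise ValueError("maxLength must be a nonNegativeInteger")
    if declared > IMPLEMENTATION_MAX_LENGTH:
        # No string this processor can hold could ever violate the facet.
        return None
    return declared

def satisfies_max_length(value: str, declared: int) -> bool:
    """Check a string against a declared maxLength, ignoring oversized facets."""
    limit = effective_max_length(declared)
    return limit is None or len(value) <= limit

print(effective_max_length(9999999999999999))         # None
print(satisfies_max_length("abc", 9999999999999999))  # True
```

The point of the design is that an oversized facet never has to be stored in a 32-bit field at all, so reading the schema cannot overflow.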

The Microsoft processor implements XSD 1.0, which as far as I can see says nothing about limits at all. This is in the tradition of early W3C specifications: the XML spec, for example, says nothing about limits on the length of element and attribute names. In practice all processors are going to impose some kind of limit, and it would be hard to argue that that makes them non-conformant.
