XSD that prohibits strings from starting or ending with whitespace?

Time:11-19

We validate XML files against an XSD schema before loading them. The schema currently accepts blank spaces at the beginning and end of a string. We need to reject leading and trailing spaces only; spaces in the middle of the string, between words, must still be accepted.

Example: SOMEXMLFIELD="STACK OVER FLOW"

To do this, we configured the following XSD pattern for schema validation.

Example : <xs:pattern value="^[A-Za-z0-9 _.,']*[A-Za-z0-9_.,'] [A-Za-z0-9 _.,']*$"/>

Can anyone suggest how to reject blank spaces at the beginning and end of the string only? (Note: spaces between words must still be accepted.)

CodePudding user response:

This XSD,

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  elementFormDefault="qualified">
  <xs:element name="e">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[^\s]?"/>
        <xs:pattern value="[^\s].*[^\s]"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>

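will allow e to have interior whitespace but not begin or end with whitespace. The two xs:pattern facets in the same restriction act as alternatives: a value is valid if it matches either one.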
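To make the effect concrete, here is a hand-checked sketch of a few instance values against the schema above. Each `<e>` stands for the root of its own instance document; the sample values are illustrative, not taken from the original post.

```xml
<!-- Each <e> below is the root element of a separate instance document. -->
<e>STACK OVER FLOW</e>  <!-- valid: whitespace only in the interior -->
<e>A</e>                <!-- valid: single non-whitespace character (first pattern) -->
<e></e>                 <!-- valid: empty string (first pattern) -->
<e> STACK</e>           <!-- invalid: leading space -->
<e>STACK </e>           <!-- invalid: trailing space -->
```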

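Note that XSD regex patterns are implicitly anchored, so do not use ^ and $ for anchoring. XSD gives them no anchoring meaning, so a pattern that contains them typically expects literal ^ and $ characters in the value.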

Reference:

Note: Unlike some popular regular expression languages (including those defined by Perl and standard Unix utilities), the regular expression language defined here implicitly anchors all regular expressions at the head and tail, as the most common use of regular expressions in ·pattern· is to match entire literals.
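As a small illustration of the implicit anchoring described in the note, the two facet fragments below contrast a Perl-style pattern with its XSD form. The character class `[A-Z]+` is an arbitrary example, not part of the question, and the `xs` prefix is assumed to be bound as in the schema above.

```xml
<!-- Perl-style habit: in XSD, ^ and $ do not act as anchors here -->
<xs:pattern value="^[A-Z]+$"/>

<!-- XSD form: the pattern must already match the entire value -->
<xs:pattern value="[A-Z]+"/>
```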


Update: @Thefourthbird had a good point that 1 and 2 character strings without leading/trailing whitespace probably ought to be allowed. I threw in the empty string too.

CodePudding user response:

If you also want to allow 1- and 2-character strings while keeping the original character set, start with one character from a class that excludes the space. Then optionally match any run of allowed characters (space included), followed by one final character from the space-free class.

[A-Za-z0-9_.,']([A-Za-z0-9 _.,']*[A-Za-z0-9_.,'])?
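Placed in a schema, this pattern might look like the sketch below. The type name `TrimmedText` and the element name `record` are hypothetical; only the attribute name `SOMEXMLFIELD` comes from the question, and declaring it as an attribute is an assumption based on how the example is written there.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical reusable type: no leading or trailing space allowed -->
  <xs:simpleType name="TrimmedText">
    <xs:restriction base="xs:string">
      <!-- First and last characters come from the class without the space -->
      <xs:pattern value="[A-Za-z0-9_.,']([A-Za-z0-9 _.,']*[A-Za-z0-9_.,'])?"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element carrying the attribute from the question -->
  <xs:element name="record">
    <xs:complexType>
      <xs:attribute name="SOMEXMLFIELD" type="TrimmedText"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Unlike the two-pattern schema in the first answer, this pattern requires at least one character, so an empty value is rejected.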


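Or in short, with one caveat: in XSD, \w is defined through Unicode categories rather than as [A-Za-z0-9_]. It matches non-ASCII letters and digits as well, and it excludes punctuation, which includes the underscore. Add _ to the classes explicitly if underscores must be accepted:

[\w.,']([\w .,']*[\w.,'])?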