[RESOLVED] xml not valid against XSD: regular expression


Krc
11-28-2008, 02:31 AM
Dear Members,

When I validate the XML file below (only a small part of it is shown):
<account>
  <CustomerNumber>4999</CustomerNumber>
  <VAT>BE0463855575</VAT>
  <Name>AFFRESH</Name>
  <Street>ZEEL</Street>
  <HouseNumber>1</HouseNumber>
  <ZipCode>5800</ZipCode>
  <Locality>HAL</Locality>
  <Activity>Food</Activity>
</account>

I receive an error saying the HouseNumber doesn't match the pattern: "^[A-Z,a-z,0-9,/,-., ]{0,10}$"

The relevant part of the XSD:
<xs:simpleType name="houseNumberType">
  <xs:restriction base="xs:string">
    <xs:pattern value="^[A-Z,a-z,0-9,/,-., ]{0,10}$" />
  </xs:restriction>
</xs:simpleType>


Does anybody know how I can make my XML valid?

Thanks!

rpgfan3233
11-28-2008, 03:56 AM
The metacharacters ^ and $ don't exist in XML Schema's regex syntax. From Appendix D of the XML Schema Part 0: Primer Second Edition (http://www.w3.org/TR/xmlschema-0/#regexAppendix) (long title, I know):
XML Schema's pattern facet uses a regular expression language that supports Unicode. It is fully described in XML Schema Part 2. The language is similar to the regular expression language used in the Perl Programming language, although expressions are matched against entire lexical representations rather than user-scoped lexical representations such as line and paragraph. For this reason, the expression language does not contain the metacharacters ^ and $, although ^ is used to express exception, e.g. [^0-9]x.

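Just remove the ^ and $ characters, and you'll be fine. Outside a character class, XML Schema treats them as ordinary literal characters, which is why "1" fails to match. Also, you might want to allow for preceding/following whitespace, e.g. \s*[A-Za-z\d/\-\.]{0,10}\s* in houseNumberType, since a value might span multiple lines. Note that I dropped the commas and the space from your brackets: commas inside a character class are literal characters, not separators, so add either back if house numbers may really contain them. Also remember to escape the `-' character inside brackets, since it denotes a range of values as in "A-Z"; in your pattern, ",-." is read as the range from `,' to `.'. Escaping `.' is optional inside brackets, where it is already literal, but it is needed elsewhere because there `.' represents any character. I also used "\d" rather than "0-9" in the example I gave. It's simply easier to type, though strictly speaking \d matches any Unicode decimal digit, not just 0-9.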
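
For reference, here is a minimal sketch of what the corrected type could look like. It assumes the commas in the original brackets were meant as separators rather than allowed characters, that the space should stay allowed, and that the xs prefix is bound to the XML Schema namespace on the enclosing schema element. Adjust the character list if those assumptions are wrong.

```xml
<!-- Sketch only: no ^ or $, because the pattern facet always
     matches the whole value. -->
<xs:simpleType name="houseNumberType">
  <xs:restriction base="xs:string">
    <!-- Letters, digits, '/', '-', '.' and space; at most 10 characters.
         Assumption: commas are not wanted, the space is. -->
    <xs:pattern value="[A-Za-z0-9/\-. ]{0,10}" />
  </xs:restriction>
</xs:simpleType>
```

With this pattern, a HouseNumber of 1 should validate, and so should values such as 12/A or 3 bis.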

Krc
11-28-2008, 04:00 AM
Thank you very much!

Krc
12-05-2008, 01:13 AM
Hi,

Are \A and \Z also not allowed in regular expressions in XSDs?

Regards,
Katja

rpgfan3233
12-05-2008, 06:41 AM
No, they are not. XML Schema's regex syntax differs from that of a programming language like Perl, but only because of the environment in which it is used: a pattern is always matched against the entire value, so anchors of any kind are unnecessary. You might like a site that I just found myself while looking for something to refer to instead of the XML Schema recommendation (which can be confusing, although it does define everything extremely well in its grammar). It's a rather handy reference located at http://www.xmlschemareference.com/regularExpression.html. One thing to note: you can use character references (e.g. &#33; or &#x21; for '!') in regular expressions, but write them in full, with both the '&' and the ';'. The bare #x21 notation that appears in the recommendation's grammar is EBNF notation, not regex syntax, so something like [#x21-#x2F] is not a way to match '!' to '/'. Since XML Schema is XML, the full character references are replaced by the XML parser with their corresponding characters upon parsing anyway.
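
To illustrate the full character reference form, here is a small sketch. The type name punctuationType is made up for this example, and the xs prefix is assumed to be bound to the XML Schema namespace.

```xml
<!-- Hypothetical type, for illustration only. -->
<xs:simpleType name="punctuationType">
  <xs:restriction base="xs:string">
    <!-- The XML parser expands the two references in the attribute value
         to '!' and '/' before the schema processor reads the pattern,
         so the processor sees the character class [!-/]. -->
    <xs:pattern value="[&#x21;-&#x2F;]*" />
  </xs:restriction>
</xs:simpleType>
```

Because the expansion happens at the XML level, the regex engine never sees the reference itself, only the resulting character.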

Krc
12-05-2008, 06:55 AM
Thank you for the link.