xmlHelpline Blog
Xml, Xslt, data standards, and anything else...

XSD Design: 1 schema included via 2 different @schemaLocation paths

I've been getting a pretty steady flow of requests for the Xml Schema Lightener and Schema Flattener. Thus far, all good reviews. (Of course, a user may decide it's junk and say nothing. It is, after all, a free utility.) Nevertheless, feedback has been great.

Until now.
I was recently notified of a bug in the Flattener and I've set about trying to fix it. Researching the issue raised an interesting question. I've been working with Xml Schema since about 2001, so new issues don't come up often.

Here is the problem. If you have 2 xsd:includes or xsd:imports that have DIFFERENT @schemaLocation values but point to the SAME schema file, the Flattener won't work. But working out why this was a bug made me think. How are different values of @schemaLocation determined to be duplicate includes? Is my XSLT wrong, or is the XSD created incorrectly?

Some research of interest here, here, here, and here.

Take this simplified physical file structure of xsd includes, given to me by the person who reported the error:

/folderA/common.xsd
/folderB/folderBB/schema1.xsd ... (includes common.xsd)
/folderC/schema2.xsd ............ (includes common.xsd)
/folderD/schema3.xsd ............ (includes schema1.xsd and schema2.xsd)

Now look at their schemaLocation include file paths:

/folderB/folderBB/schema1.xsd
.......... xsd:include schemaLocation="../../folderA/common.xsd"

/folderC/schema2.xsd
.......... xsd:include schemaLocation="../folderA/common.xsd"

/folderD/schema3.xsd
.......... xsd:include schemaLocation="../folderB/folderBB/schema1.xsd"
.......... xsd:include schemaLocation="../folderC/schema2.xsd"

The issue comes down to paths and nested includes. In schema3.xsd, by way of nesting schemas, you get 2 different relative paths to the same physical common.xsd file:

.......... xsd:include schemaLocation="../../folderA/common.xsd"
.......... xsd:include schemaLocation="../folderA/common.xsd"

Of course, I knew that the Xml Schema spec says to toss out duplicate includes. But how does Xml Schema determine duplicates (and avoid name collisions)? Clearly it isn't by schemaLocation uniqueness, which was my error in the Flattener. The schemaLocation is a URI, but the "U" doesn't stand for "Unique". It stands for "Uniform", the meaning of which is less clear to me.
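
To make the mistake concrete, here is a minimal Python sketch of deduplicating by the schemaLocation string as written. It is only an illustration, not the Flattener's actual XSLT, and the variable names are made up for the example.

```python
# The two schemaLocation values that reach common.xsd, as written in
# schema1.xsd and schema2.xsd (variable names are hypothetical).
locations_as_written = [
    "../../folderA/common.xsd",  # from /folderB/folderBB/schema1.xsd
    "../folderA/common.xsd",     # from /folderC/schema2.xsd
]

# Naive approach: treat each distinct string as a distinct schema file.
distinct_by_string = set(locations_as_written)

# Both strings survive, so common.xsd would be pulled in twice.
print(len(distinct_by_string))
```

Comparing the raw strings keeps both entries, even though they name one file.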

The answer is that each relative URI needs to be resolved to a full path, relative to the location of the schema file that contains the include, before comparing paths and determining duplicates. Once the full physical path is built, the locations are indeed the same.
No cutting corners and relying on schemaLocation uniqueness as given.
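
As a rough sketch of that resolution step, the Python below walks the example's nested includes and resolves every schemaLocation against the URI of the file that contains it, using the standard library's urljoin. The ROOT base URI, the INCLUDES table, and the collect_includes function are all assumptions made up for illustration. A real tool would read the includes out of the XSD files themselves.

```python
from urllib.parse import urljoin

# Hypothetical base URI standing in for the root of the folder structure.
ROOT = "file:///schemas/"

# schemaLocation values as written in each including file (from the example).
INCLUDES = {
    ROOT + "folderB/folderBB/schema1.xsd": ["../../folderA/common.xsd"],
    ROOT + "folderC/schema2.xsd": ["../folderA/common.xsd"],
    ROOT + "folderD/schema3.xsd": [
        "../folderB/folderBB/schema1.xsd",
        "../folderC/schema2.xsd",
    ],
}


def collect_includes(doc_uri, seen=None):
    """Return the set of absolute URIs reachable from doc_uri via includes."""
    if seen is None:
        seen = set()
    for location in INCLUDES.get(doc_uri, []):
        # Resolve against the URI of the file that contains the include.
        absolute = urljoin(doc_uri, location)
        if absolute not in seen:
            seen.add(absolute)
            collect_includes(absolute, seen)
    return seen


resolved = collect_includes(ROOT + "folderD/schema3.xsd")
for uri in sorted(resolved):
    print(uri)
```

Starting from schema3.xsd, the set ends up with three entries (schema1.xsd, schema2.xsd, and a single common.xsd), because both relative paths resolve to the same absolute URI. Note that this only normalizes the URI text. It does not account for things like symbolic links or case-insensitive file systems, which could still make two different URIs name one file.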
© Copyright Paul Kiel.