What's the best way to validate an XML file against an XSD file?


I'm generating XML files that need to conform to an XSD that was given to me. What's the best way to validate them?



Source: Tips4all

Comments

  1. The Java runtime library supports validation. Last time I checked this was the Apache Xerces parser under the covers. You should probably use a javax.xml.validation.Validator.

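    import java.io.File;
    import java.io.IOException;
    import java.net.URL;
    import javax.xml.XMLConstants;
    import javax.xml.transform.Source;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.*;
    import org.xml.sax.SAXException;
    ...

    // Schema to validate against (here loaded from a URL).
    URL schemaFile = new URL("http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd");
    // Document to validate.
    Source xmlFile = new StreamSource(new File("web.xml"));
    SchemaFactory schemaFactory = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    // newSchema throws SAXException if the schema itself cannot be parsed.
    Schema schema = schemaFactory.newSchema(schemaFile);
    Validator validator = schema.newValidator();
    try {
        validator.validate(xmlFile);
        System.out.println(xmlFile.getSystemId() + " is valid");
    } catch (SAXException e) {
        // Thrown on the first validation error.
        System.out.println(xmlFile.getSystemId() + " is NOT valid");
        System.out.println("Reason: " + e.getLocalizedMessage());
    } catch (IOException e) {
        // Thrown if the document cannot be read at all.
        System.out.println("Could not read " + xmlFile.getSystemId());
    }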


    The schema factory constant is the string http://www.w3.org/2001/XMLSchema which defines XSDs. The above code validates a WAR deployment descriptor against the URL http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd but you could just as easily validate against a local file.

    You should not use the DOMParser to validate a document (unless your goal is to create a document object model anyway). This will start creating DOM objects as it parses the document - wasteful if you aren't going to use them.
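
As a rough illustration of the local-file variant mentioned above, here is a minimal sketch. The class name `LocalXsdValidation`, the `tempFile` helper, and the one-element `<note>` schema are made up for this example; only the `javax.xml.validation` calls come from the JDK. It writes the schema and two documents to temporary files so that it can run on its own.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class LocalXsdValidation {

    // Hypothetical schema for illustration: a single <note> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:string'/>"
      + "</xs:schema>";

    /** Returns true if xmlFile conforms to the schema stored in xsdFile. */
    static boolean isValid(File xsdFile, File xmlFile) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Load the schema from a local file instead of a URL.
            Schema schema = factory.newSchema(xsdFile);
            schema.newValidator().validate(new StreamSource(xmlFile));
            return true;
        } catch (SAXException e) {
            // Either the schema is broken or the document does not conform.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumed helper: writes content to a temp file that is removed on JVM exit. */
    static File tempFile(String suffix, String content) {
        try {
            File f = File.createTempFile("xsd-demo", suffix);
            f.deleteOnExit();
            Files.write(f.toPath(), content.getBytes(StandardCharsets.UTF_8));
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File xsd = tempFile(".xsd", XSD);
        System.out.println(isValid(xsd, tempFile(".xml", "<note>hello</note>"))); // true
        System.out.println(isValid(xsd, tempFile(".xml", "<memo>hello</memo>"))); // false
    }
}
```

Returning a boolean hides the reason for a failure; in real code you would probably keep the `SAXException` message, as the snippet above does.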

  2. If you are generating XML files programmatically, you may want to look at the XMLBeans library. Using a command-line tool, XMLBeans will automatically generate and package up a set of Java objects based on an XSD. You can then use these objects to build an XML document based on this schema.

    It has built-in support for schema validation, and can convert Java objects to an XML document and vice-versa.

    Castor and JAXB are other Java libraries that serve a similar purpose to XMLBeans.

  3. I found this site to be helpful, too.

    http://www.ibm.com/developerworks/xml/library/x-javaxmlvalidapi.html

    It's the one that actually worked for me with a minimum of fuss.

  4. We build our project using Ant, so we can use the schemavalidate task to check our config files:

    <schemavalidate>
        <fileset dir="${configdir}" includes="**/*.xml" />
    </schemavalidate>


    Now naughty config files will fail our build!

    http://ant.apache.org/manual/Tasks/schemavalidate.html

  5. Are you looking for a tool or a library?

    As far as libraries go, pretty much the de facto standard is Xerces2, which has both C++ and Java versions.

    Be forewarned though, it is a heavyweight solution. But then again, validating XML against XSD files is a rather heavyweight problem.

    As for a tool to do this for you, XMLFox seems to be a decent freeware solution, but not having used it personally I can't say for sure.

  6. I had to validate an XML file against an XSD just one time, so I tried XMLFox. I found it to be very confusing and weird. The help instructions didn't seem to match the interface.

    I ended up using LiquidXML Studio 2008 (v6) which was much easier to use and more immediately familiar (the UI is very similar to Visual Basic 2008 Express, which I use frequently). The drawback: the validation capability is not in the free version, so I had to use the 30 day trial.

  7. One more answer: since you said you need to validate files you are generating (writing), you might want to validate content while you are writing it, instead of first writing and then reading back for validation. You can probably do that with the JDK API for XML validation if you use a SAX-based writer: if so, just link in the validator by calling 'Validator.validate(source, result)', where source comes from your writer and result is where the output needs to go.
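
One way to picture the validate-while-writing idea is the sketch below. It does not use `Validator.validate(source, result)` itself; it swaps in the closely related `ValidatorHandler` from the same JDK package, which sits between a SAX event producer and the serializer. The class name `ValidateWhileWriting`, the `write` method, and the tiny `<note>` schema are assumptions made for illustration, and the "writer" is just a few hand-fired SAX events.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class ValidateWhileWriting {

    // Hypothetical schema for illustration: a single <note> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:string'/>"
      + "</xs:schema>";

    /**
     * Emits <rootName>text</rootName> as SAX events through a validating
     * handler and returns the serialized XML. Throws IllegalStateException
     * if the events do not conform to the schema.
     */
    static String write(String rootName, String text) {
        try {
            ValidatorHandler validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidatorHandler();

            // Identity serializer: turns the validated SAX events back into text.
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler serializer = factory.newTransformerHandler();
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));
            validator.setContentHandler(serializer);

            // The "writer": events pass through the validator on their way out.
            // With no ErrorHandler set, a validation error is thrown as SAXException.
            validator.startDocument();
            validator.startElement("", rootName, rootName, new AttributesImpl());
            validator.characters(text.toCharArray(), 0, text.length());
            validator.endElement("", rootName, rootName);
            validator.endDocument();
            return out.toString();
        } catch (SAXException | TransformerConfigurationException e) {
            throw new IllegalStateException("Generated XML rejected: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write("note", "hello")); // serialized document
        try {
            write("memo", "hello");                 // not declared in the schema
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of this design is that invalid output is caught at the element that breaks the schema, so nothing has to be written out and parsed again.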

    Alternatively, if you use StAX for writing content (or a library that uses or can use StAX), Woodstox (http://woodstox.codehaus.org) can also directly support validation when using XMLStreamWriter. Here's a blog entry showing how that is done:

  8. If you have a Linux machine, you could use the free command-line tool SAXCount. I found it very useful.

    SAXCount -f -s -n my.xml


    It validates against both DTD and XSD, and took about 5 seconds for a 50 MB file.

    In Debian squeeze it is located in the package "libxerces-c-samples".

    The DTD and XSD have to be referenced from within the XML file itself! You can't configure them separately.

