How to Validate an ODF Document

There are four steps to validating an ODF document:

  1. Extracting the XML from the ODF container file
  2. Determining what version of ODF your document uses
  3. Retrieving the schemas associated with that version of ODF
  4. Executing the validation tool

We take each one of these topics in turn.

Extracting the XML

The first thing to note is that a typical ODF file, with an .odt, .ods or .odp extension, is not a pure XML file. It is a container file in ZIP format, holding several XML files along with associated binary images and other resources. So first you need to extract the XML from the container file. The method will vary according to your operating system and tools, but a typical way is to rename the file to have a .zip extension and then unzip it with your default zip utility. In most cases you will end up with the following XML files:

  1. content.xml
  2. styles.xml
  3. meta.xml
  4. settings.xml
  5. META-INF/manifest.xml
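
If you would rather script the extraction than rename and unzip by hand, the step can be sketched in Python as below. This is a minimal sketch, not part of any ODF tooling: the function name extract_xml and its arguments are made up for illustration, and it simply treats every member whose name ends in .xml as an XML file.

```python
import zipfile
from pathlib import Path


def extract_xml(odf_path, out_dir):
    """Extract only the XML members of an ODF package into out_dir.

    extract_xml is a hypothetical helper name used for this example.
    """
    out_dir = Path(out_dir)
    extracted = []
    # An ODF file is an ordinary ZIP archive, so it can be opened
    # directly without renaming it to .zip first.
    with zipfile.ZipFile(odf_path) as package:
        for name in package.namelist():
            # Skip images, the mimetype file and other non-XML resources.
            if name.lower().endswith(".xml"):
                package.extract(name, out_dir)
                extracted.append(name)
    return extracted
```

Calling extract_xml("report.odt", "report_xml") would return the names of the XML members it wrote under report_xml, keeping the META-INF subdirectory intact.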

Checking the Version Attribute

Next you need to determine what version of ODF the document uses. This can be found by inspecting the office:version attribute on the root element of any of the XML files. Expected values are "1.0" or "1.1".
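
The version check can also be done programmatically. The sketch below reads the office:version attribute from the root element using Python's standard XML parser. The function name odf_version is an assumption made for this example, and the code returns None when the attribute is absent rather than guessing a version.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the "office" prefix in ODF documents.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odf_version(xml_path):
    """Return the office:version value of the root element, or None.

    odf_version is a hypothetical helper name used for this example.
    """
    root = ET.parse(xml_path).getroot()
    # ElementTree addresses namespaced attributes as "{namespace}local".
    return root.get("{%s}version" % OFFICE_NS)
```

For example, odf_version("content.xml") would return "1.1" for a document whose root element carries office:version="1.1".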

A quick survey of common ODF authoring applications indicates the following defaults:

Retrieving the Schemas

You can retrieve the schemas from the ODF TC's homepage. You will see three schema files listed for each published ODF version:

  1. The manifest schema, for validating the manifest.xml file
  2. The ODF schema, for validating the other XML files
  3. A "strict" version of the ODF schema, as described in Appendix A of the ODF standard

For most uses you will want to download the manifest schema and the ODF schema.
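
Once the schemas are downloaded, each extracted XML file has to be paired with the right one: the manifest schema for manifest.xml, the ODF schema for everything else, both matching the document's version. The sketch below shows one way to encode that choice. The SCHEMAS table and the function schema_for are hypothetical, and the file names in the table are assumed local names; substitute whatever names your downloaded copies actually have.

```python
from pathlib import PurePosixPath

# Assumed local file names for the downloaded schemas; adjust them to
# match the files you actually saved.
SCHEMAS = {
    "1.0": {
        "odf": "OpenDocument-schema-v1.0-os.rng",
        "manifest": "OpenDocument-manifest-schema-v1.0-os.rng",
    },
    "1.1": {
        "odf": "OpenDocument-schema-v1.1.rng",
        "manifest": "OpenDocument-manifest-schema-v1.1.rng",
    },
}


def schema_for(member_name, version):
    """Pick the schema file for one XML member of an ODF package."""
    # Only manifest.xml uses the manifest schema; all other XML files
    # are validated against the ODF schema.
    is_manifest = PurePosixPath(member_name).name == "manifest.xml"
    kind = "manifest" if is_manifest else "odf"
    try:
        return SCHEMAS[version][kind]
    except KeyError:
        raise ValueError("no schema registered for ODF version %r" % version)
```

Raising an error for an unknown version, instead of falling back to a default schema, avoids silently validating a document against the wrong version.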

Running the Validation

This step will vary according to what validation tool you are using. We'll give a few examples using some common validation tools. We'll also explain some known bugs and workarounds.

Jing

Jing is a Relax NG validator written by James Clark, co-author of the Relax NG standard. If you download it and install it according to the instructions on his web site, you can validate an ODF XML file with a command line like this:

java -jar c:/jing/bin/jing.jar -i OpenDocument-schema-v1.0-os.rng content.xml 

Note in particular the use of the "-i" command line flag for jing. This is necessary in order to disable the ID/IDREF checking from the Relax NG DTD Compatibility specification that jing enforces by default (See the "ODF Validation for Dummies" page, which explains why this is needed and why it's okay from a specification viewpoint).
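
If you need to validate several files, the jing command line above can be wrapped in a small script. The sketch below builds the same command, including the "-i" flag, and runs it with Python's subprocess module. The function names jing_command and validate are made up for this example, and the code assumes that java is on your PATH and that jing exits with a non-zero status when it reports errors; check this against your jing version.

```python
import subprocess


def jing_command(jing_jar, schema, xml_file):
    """Build the jing command line, with -i to disable ID/IDREF checking."""
    return ["java", "-jar", jing_jar, "-i", schema, xml_file]


def validate(jing_jar, schema, xml_file):
    """Run jing on one file and return (ok, messages).

    Assumes java is on the PATH and that jing signals validation
    errors through a non-zero exit status.
    """
    result = subprocess.run(
        jing_command(jing_jar, schema, xml_file),
        capture_output=True,
        text=True,
    )
    # Collect both streams, since error text may appear on either one.
    return result.returncode == 0, result.stdout + result.stderr
```

A call such as validate("c:/jing/bin/jing.jar", "OpenDocument-schema-v1.0-os.rng", "content.xml") mirrors the command line shown earlier.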

Frequently Asked Questions

Question: I get this error message whenever I try to validate an ODF document. What is wrong? "conflicting ID-types for attribute "targetElement" from namespace "urn:oasis:names:tc:opendocument:xmlns:smil-compatible:1.0" of element "command" from namespace "urn:oasis:names:tc:opendocument:xmlns:animation:1.0""

Answer: This error appears when you validate with jing without the "-i" flag, which disables the Relax NG DTD Compatibility checking. Try again with the -i command line flag.


Answer: It sounds like you have an ODF 1.1 document, but are trying to validate it against the ODF 1.0 schema. Check the office:version attribute on the document's root element. If it says "1.1" then you need to be using the ODF 1.1 schema.

References

How to Validate an OASIS OpenDocument file (KOffice wiki)

ODF Validation for Dummies


How_to_Validate_an_ODF_document (last edited 2009-08-12 18:04:23 by localhost)