There is a new avenue for participation in the ODF effort at OASIS: ODF Implementation, interoperability and conformance, which I commend.
Conventionally, people speak of syntactical conformance and semantic conformance, where the first is easy and the second is hard. In fact, because computers can only deal in symbols, the second is impossible. So the issue for automated conformance testing becomes “how can we reflect the semantic operations into syntactical artifacts: into symbols we can investigate?”
So the semantic conformance problem then resolves into just another validation issue. And we have lots of nice schema languages, notably Schematron, which can help out there. (And using general-purpose languages at a pinch, no worries!)
To put it another way, it is an issue of data capture.
For ODF, I would recommend they adopt a strategy of progressive but complete verification.
For ODF import and export, this is easy: have a good RELAX NG schema (make it quite forgiving), use NVDL and DSRL if needed, then use Schematron phases to allow various levels of validity to be detected. The trouble with the monolithic valid/invalid distinction is that there may easily be invalidities in things you don’t care about. An implementation of a word processor may have problems in its support for spreadsheets, but that should be a minor issue, not flagged as a showstopper. Schematron’s phase mechanism groups patterns of assertions so that you can have a much more useful chunked view of the strengths and weaknesses of a system.
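To make the chunked view concrete, here is a minimal sketch in Python rather than Schematron itself. It only imitates the idea of phases: named groups of checks that are reported separately. The element names (`document`, `p`, `cell`), the `type` attribute, the phase names and the helper functions are all invented for illustration; real ODF uses namespaced elements and a real setup would use Schematron patterns and phases.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document. Real ODF markup is namespaced and richer.
DOC = """<document>
  <text><p>Hello</p></text>
  <spreadsheet><row><cell type="float">abc</cell></row></spreadsheet>
</document>"""


def check_paragraphs(root):
    """Assumed rule: every paragraph contains some text."""
    return ["empty paragraph" for p in root.iter("p")
            if not (p.text or "").strip()]


def check_float_cells(root):
    """Assumed rule: a cell declared as float holds a parseable number."""
    errors = []
    for cell in root.iter("cell"):
        if cell.get("type") == "float":
            try:
                float(cell.text or "")
            except ValueError:
                errors.append(f"non-numeric float cell: {cell.text!r}")
    return errors


# Each "phase" is a named group of checks, loosely mirroring how a
# Schematron phase activates a group of patterns.
PHASES = {
    "word-processing": [check_paragraphs],
    "spreadsheet": [check_float_cells],
}


def validate(xml_text, phase):
    """Run only the checks that belong to one phase."""
    root = ET.fromstring(xml_text)
    return [msg for check in PHASES[phase] for msg in check(root)]


def report(xml_text):
    """Return the failures per phase instead of one valid/invalid verdict."""
    return {phase: validate(xml_text, phase) for phase in PHASES}


if __name__ == "__main__":
    for phase, errors in report(DOC).items():
        print(phase, "OK" if not errors else errors)
```

With this sample, the word-processing phase comes back clean while the spreadsheet phase reports one failure, which is the kind of result a word processor vendor could reasonably live with.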
But this leaves the issue of screen display. How can that be tested? Given my characterization of the issue as being one of data capture, the answer is that ODF needs to specify a page dump format, which can then be tested with automated tests. What would this format look like? Think PDF in XML: SVG Tiny may be good enough, or anything else where you can get the page position of each character (or string) and graphic on a page.
For example, let us suppose we want to test a table implementation. Now we can use RELAX NG to say that there should be tables, rows, cells etc. And we can use Schematron to say that various numeric constraints should hold. And that gets us a long way into validating that good ODF is being generated and accepted. And we can have tests for whether bad ODF is accepted, and so on.
But what about the graphical component? Having a simple page object dump allows testing that, for example, if you have a string a and a string b in two adjacent cells of the same row in a table (in the same script and of the same metrics, etc), then the (X,Y) co-ordinates of their base points conform to (Xa < Xb) and (Ya ~= Yb).
And you can use Schematron for that kind of validation. The advantage of having this built into the spec is that then the ODF spec can use mathematical properties and constraints rather than just natural language. The disadvantage of this approach is that it imposes a burden on the implementer, in particular if the graphic library cannot be trapped conveniently to provide the information; however, it certainly should be possible to generate this information from the PDF (in a reverse of the Magellan software!) especially if using a nice PDF subset like PDF/A.
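As a rough illustration of that check, here is a sketch in Python over an invented page dump. No such dump format exists in ODF today: the `page` and `string` elements, the `table`, `row`, `col`, `x` and `y` attributes, and the tolerance value are all assumptions made up for this example. It also assumes a left-to-right script, which is what makes Xa < Xb the right test.

```python
import xml.etree.ElementTree as ET

# Hypothetical page dump: one element per string, giving the base point
# of the string in page units. The format is invented for illustration.
DUMP = """<page>
  <string table="t1" row="1" col="1" x="72.0" y="700.0">a</string>
  <string table="t1" row="1" col="2" x="200.0" y="700.2">b</string>
</page>"""

Y_TOLERANCE = 0.5  # assumed meaning of "~=": baselines within half a unit


def row_layout_errors(dump_xml, tolerance=Y_TOLERANCE):
    """Check adjacent cells in each table row: Xa < Xb and Ya ~= Yb."""
    root = ET.fromstring(dump_xml)

    # Group the dumped strings by (table, row).
    rows = {}
    for s in root.iter("string"):
        key = (s.get("table"), s.get("row"))
        rows.setdefault(key, []).append(
            (int(s.get("col")), float(s.get("x")), float(s.get("y")))
        )

    errors = []
    for (table, row), cells in rows.items():
        cells.sort(key=lambda c: c[0])  # order by column number
        # Compare each cell with its right-hand neighbour.
        for (ca, xa, ya), (cb, xb, yb) in zip(cells, cells[1:]):
            if not xa < xb:
                errors.append(
                    f"{table} row {row}: col {ca} is not left of col {cb}")
            if abs(ya - yb) > tolerance:
                errors.append(
                    f"{table} row {row}: cols {ca} and {cb} baselines differ")
    return errors


if __name__ == "__main__":
    print(row_layout_errors(DUMP) or "layout constraints hold")
```

The same two constraints could be written as Schematron assertions over the dump; the Python version is only there to show how little machinery the test needs once the positions are available as data.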