ODF 1.2 Document Processing Model
Dennis E. Hamilton
Reflect in the ODF 1.2 specification enough of a document model, its annotation and decoration, and the manipulation and processing of it that the semantics for an ODF 1.2 document structure can be expressed. Define the processing models and related decorations and annotations of the document model that are directed to particular processing cases, such as document creation, visual presentation, final-form media presentation, interactive use, document manipulation, and interactive manipulation.
These Approach Notes will be removed as the proposal is expanded and completed.
- There is a document model (sometimes thought of as a document information model but for ODF the model has structure and levels of abstraction, as between an abstract text stream, a frame, its imaging on a page, etc.).
- The value of the document model is that it supports coherent integration and explanation of the features of ODF document structures and their relationship to the document as perceived and subject to creation and modification as a digital entity. The following notes are illustrative of that value and are thought-experiments toward arriving at a document processing model for ODF that is serviceable enough for shared understanding and consistent explanation in the specification.
- The document model is not to be confused with a Document Object Model (DOM) that is generally offered as an operational abstraction of a document structure (e.g., of an XML document or an HTML document), not of the semantics of a document structure.
- The document model is a conceptual notion that a processor's behavior would be consistent with the existence of, but there is no requirement that the document model be manifest in some observable way other than through the consequences of ODF document semantics being realized in the behavior of a processor. Although processors are typically developed to have internal representations that could be viewed as private models organized for processing, the document model, as it is conceived here, does not speak to such use of internal representations and other engineering decisions by the designers of a processor. Although the document processing model might suggest approaches for internal models, there is no such requirement and the document model is not intended to be sufficient for an implementation, not even a reference implementation.
- The document model is an explanatory device and it might be dealt with informally in the narrative of the ODF specification. It is nevertheless important, even if not fully specified, for understanding of the intended interpretation of the ODF document structure, especially the XML elements and attributes that figure prominently in representing documents of a particular type.
- The document model also supports the identification of categories of processing that are recognized by the ODF specification and to which features of ODF documents may be related (or explicitly not related).
- The other way that the document model bridges the ODF document structure to the useful processing of the ODF document is by being decorated and annotated with processing-relevant information. These decorations and annotations, and the modeled entities themselves, are all derived through interpretation of the XML elements and attributes of the ODF document structure, along with those supporting components appealed to through the XML elements and attributes. In various processing scenarios, these decorations and annotations are also subject to manipulation, as is the document-model entity with which they are associated. One can consider that the ultimate default source of decorations is the ODF 1.2 specification itself. There may also be default decorations effectively provided by a given processor implementation.
- Examples of document-model entities: tables, images, text (frames), formulas, fields, paragraphs, sections, etc. Individual instances of these entities might or might not be uniquely associated with specific XML elements in the document structure.
- The classes of model entities form domains: There is a domain of tables, for example.
Part of the processing (and representation) decoration of domains consists of identifiers for members of a given domain. For example, the domain of tables has a domain of table names. Continuing the example, the <table:table> element corresponds to a table document entity. The value of the attribute table:name is a unique string for an identifier in the table-name domain. The table corresponding to that table-name value is unique even if in all other respects it is indistinguishable from the table associated with a different table-name value.
The importance of the table-name domain is that its members constitute unique identifiers for tables, but there is no relationship to identifier domains for other classes of document entity. That is, the table:name attribute value of a <table:table> element needs to be unique only within the table-name domain.
Note that other occurrences of table:name as attributes of elements other than <table:table> may or may not have their values be in the table-name domain. When they do, they are usually used to identify the table having that table-name and introduced by a <table:table> element elsewhere in the document structure. This use of domains helps sharpen the distinction between uses of attributes having the same name and the same value with different elements. (One could have systematically distinguished between table:name and table:nameref or something, but that wasn't done.)
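The uniqueness rule for the table-name domain can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a processor requirement: the function name build_table_name_domain, the wrapper element root, and the sample fragment are hypothetical, and the sample treats the table:name on <table:named-range> as lying outside the table-name domain purely for the sake of the example.

```python
import xml.etree.ElementTree as ET

# ODF table namespace; the qualified names below use ElementTree's {uri}local form.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TABLE_TABLE = f"{{{TABLE_NS}}}table"
TABLE_NAME = f"{{{TABLE_NS}}}name"


def build_table_name_domain(root):
    """Map each table:name value found on a <table:table> element to that element.

    Only <table:table> elements contribute members to the domain; the same
    attribute on other elements is ignored here (hypothetical simplification).
    """
    domain = {}
    for elem in root.iter(TABLE_TABLE):
        name = elem.get(TABLE_NAME)
        if name is None:
            continue  # an unnamed table contributes no identifier
        if name in domain:
            raise ValueError(f"duplicate table name: {name!r}")
        domain[name] = elem
    return domain


# Hypothetical fragment: "Sales" appears twice, but only once on <table:table>.
SAMPLE = f"""
<root xmlns:table="{TABLE_NS}">
  <table:table table:name="Sales"/>
  <table:table table:name="Costs"/>
  <table:named-range table:name="Sales"/>
</root>
"""

root = ET.fromstring(SAMPLE)
domain = build_table_name_domain(root)
print(sorted(domain))
```

In this sketch the string "Sales" on the named range does not collide with the table named "Sales", because only <table:table> elements populate the domain. A second <table:table> carrying the same name would raise an error.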
- A form of processing decoration and annotation with regard to the table-name domain concerns conditions such as the following, each of which might or might not apply to that domain:
- Whether the table:name attribute values can be arbitrarily chosen by a processor when generating a table and when representing the table via a <table:table> element and other material in the document structure.
- Whether table:name attribute values must be preserved when a document is manipulated and the table represented by the corresponding <table:table> element remains in the document structure, or whether table-name associations can be changed arbitrarily so long as the correct unique identification is maintained.
- Whether table-name values must be human readable.
- Whether table-name values can be chosen arbitrarily by agents interacting with a processor, and whether those agents are permitted to change the table-name associated with a table.
- Whether internationalization and localization considerations apply to the expression of table-name values via attribute-value strings and their presentation at an interface where the name may be observed and "read" in the language of its origination.
- This is simply meant to indicate how the use of a document processing model can help us to cover the bases and have a way to see what the bases are. (I am not suggesting that all of these cases apply to table names, and if some of them did, it would be better to use a different property for the displayable/choosable identification of a table.)
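One way to picture such a decoration is as a small record with one entry per condition, where an unset entry marks a question the specification has not yet answered. The sketch below is hypothetical throughout: the class IdentifierDomainDecoration, its field names, the helper open_questions, and the single value set in the example are illustrative assumptions and make no claim about what ODF 1.2 requires of table names.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class IdentifierDomainDecoration:
    """Conditions that might or might not apply to an identifier domain.

    None means "not yet answered"; True/False record an answer.
    """
    processor_may_choose: Optional[bool] = None
    must_preserve_on_edit: Optional[bool] = None
    human_readable: Optional[bool] = None
    agent_may_choose_and_change: Optional[bool] = None
    i18n_applies: Optional[bool] = None


def open_questions(decoration):
    """Return the names of the conditions that still have no answer."""
    return [f.name for f in fields(decoration)
            if getattr(decoration, f.name) is None]


# Hypothetical answer for illustration only, not a statement about ODF 1.2.
table_names = IdentifierDomainDecoration(must_preserve_on_edit=True)
print(open_questions(table_names))
```

Listing the unanswered entries for each identifier domain is one simple way to see which bases remain uncovered.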
- Another kind of processing-related decoration is the association of parameters that provide recommendations for default table cell decorations when a new row or column is introduced into a table.
- Construction of a document model, document entities, and document entity domains is a useful way to reconcile the homonym cases of attributes and their values in ODF 1.2. The association of decorations and annotations (special processing-model properties if you like) is valuable in accounting for the advice/requirements that may be incorporated in a document that are intended to influence its modification under appropriate conditions (and with a capable processor).
- I propose this sort of approach, even if articulated in a working agreement within the ODF-TC and not formally documented, as a valuable tool for identifying questions that need to be answered in the ODF specification and then ensuring that the specification does indeed provide answers to those questions.
No need for elaborate formal use cases. But give a few sentences that explain what this feature is and does from the perspective of the consumer of this feature. The consumer might be the end user of the application. It might be an application developer. It might be an archivist. It could be almost anyone. But please give a few words on why this feature should interest us.
There is often more than one way to accomplish a task. For your use case, what other alternatives did you consider, and why did you find them inadequate?
Requested changes to the ODF Standard
Text changes/additions (please state section numbers):
Enter specific text edits here
Enter specific Relax NG schema edits here
Does this proposal add any mandatory features or behaviors to ODF documents or applications?
If we make this change, what will be the impact on existing ODF processors?
Will this feature require review by the Accessibility Subcommittee? For example, does it add structure to the document that can be navigated, store a user-viewable string or associate a label with content?
Workflow (to be filled in by TC Chairs)
Date Proposal initially made:
Dates Proposal discussed on TC calls:
Date vote is requested:
Date vote is held:
Results of vote: