Rules for keyref matching text

This page is intended for working out rules around DITA 1.3 proposal 13079, which should clarify how matching text is determined based on the use of the keyref attribute. The text in 1.2 is unclear, treats elements as overly general groups, and in some cases is illogical or leads to invalid results.

Background:

Suggested updates

The definitions below would replace the two general groupings used today (elements with @href, and elements without @href). Unless otherwise specified in the exception list following this list, all OASIS specializations of an element follow the rules of the ancestor element. The rules below match the rules in DITA 1.2, except where strict interpretation of the DITA 1.2 rules is illogical (such as for navref and image). Cases where the rules break with or extend DITA 1.2 are noted.

Note: when text is specified locally, such as <term keyref="something">Local Term</term> or <xref keyref="something">Local Link Text</xref>, that text is used. The topic on processing key references may benefit from making this clearer (currently it is mentioned only once, as "elements that refer to that key and that are empty may get their effective content..."). Note that topicref is an exception here; it already has its own section below the more general rules at the top of the topic. Topicref is an exception because matching content is merged with existing topicmeta content.

Another general clarification: the specification currently states that linktext may be used as fallback for matching content, but that "Elements within <linktext> that do not match the content model of the key reference directly or after generalization should be skipped." As a result, if "this is <b>important</b>" is pulled into a context that does not allow elements, the result will be "this is ". I suggest that in 1.3 we update this to say that the text equivalent of the element may be used, which would result in "this is important" for such text-only uses.
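The difference between the two behaviors can be sketched in a few lines of Python. This is only an illustration, assuming a text-only target context in which every child element of <linktext> is disallowed; the function names skip_children and text_equivalent are made up for this example and do not come from the specification.

```python
import xml.etree.ElementTree as ET


def skip_children(elem):
    """DITA 1.2 wording: skip elements that are not valid in the target.

    Assumption for this sketch: the target allows text only, so every
    child element is skipped along with its content.
    """
    parts = [elem.text or ""]
    for child in elem:
        # Text that follows a skipped child still belongs to elem.
        parts.append(child.tail or "")
    return "".join(parts)


def text_equivalent(elem):
    """Suggested 1.3 wording: keep the text of elements that are unwrapped."""
    return "".join(elem.itertext())


linktext = ET.fromstring("<linktext>this is <b>important</b></linktext>")

print(repr(skip_children(linktext)))    # 'this is '
print(repr(text_equivalent(linktext)))  # 'this is important'
```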

In each case below, matching something "in the key defining element" actually means "in the topicmeta child of the key defining element".

TBD Items

These items still need to be considered (are ignored, unresolved, or not yet addressed in the write-up below):

Elements with @href

The spec currently groups the elements "author, data, data-about, image, link, lq, navref, publisher, source, topicref, xref, and their specializations" into a common set of rules for elements that specify @href. The spec gives two somewhat conflicting sets of instructions. First at the start:

Followed by:

And finally:

To take <author> as an example, I believe the first instruction would result in a search for <author> in the key defining element. The second instruction would pull in all elements from within the key definition that are valid inside of <author>. Thus they seem to be at odds. Other elements in this list are illogical (navref is empty so should not have any matching content; image contains <alt>, which will result in some contortions to find a valid matching element).

I suggest the following updates:

Element

Updates

topicref

Topicref is unusual in that "matching element content" covers the topicmeta element and all children. The 1.2 spec already declares:

Content from a key reference element and a key-defining element is combined following the rules for combining metadata between maps and other maps and between maps and topics. The @lockmeta attribute is honored when metadata content is combined.

The rules for combining metadata are at: http://docs.oasis-open.org/dita/v1.2/os/spec/archSpec/reconciling-topic-and-map-metadata.html

As an update for DITA 1.3, we should link to that topic. We should also explicitly clarify how to treat elements that are not relevant for topic/map cascading:
navtitle: if specified on the referencing element, use that; otherwise, if specified on the key defining element, use that; otherwise, follow the normal process for determining a navigation title.
linktext: if specified on the referencing element, use that; otherwise, if specified on the key defining element, use that; otherwise, follow the normal process for determining link text.
shortdesc: if specified on the referencing element, use that; otherwise, if specified on the key defining element, use that; otherwise, follow the normal process for determining a short description.
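The precedence for these three elements can be sketched as a small lookup. This is a minimal sketch, not a full implementation of metadata merging: the function name resolve_topicmeta_text, the key name "install", and the file name install.dita are all hypothetical, and a return value of None stands for "follow the normal process".

```python
import xml.etree.ElementTree as ET


def resolve_topicmeta_text(name, referencing, key_definition):
    """Return the text of topicmeta/<name>, preferring the referencing element.

    None means neither element specifies it, so the processor would follow
    its normal process (not modeled here).
    """
    for source in (referencing, key_definition):
        found = source.find(f"topicmeta/{name}")
        if found is not None:
            return "".join(found.itertext())
    return None


# Hypothetical key definition and reference, for illustration only.
keydef = ET.fromstring(
    '<keydef keys="install" href="install.dita">'
    "<topicmeta><navtitle>Installing</navtitle>"
    "<linktext>Installation guide</linktext></topicmeta>"
    "</keydef>"
)
topicref = ET.fromstring(
    '<topicref keyref="install">'
    "<topicmeta><navtitle>Setup</navtitle></topicmeta>"
    "</topicref>"
)

print(resolve_topicmeta_text("navtitle", topicref, keydef))   # Setup
print(resolve_topicmeta_text("linktext", topicref, keydef))   # Installation guide
print(resolve_topicmeta_text("shortdesc", topicref, keydef))  # None
```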

navref

The navref element is empty, so there is no matching element content. This is a change from DITA 1.2, which erroneously grouped navref with "elements that have @href" (it does not currently have @href).

longdescref, longquoteref

These elements have @href, but were not explicitly called out in DITA 1.2. They are both empty, so there is no matching element content.

link

For link text: if specified locally, use that; otherwise if specified in the key defining element, use that; otherwise processors should determine link text using normal rules for determining link text (such as pulling from navtitle or directly from the referenced topic).

For short description: if <desc> is specified locally, use that; otherwise if <shortdesc> is specified in the key defining element, use that; otherwise processors should determine the short description using normal rules for links.

xref

For link text: if text is specified locally, use that; otherwise if specified as <linktext> in the key defining element, use that; otherwise processors should determine link text using normal rules for determining link text (such as pulling from navtitle or directly from the referenced topic).

For short description: if <desc> is specified locally, use that; otherwise if <shortdesc> is specified in the key defining element, use that; otherwise processors should determine the short description using normal rules for xref.

image

The current wording is really not relevant for image, as it contains <alt> or <longdescref> (which can have its own keyref). I suggest a complete revision for image: If <alt> is specified locally, use that; otherwise, if <linktext> is specified in the key defining element, that becomes the alternative text; otherwise no alternative text is used. (Note: not sure if or how to generate the longdescref child, which has no content but could specify another href).

data, data-about, lq, author, publisher, source

For each element "<q>" in the first column: When content is specified inside the element, use that; otherwise, if a <q> element exists inside the key defining element's topicmeta, use the first one; otherwise, if <linktext> is specified inside the key defining element, use that; otherwise, processors may fall back to other defaults for determining link text of a target.
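The fallback chain for these elements could look roughly like the following. It is a sketch under simplifying assumptions: it returns only the text equivalent of whatever it finds, the function name resolve_matching_text and the key name "writer" are hypothetical, and None stands for "the processor may fall back to other defaults".

```python
import xml.etree.ElementTree as ET


def text_of(elem):
    return "".join(elem.itertext())


def resolve_matching_text(ref, key_definition):
    """Local content, then the first same-named element, then linktext."""
    # 1. Content specified inside the referencing element wins.
    if len(ref) > 0 or (ref.text or "").strip():
        return text_of(ref)
    meta = key_definition.find("topicmeta")
    if meta is not None:
        # 2. First element of the same name in topicmeta; 3. linktext.
        for candidate in (meta.find(ref.tag), meta.find("linktext")):
            if candidate is not None:
                return text_of(candidate)
    # 4. Nothing found: other processor defaults apply (not modeled).
    return None


# Hypothetical key definition, for illustration only.
keydef = ET.fromstring(
    '<keydef keys="writer"><topicmeta>'
    "<linktext>The docs team</linktext>"
    "<author>Jane Doe</author><author>John Roe</author>"
    "</topicmeta></keydef>"
)

empty_author = ET.fromstring('<author keyref="writer"/>')
local_author = ET.fromstring('<author keyref="writer">Local Author</author>')
empty_publisher = ET.fromstring('<publisher keyref="writer"/>')

print(resolve_matching_text(empty_author, keydef))     # Jane Doe
print(resolve_matching_text(local_author, keydef))     # Local Author
print(resolve_matching_text(empty_publisher, keydef))  # The docs team
```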

Note also the typos in the current topic: "in valid context" should be "valid in context"; and elements that "specify an @href attribute" should, I believe, actually say elements that "define an @href attribute" (that is, the rules apply here to every <xref> whether or not a specific <xref> actually specifies @href).

Elements that do not define @href

Current wording begins with the same text shown above:

Followed by:

Interesting to note: results may vary below depending on whether filtering is done before or after the matching text resolution. When text is taken from 'the first keyword', filtering may remove the first; it could also remove all keywords, leaving behind linktext.
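The ordering issue can be made concrete with a small sketch. Everything here is an assumption made for illustration: filtering is reduced to excluding a single @audience value, and the function names and sample text are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def first_keyword_or_linktext(topicmeta):
    """First keyword or term in <keywords>, else <linktext>, else None."""
    for child in topicmeta.iterfind("keywords/*"):
        if child.tag in ("keyword", "term"):
            return "".join(child.itertext())
    linktext = topicmeta.find("linktext")
    return None if linktext is None else "".join(linktext.itertext())


def filter_out(root, audience):
    """Return a copy of root without elements whose @audience is excluded."""
    filtered = copy.deepcopy(root)
    for parent in list(filtered.iter()):
        for child in list(parent):
            if child.get("audience") == audience:
                parent.remove(child)
    return filtered


# Hypothetical topicmeta content, for illustration only.
some_filtered = ET.fromstring(
    "<topicmeta><linktext>Product</linktext><keywords>"
    '<keyword audience="internal">Codename X</keyword>'
    "<keyword>Product Y</keyword>"
    "</keywords></topicmeta>"
)
all_filtered = ET.fromstring(
    "<topicmeta><linktext>Product</linktext><keywords>"
    '<keyword audience="internal">Codename X</keyword>'
    "</keywords></topicmeta>"
)

# Resolving before filtering picks up the keyword that filtering would remove.
print(first_keyword_or_linktext(some_filtered))                          # Codename X
# Filtering first changes which keyword counts as "the first".
print(first_keyword_or_linktext(filter_out(some_filtered, "internal")))  # Product Y
# Filtering can also remove every keyword, leaving only linktext.
print(first_keyword_or_linktext(filter_out(all_filtered, "internal")))   # Product
```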

Element

Updates

keyword, term

Based on the spec today: If content is specified locally, use that; otherwise, text is taken from the first keyword or term found in <keywords> in the key defining element; otherwise, it may be taken from linktext in the key defining element.

Question: If <keywords> inside the key "goofy" is specified as <keywords><keyword>One</keyword><term>Two</term></keywords> -- I believe that the specification today says that both <keyword keyref="goofy"/> and <term keyref="goofy"/> would have to take content from the first keyword, meaning the matching text is "One". Calling this out because I think this has been a point of confusion. Should <term> look first for a <term> element? If the order is reversed, should <keyword> look first for <keyword>?
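The two readings in this question can be compared side by side. This is a sketch only; the function names are hypothetical, and the second function models the alternative raised in the question rather than anything the specification currently states.

```python
import xml.etree.ElementTree as ET


def first_keyword_or_term(keywords):
    """Current reading: the first <keyword> or <term>, whichever comes first."""
    for child in keywords:
        if child.tag in ("keyword", "term"):
            return "".join(child.itertext())
    return None


def prefer_same_name(keywords, name):
    """Alternative reading: prefer an element with the referencing name."""
    same = keywords.find(name)
    if same is not None:
        return "".join(same.itertext())
    return first_keyword_or_term(keywords)


# The <keywords> content of the key "goofy" from the question above.
goofy = ET.fromstring("<keywords><keyword>One</keyword><term>Two</term></keywords>")

# Current reading: the referencing element name does not matter.
print(first_keyword_or_term(goofy))        # One
# Alternative reading: <term keyref="goofy"/> would resolve differently.
print(prefer_same_name(goofy, "keyword"))  # One
print(prefer_same_name(goofy, "term"))     # Two
```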

dt, ph, indexterm, index-base, cite

If content is specified locally, use that; otherwise, text is taken from the first keyword or term found in <keywords> in the key defining element; otherwise, it may be taken from linktext in the key defining element.

indexterm

Same as previous: If content is specified locally, use that; otherwise, text is taken from the first keyword or term found in <keywords> in the key defining element; otherwise, it may be taken from linktext in the key defining element.

Question: The others look to a keyword or term inside of <keywords>. Given that keywords also specifies <indexterm>, should <indexterm> with keyref be allowed to pull from that?

indextermref

The indextermref element is specified as "not completely defined, and reserved for future use". So, we should treat indextermref/@keyref the same way.

Exceptions:

abbreviated-form

This specialization of term already has clear rules in 1.2 for how to resolve keyref values: http://docs.oasis-open.org/dita/v1.2/os/spec/langref/abbreviated-form.html

KeyrefMatchingText (last edited 2012-09-18 15:56:49 by robander)