UNDER CONSTRUCTION

LATEST WORKING DRAFT IS in SVN: http://tools.oasis-open.org/version-control/svn/xliff/trunk/xliff-20/

XLIFF Inline Markup DRAFT (==> this wiki page is superseded by the document in SVN. See above)

Overview

The XLIFF inline markup is used to represent two types of data in XLIFF content:

Codes

A code is part of content representing non-linguistic data embedded within translatable content. For example:

A code can come from one of two origins:

A code can take one of two forms of placement:

In some formats, spans may overlap.

The content of an extraction unit can be split into smaller parts, some translatable (segments) and some not translatable (ignorable). For example:

Segment one. Segment two. Segment three.

can be split into:

[Segment one. ][Segment two.][ ][Segment three.]

Given that such splitting is arbitrary and usually independent of the codes the content may hold, the opening and closing codes of a span can end up in different parts of the content.
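The effect of splitting on span codes can be sketched in Python. This is a minimal illustration, not part of the draft: the `Open` and `Close` token classes and the `render_part` function are hypothetical names, spans are assumed not to overlap, and the end marker is written with only a `rid` attribute pointing at the start marker. A span that opens and closes in the same part is written as a paired `<pc>`; a span whose two ends fall in different parts is written with standalone `<sc>` and `<ec>` markers.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Open:
    """Hypothetical token: start of a span code with the given id."""
    id: str


@dataclass(frozen=True)
class Close:
    """Hypothetical token: end of the span code opened with the given id."""
    id: str


def render_part(tokens):
    """Render one part (segment or ignorable) of an extraction unit.

    Assumes spans do not overlap. A span fully inside this part becomes
    <pc>...</pc>; otherwise its ends become standalone <sc/> or <ec/>.
    """
    opened = {t.id for t in tokens if isinstance(t, Open)}
    closed = {t.id for t in tokens if isinstance(t, Close)}
    paired = opened & closed  # spans that start and end in this part

    out = []
    for t in tokens:
        if isinstance(t, Open):
            out.append(f"<pc id='{t.id}'>" if t.id in paired else f"<sc id='{t.id}'/>")
        elif isinstance(t, Close):
            out.append("</pc>" if t.id in paired else f"<ec rid='{t.id}'/>")
        else:
            out.append(escape(t))  # plain text: escape &, < and >
    return "".join(out)


# "Segment one. Segment two. Segment three." with span 1 crossing a
# segment boundary and span 2 contained in the last segment.
parts = [
    ["Segment ", Open("1"), "one. "],
    ["Segment two.", Close("1")],
    [" "],
    ["Segment ", Open("2"), "three", Close("2"), "."],
]
rendered = [render_part(p) for p in parts]
```

With this input, the first two parts carry `<sc id='1'/>` and `<ec rid='1'/>` respectively, while the last part can use `<pc id='2'>three</pc>`.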

This leads to the need for three possible types of markers:

Summary Table of the Representations:

| Origin | Code | Span type: standalone start representation | Span type: standalone end representation | Span type: paired representation | Placeholder type: standalone representation |
|---|---|---|---|---|---|
| Original codes | Original code with native notation | <sc id='1'>data</sc> | <ec id='2' rid='1'>data</ec> | not possible | <ph id='1'>data</ph> |
| Original codes | Original code without native notation | <sc id='1'/> | <ec id='2'/> | <pc id='1'>text</pc> | <ph id='1'/> |
| Added codes | Added code with native notation | <sc id='3'>data</sc> | <ec id='4' rid='3'>data</ec> | not possible | <ph id='2'>data</ph> |
| Added codes | Added code without native notation, but a link to the original code | <sc id='3' rel='1'/> | <ec id='4' rid='3' rel='2'/> | <pc id='2' rel='1'>text</pc> | <ph id='2' rel='1'/> |
| Added codes | Added code without native data, but a standard type | <sc id='3' type='bold'/> | <ec id='4' rid='3' type='bold'/> | <pc id='2' type='bold'>text</pc> | <ph id='2' type='lb'/> |
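As a rough illustration of how the `rid` of an `<ec>` ties it to the `id` of an `<sc>`, the following Python sketch checks a piece of inline content for end markers whose `rid` does not match any earlier start marker. The function name `unresolved_end_codes` and the `<content>` wrapper element are assumptions made for this example only. End markers without a `rid` are skipped, since the representations above include one.

```python
import xml.etree.ElementTree as ET


def unresolved_end_codes(fragment):
    """Return the rid values of <ec> elements that match no earlier <sc> id.

    The fragment is wrapped in a hypothetical <content> element so that
    mixed text and inline codes parse as one well-formed document.
    """
    root = ET.fromstring(f"<content>{fragment}</content>")
    start_ids = set()
    unresolved = []
    for el in root.iter():  # document order
        if el.tag == "sc":
            start_ids.add(el.get("id"))
        elif el.tag == "ec":
            rid = el.get("rid")
            if rid is not None and rid not in start_ids:
                unresolved.append(rid)
    return unresolved


ok = unresolved_end_codes("<sc id='3' type='bold'/>text<ec id='4' rid='3' type='bold'/>")
bad = unresolved_end_codes("text<ec id='4' rid='9'/>")
```

Note that a start marker and its end marker may sit in different segments, so a check like this only makes sense over the content of the whole extraction unit.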

Editing Hint

A code can have several editing hints:

Common Attributes

| Attribute purpose | Span type: standalone start representation (<sc>) | Span type: standalone end representation (<ec>) | Span type: paired representation (<pc>) | Placeholder type: standalone representation (<ph>) |
|---|---|---|---|---|
| Display-friendly representation of an inline code for informational purpose | disp? | disp? | disp? and dispEnd? | disp? |
| Text-equivalent representation of an inline code for linguistic processes | equiv? | equiv? | equiv? and equivEnd? | equiv? |
| Editing hints | ed? | ed? | ed? | ed? |
| Identifier of the code. | id | - | id | id |
| Reference to closing code | - | rid | - | - |
| Type of native code represented | type? | type? | type? | type? |
| Reference to the code being replicated | rel? | rel? | rel? | rel? |
| Reference to the element where the native data is stored | nid? | nid? | nid? | nid? |
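The roles of `disp`/`dispEnd` and `equiv`/`equivEnd` can be illustrated with a short Python sketch that flattens inline content to plain text, replacing each code with the chosen attribute value. The function names and the `<content>` wrapper are hypothetical, and treating a missing attribute as an empty string is an assumption of this example, not something the draft states.

```python
import xml.etree.ElementTree as ET


def _render(el, attr):
    """Concatenate text, replacing inline codes by their `attr` value."""
    parts = [el.text or ""]
    for child in el:
        if child.tag == "pc":
            # Paired code: attr before the enclosed content, attr + "End" after.
            parts.append(child.get(attr, ""))
            parts.append(_render(child, attr))
            parts.append(child.get(attr + "End", ""))
        else:
            # sc, ec, ph: any element content is native data, so it is skipped.
            parts.append(child.get(attr, ""))
        parts.append(child.tail or "")
    return "".join(parts)


def flatten(fragment, attr="equiv"):
    """Return plain text for a fragment, using `equiv` or `disp` values."""
    root = ET.fromstring(f"<content>{fragment}</content>")
    return _render(root, attr)


sample = (
    "Click <ph id='1' equiv='OK' disp='[button]'/> to "
    "<pc id='2' disp='[b]' dispEnd='[/b]'>continue</pc>."
)
for_linguistic_tools = flatten(sample, "equiv")
for_display = flatten(sample, "disp")
```

Here the `equiv` view reads "Click OK to continue." and the `disp` view reads "Click [button] to [b]continue[/b]."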

Annotations

TODO

Text that cannot be represented directly in XML

Characters not allowed in XML are represented with the cp element:

<cp hex="001A"/><cp hex="4"/>
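A possible way to produce such `cp` elements is sketched below in Python. The function name `encode_invalid_chars` is hypothetical. The set of allowed characters follows the Char production of XML 1.0, and padding the hexadecimal value to four digits is a choice made here; the example above shows both padded and unpadded values.

```python
import re
from xml.sax.saxutils import escape

# Anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def encode_invalid_chars(text):
    """Escape markup characters, then replace invalid characters by <cp/>."""
    escaped = escape(text)  # handles &, < and > before <cp> tags are added
    return _INVALID_XML_CHAR.sub(
        lambda m: '<cp hex="%04X"/>' % ord(m.group()), escaped
    )


encoded = encode_invalid_chars("a\x1ab\x04")
```

Escaping is done before the substitution so that the inserted `<cp>` tags are not themselves escaped.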

OneContentModel/Draft (last edited 2012-02-27 15:54:55 by ysavourel)