CODA XML Mapping Description

CODA provides access to XML by creating a view on the XML files using the CODA data types. Below we will describe how CODA maps the XML product structure to one that is based on the CODA data types.

XML products are partially self-describing. When an XML file is opened, the structure of the product can be retrieved from the file itself. However, to properly interpret the ASCII data within an XML element (i.e. is it an integer, real, string, etc.?), an external definition is required. An external definition can also prescribe the occurrence of each XML element.

A common way of providing such external definitions is by means of an XML Schema file. However, CODA does not rely on XML Schema files for the XML definitions but uses its own description mechanism for XML files. The main reason behind this choice is that the regular CODA product format descriptions (for raw ASCII/binary files) can then be reused to describe the ASCII content of XML leaf elements (allowing for non-standard time or boolean formats, multi-dimensional arrays of numbers, etc.).

When no description is provided for an XML file, CODA is still able to open the file. CODA will parse the whole file and build up a dynamic definition of it. In this definition all XML leaf elements are interpreted as plain text (i.e. string types).

XML elements

Each XML element in an XML file is mapped to a record field. The name of the record field is based on the XML element name. The two names can differ, because field names have to be valid identifiers in CODA (i.e. no spaces or special characters may be used). An element name is turned into an identifier by converting every character that is not alphanumeric (0-9, a-z, A-Z) to an underscore.
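The character replacement rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here; the function name to_identifier is hypothetical, and CODA's own implementation may apply additional rules that are not covered by this description.

```python
import re


def to_identifier(element_name: str) -> str:
    """Replace every character outside 0-9, a-z, A-Z with an underscore."""
    return re.sub(r"[^0-9A-Za-z]", "_", element_name)


print(to_identifier("start-time"))      # start_time
print(to_identifier("product.header"))  # product_header
```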

When an XML element can occur more than once within its parent element, the field will be an array. Arrays of XML elements are always one-dimensional and their size is always dynamic.
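As an illustration, the sketch below groups the children of a parent element by name and turns repeated names into a list, using Python's standard xml.etree.ElementTree module rather than CODA itself. The function name children_as_fields is hypothetical. The sketch decides on "array or not" from the number of occurrences it observes in one file; this is an assumption made for brevity and may differ from how a CODA definition determines it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def children_as_fields(parent):
    """Group child elements by tag; repeated tags become a one-dimensional list."""
    grouped = defaultdict(list)
    for child in parent:
        grouped[child.tag].append(child)
    # Assumption: a single occurrence is a plain field, several form an array.
    return {
        name: (elems if len(elems) > 1 else elems[0])
        for name, elems in grouped.items()
    }


parent = ET.fromstring("<list><name>x</name><item>1</item><item>2</item></list>")
fields = children_as_fields(parent)
print(len(fields["item"]))   # 2
print(fields["name"].text)   # x
```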

XML element content

An XML element can contain only other elements (in which case it is mapped to a record), only text data (in which case its content is described using CODA ASCII types), or mixed content (as in XHTML). Mixed content is not supported by CODA, and opening a file with such content will result in an error.
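The three kinds of content can be told apart with a small check, sketched here with Python's xml.etree.ElementTree. The function name classify_content and its return values are hypothetical. The sketch treats whitespace between child elements (such as indentation) as ignorable; that this matches CODA's behaviour is an assumption.

```python
import xml.etree.ElementTree as ET


def classify_content(elem):
    """Return 'record', 'text' or 'mixed' for the content of an element."""
    has_children = len(elem) > 0
    # Text can appear before the first child (elem.text) or after any child (tail).
    text_parts = [elem.text or ""] + [child.tail or "" for child in elem]
    has_text = any(part.strip() for part in text_parts)
    if has_children and has_text:
        return "mixed"
    if has_children:
        return "record"
    return "text"


print(classify_content(ET.fromstring("<a>\n  <b>1</b>\n</a>")))          # record
print(classify_content(ET.fromstring("<a>hello</a>")))                   # text
print(classify_content(ET.fromstring("<p>hello <b>world</b>!</p>")))     # mixed
```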

Root of the product

The root of an XML product is always mapped to a record containing a single field. This single field corresponds to the top-level XML element of the file.
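The following sketch shows what such a view could look like for a file without a description, using Python dictionaries as stand-ins for records and xml.etree.ElementTree for parsing. The function names product_root and element_to_value are hypothetical and this is not the CODA API. Leaf elements are kept as strings, and mixed content is not handled.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Map an element to a string (leaf) or a dict of fields (record)."""
    if len(elem) == 0:
        return elem.text or ""  # leaf element: plain text
    fields = {}
    for child in elem:
        fields.setdefault(child.tag, []).append(element_to_value(child))
    # Assumption: only names that occur more than once are kept as lists.
    return {name: (vals if len(vals) > 1 else vals[0]) for name, vals in fields.items()}


def product_root(xml_text):
    """Return the root record: a single field named after the top-level element."""
    top = ET.fromstring(xml_text)
    return {top.tag: element_to_value(top)}


root = product_root("<product><header><name>x</name></header></product>")
print(list(root.keys()))  # ['product']
print(root)               # {'product': {'header': {'name': 'x'}}}
```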

Attributes

CODA provides an attribute record for each XML element. This record contains all attributes of the element as they are stored in the XML file.
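As a rough picture of an attribute record, the sketch below collects the attributes of one element into a dictionary with Python's xml.etree.ElementTree. The function name attribute_record and the example element are hypothetical. Values are left as the strings found in the file; how a CODA definition would type them is not shown here.

```python
import xml.etree.ElementTree as ET


def attribute_record(elem):
    """Return the attributes of an element as a record (dict) of strings."""
    return dict(elem.attrib)


elem = ET.fromstring('<band unit="nm" index="3">865.0</band>')
print(attribute_record(elem))  # {'unit': 'nm', 'index': '3'}
```

An element without attributes simply yields an empty record in this sketch.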