Defining Document Zones

Document zones are defined in the file style.xml. Within this file, the following elements are available as instructions that define how XML tags (i.e. the markup within the content) are to be handled:

The following instruction ignores all XML tags in the document, indexing only the content of the XML elements:

<ignore xmltag = "*"/>

The following instruction does not index the specified xmltag itself but does index the content between its start and end tags:

<ignore xmltag = "section_1"/>

The following instruction indexes the XML element identified by xmltag as a zone, provided that an <ignore xmltag="*"/> instruction is also present:

<preserve xmltag = "section_1"/>

The following instruction suppresses the entire element identified by xmltag. The tag, attributes, and content are not indexed:

<suppress xmltag = "section_1"/>

The following instruction indexes the content between the start and end tags of the specified xmltag as a field whose name is given by the fieldname attribute. If fieldname is not specified, the tag name is used as the field name. Any existing value of the field is overridden if the optional attribute index="override" is specified.

<field xmltag="column_2" fieldname="vdk_field_2" index="override"/>
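
As a brief illustration of the two variants described above, the following fragment shows the field instruction with and without the optional attributes. The tag name column_3 is a hypothetical example, and the enclosing root element of style.xml is omitted because it is not shown in this section.

```xml
<!-- Content of column_2 is stored in the field vdk_field_2; an existing value is overridden -->
<field xmltag="column_2" fieldname="vdk_field_2" index="override"/>

<!-- No fieldname given: the tag name column_3 (hypothetical) is used as the field name -->
<field xmltag="column_3"/>
```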

The elements to be indexed as zones can be defined inclusively or exclusively. When defined exclusively, all elements are indexed except the ones whose name has been specified using <ignore xmltag="..."/>. To define zones inclusively, <ignore xmltag="*"/> is used to exclude all elements first. Then the elements to be indexed are included explicitly by using <preserve xmltag="..."/> for each of them.
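
The two approaches can be sketched as follows. These are fragments only: the tag names section_2 and section_3 are hypothetical, and the enclosing root element of style.xml is omitted because it is not shown in this section.

Exclusive definition, where every element is indexed as a zone except the ones listed:

```xml
<!-- All elements become zones except section_2 and section_3 (hypothetical names) -->
<ignore xmltag="section_2"/>
<ignore xmltag="section_3"/>
```

Inclusive definition, where only the listed elements are indexed as zones:

```xml
<!-- First exclude all elements -->
<ignore xmltag="*"/>

<!-- Then include the wanted elements explicitly, one instruction per element -->
<preserve xmltag="section_1"/>
<preserve xmltag="section_2"/>
```

The inclusive form is probably the more convenient choice when only a few elements of a large document type are of interest, because new or unknown tags are excluded by default.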

With both methods, inclusive and exclusive, the contents of the elements (i.e. the zones) can be stored in fields. This makes it possible to return these values in the search result for each selected document. It is not possible to ignore an element and to store its content in a document field at the same time.
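
A minimal sketch of combining zones and fields under the exclusive method is shown below. It assumes a hypothetical element section_2 that should not become a zone. The enclosing root element of style.xml is omitted because it is not shown in this section.

```xml
<!-- Exclusive method: every element is indexed as a zone except section_2 (hypothetical name) -->
<ignore xmltag="section_2"/>

<!-- column_2 is not ignored, so its content can also be stored in a field -->
<field xmltag="column_2" fieldname="vdk_field_2"/>

<!-- Per the restriction above, section_2 cannot be ignored and stored in a field at the same time,
     so no field instruction is defined for it -->
```

With this fragment, the value of vdk_field_2 could be returned in the search result for each selected document, while section_2 contributes no zone.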