UBIF_TypeLib.xsd schema file overview

(Version: Unified Biosciences Information Framework (UBIF) 1.1)

TDWG working group: Structure of Descriptive Data (SDD)

Introduction

This document gives an overview of the schema components present in a single schema file, similar to the entry view provided by graphical schema editors. It documents only the root-level annotations and components (elements, global attributes, simple and complex types, and groups). The definitions of the components listed here are documented separately (hyperlinking could not yet be implemented).

Because the UBIF schema is designed as a type library, complex types represent class definitions and most schema files contain only a single root-level element.

Please see the schema documentation resource directory for schema overviews of other files and detailed component documentation.


Schema file content

The following content is generated automatically from the documentation inside the schema file:

Unified Biosciences Information Framework (UBIF) XML schema. This part provides a type library of fundamental simple and complex types. See the main UBIF.xsd file for complete information, copyright and licensing.

Copyright © 2006 TDWG (Taxonomic Databases Working Group, www.tdwg.org). See the file UBIF_(c).xsd for authorship and licensing information.

Note: if multiple namespaces are to be used, all of which rely on this library, it would be possible to remove both xmlns="http://rs.tdwg.org/UBIF/2006/" and targetNamespace="http://rs.tdwg.org/UBIF/2006/" from xs:schema. In this 'chameleon pattern' (http://www-106.ibm.com/developerworks/library/x-flexschema/ or http://www.xfront.com/ZeroOneOrManyNamespaces.html), the included type libraries acquire the target namespace of the including schema. However, when this pattern was tested in 2003-2004, several validators had problems handling it; for the time being, UBIF and related schemata such as SDD use only a single namespace.
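The namespace rule behind the chameleon pattern can be sketched in a few lines of Python. This is only an illustration of the rule for xs:include, not a validator: the function name, the example.org namespaces, and the ExampleString type are hypothetical and do not come from the UBIF schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical type library WITHOUT a targetNamespace (a "chameleon" schema).
CHAMELEON_LIB = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="ExampleString">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>
"""

# The same library bound to a fixed namespace, as UBIF currently does.
FIXED_LIB = CHAMELEON_LIB.replace(
    "<xs:schema ",
    '<xs:schema targetNamespace="http://rs.tdwg.org/UBIF/2006/" ',
)

def including_schema(namespace):
    """Build a minimal including schema for the given (hypothetical) namespace."""
    return (
        '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
        f'targetNamespace="{namespace}"/>'
    )

def effective_target_namespace(including_xsd, included_xsd):
    """Return the namespace the included components end up in under xs:include."""
    including_ns = ET.fromstring(including_xsd.strip()).get("targetNamespace")
    included_ns = ET.fromstring(included_xsd.strip()).get("targetNamespace")
    if included_ns is None:
        # Chameleon case: the library takes on the includer's namespace.
        return including_ns
    if included_ns != including_ns:
        # xs:include only allows an identical or an absent target namespace.
        raise ValueError("xs:include requires matching target namespaces")
    return included_ns

ns_a = effective_target_namespace(including_schema("http://example.org/a"), CHAMELEON_LIB)
ns_b = effective_target_namespace(including_schema("http://example.org/b"), CHAMELEON_LIB)
```

With the chameleon library, ns_a and ns_b differ because each including schema imposes its own namespace; the fixed-namespace library can only be included from a schema with the same target namespace.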


Imported or included schemata:

The following import of the XML namespace allows xml:lang to be used directly. That schema defines an attribute lang of type="xs:language". The language values of this type are extensible using "x-" plus an identifier. For language-neutral elements (e.g. scientific taxon names) the value 'x-neutral' is recommended. To express unknown or mixed language, the special values 'mul' (multiple/mixed) and 'und' (undetermined/unknown) already exist (see http://wiki.tdwg.org/twiki/bin/view/UBIF/ExtendLanguageWithNeutralAndUnknown). Note: the import uses a local copy of the schema to ensure that validation works without an internet connection. The original schemaLocation is "http://www.w3.org/2001/xml.xsd".
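As a rough illustration of how a consumer might read xml:lang and distinguish the special values named above, consider the following Python sketch. The sample document, its element names, the category labels, and the 'x-field-notes' value are hypothetical; only 'x-neutral', 'mul', 'und', and the "x-" extension convention are taken from the text.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample; the element names are illustrative, not UBIF types.
SAMPLE = """
<Labels>
  <Label xml:lang="en">Leaf shape</Label>
  <Label xml:lang="x-neutral">Quercus robur</Label>
  <Label xml:lang="mul">folium / leaf</Label>
  <Label xml:lang="und">unreadable note</Label>
  <Label xml:lang="x-field-notes">abbr.</Label>
  <Label>no language given</Label>
</Labels>
"""

def classify_language(value):
    """Map an xml:lang value to a coarse category (labels are illustrative)."""
    if value is None:
        return "unspecified"
    value = value.lower()  # language tags are compared case-insensitively
    if value == "x-neutral":
        return "language-neutral"
    if value == "mul":
        return "multiple/mixed"
    if value == "und":
        return "undetermined"
    if value.startswith("x-"):
        return "private extension"
    return "regular"

def classify_labels(xml_text):
    """Return (xml:lang value, category) for each Label element."""
    root = ET.fromstring(xml_text.strip())
    return [
        (label.get(XML_LANG), classify_language(label.get(XML_LANG)))
        for label in root.iter("Label")
    ]

categories = [category for _, category in classify_labels(SAMPLE)]
```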

Imports: w3c-schema/xml.xsd (http://www.w3.org/XML/1998/namespace)

Includes: UBIF_EnumLib.xsd


Basic type library:

Basic generic types:

Derived string types with restricting patterns:

The resource media type currently carries only semantics, no syntax or regular expression pattern:


The following Range, Date, and Coordinate types describe frequently recurring simple type combinations in an element with attributes

Elements defining value ranges:

Types for composite Gregorian calendar date/time (points in time where parts may be missing), following the seven-property model described, e.g., in XML Schema 1.1 (http://www.w3.org/TR/2004/WD-xmlschema11-2-20040716/#theSevenPropertyModel). Instead of gYear, gMonth, and gDay, integer types with constraining facets are used for two reasons: a) each of the g-types may carry its own timezone, which may lead to inconsistent data with multiple timezones; b) their lexical representation seems to be occasionally poorly implemented (e.g. where '31' or '---5' are accepted, whereas valid examples are '---31', '---05', and '---05+02:00'). In addition to the seven-property model, text attributes are added for either imprecise ('fuzzy') additions or complete verbatim dates. Note that incomplete dates are in most cases calendar-specific, and incomplete non-Gregorian dates cannot be expressed. Furthermore, for complete dates it may be unclear whether a reformed or unreformed date has been used (e.g. in Russia in the 19th century).
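The difference between the gDay lexical form and a plain constrained integer can be illustrated with a short Python sketch. The regular expression follows the xs:gDay lexical form ('---DD' plus an optional timezone); the PartialDate class and its field names are hypothetical stand-ins and do not reproduce the actual UBIF type definitions.

```python
import re
from dataclasses import dataclass
from typing import Optional

# xs:gDay lexical form: three hyphens, a two-digit day, and an optional
# timezone ("Z" or an offset between -14:00 and +14:00).
GDAY_PATTERN = re.compile(
    r"---(0[1-9]|[12][0-9]|3[01])"
    r"(Z|[+-](?:(?:0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"
)

def is_valid_gday(text: str) -> bool:
    """Check a string against the gDay lexical form."""
    return GDAY_PATTERN.fullmatch(text) is not None

@dataclass
class PartialDate:
    """Hypothetical date with optional parts stored as range-checked integers."""
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    verbatim: Optional[str] = None  # free text for the complete original date

    def __post_init__(self):
        # Mimic constraining facets: each part, if present, must be in range.
        if self.month is not None and not 1 <= self.month <= 12:
            raise ValueError("month must be between 1 and 12")
        if self.day is not None and not 1 <= self.day <= 31:
            raise ValueError("day must be between 1 and 31")

# A date where only month and day are known, with no per-part timezone.
may_fifth = PartialDate(month=5, day=5, verbatim="5th of May, year unknown")
```

Because the parts are plain integers, a value such as 5 needs no padding or leading hyphens, and no individual part can carry a timezone of its own.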

Types for geographical coordinates:


Complex types closely related to enumerations (these may alternatively be placed in UBIF_TypeLib)

Complex types referring to UnivarStatMeasureEnum (used, e. g., by SDD):


Other complex types


Base type and derived types for all document-internal cross-references (using id/ref attributes):


Language and audience attributes form the basis of text representations of labels and other types:

Note: the use of attribute groups instead of globally defined and referred attributes is a work-around for namespace problems occurring with attribute definitions in included library schemata.

Audience is also available as an object type to define label and expertise level for audiences. However, audience values may be used even if no Audience object with a corresponding id can be found.

The reason for this is that all object labels and representations may already use audience in addition to language. To avoid circular dependencies or introducing special cases for audience objects, it was considered acceptable not to validate the correspondence using schema identity constraints (= referential integrity) here.

Note: if audience definitions are present, a missing attribute (as well as one explicitly containing the default set in this schema, e.g. "-") in multilingual or AudienceRef should be treated as pointing to the first audience with expertiselevel=0 (undefined).
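The fallback rule for audience values might be implemented along the following lines. This is a sketch under assumptions: the dictionary keys, the function name, and the use of "-" as the default marker are illustrative choices (the text gives "-" only as an example), not part of the schema.

```python
# Assumed default marker; the schema text mentions "-" only as an example.
DEFAULT_MARKER = "-"

def resolve_audience(ref, audiences):
    """Resolve an audience reference against a list of audience definitions.

    `audiences` is a list of dicts with the hypothetical keys "id" and
    "expertise_level". Returns the matching definition, or None if no
    definition applies (the audience value itself may still be used).
    """
    if not audiences:
        # No definitions present: nothing to resolve against.
        return None
    if ref is None or ref == DEFAULT_MARKER:
        # Missing or default value: first audience with expertise level 0.
        for audience in audiences:
            if audience["expertise_level"] == 0:
                return audience
        return None
    for audience in audiences:
        if audience["id"] == ref:
            return audience
    # Unknown id: not an error, since the correspondence is not validated.
    return None

# Hypothetical audience definitions.
AUDIENCES = [
    {"id": "expert", "expertise_level": 5},
    {"id": "general", "expertise_level": 0},
    {"id": "fallback", "expertise_level": 0},
]

default_audience = resolve_audience(None, AUDIENCES)
```

Returning None for an unknown id, rather than raising an error, mirrors the decision not to enforce referential integrity for audience values.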


Complex types that add language/audience or 'preferred' attributes to the simple types LongString, ShortString, anyURI:


(Generated on 23. May 2006 by DiversitySchemaTools Version 0.5. Copyright (c) G. Hagedorn 2006.)