SDD_TypeLib.xsd schema file overview

(Version: Unified Biosciences Information Framework (UBIF) 1.1 and SDD 1.1)

TDWG working group: Structure of Descriptive Data (SDD)

Introduction

This document gives an overview of the schema components present in a single schema file, similar to the entry view provided by graphical schema editors. It documents only the root-level annotations and components (elements, global attributes, simple and complex types, and groups). The definitions of the components listed here are documented separately (hyperlinking could not yet be implemented).
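As a rough illustration of what "root-level components" means, the following sketch lists the direct children of an xs:schema element by kind. The miniature schema and every component name in it are invented for the example and are not taken from SDD_TypeLib.xsd; this is also not how DiversitySchemaTools works.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical miniature schema; all component names are illustrative only.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation><xs:documentation>Example type library</xs:documentation></xs:annotation>
  <xs:complexType name="ExampleCharacter"/>
  <xs:simpleType name="ExampleRating"><xs:restriction base="xs:string"/></xs:simpleType>
  <xs:group name="ExampleGroup"><xs:sequence/></xs:group>
  <xs:element name="ExampleRoot" type="ExampleCharacter"/>
</xs:schema>"""

# The component kinds named in the paragraph above.
COMPONENT_KINDS = ("element", "attribute", "simpleType", "complexType", "group")


def root_level_components(xsd_text):
    """Return {kind: [names]} for the direct children of xs:schema."""
    root = ET.fromstring(xsd_text)
    found = {kind: [] for kind in COMPONENT_KINDS}
    prefix = "{" + XS_NS + "}"
    for child in root:  # direct children only, nested declarations are skipped
        if not child.tag.startswith(prefix):
            continue
        local_name = child.tag[len(prefix):]
        if local_name in found:
            found[local_name].append(child.get("name"))
    return found


components = root_level_components(SAMPLE_XSD)
for kind, names in components.items():
    print(kind, names)
```

Because the UBIF schema is a type library, a real file would be expected to show many complexType entries and few element entries in such a listing.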

Because the UBIF schema is designed as a type library, complex types represent class definitions and most schema files contain only a single root-level element.

Please see the schema documentation resource directory for schema overviews of other files and detailed component documentation.


Schema file content

The following content is generated automatically from the documentation inside the schema file:

This file will be included into the UBIF/SDD integration schema 'SDD.xsd' (SDD uses the same namespace as UBIF).

Copyright © 2006 TDWG (Taxonomic Databases Working Group, www.tdwg.org). See the file SDD_(c).xsd for authorship and licensing information.

Due to problems with key/keyrefs when using two namespaces (see documentation on the SDD WIKI: http://wiki.tdwg.org/twiki/bin/view/SDD/UBIFDesignRequirements), the SDD schema is based on the UBIF namespace, and thus uses include rather than import!

Includes: UBIF_CoreExtensions.xsd

Includes: SDD_EnumLib.xsd
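The difference between include and import can be checked mechanically: xs:include pulls in components that share the including schema's target namespace, whereas xs:import refers to a different namespace. The sketch below reports both kinds of directive for a schema document. The two file names come from the list above; the target namespace URI is a placeholder, not the real UBIF namespace.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Placeholder namespace URI (assumption); the include locations are from the list above.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/shared"
    xmlns="http://example.org/shared">
  <xs:include schemaLocation="UBIF_CoreExtensions.xsd"/>
  <xs:include schemaLocation="SDD_EnumLib.xsd"/>
</xs:schema>"""


def composition(xsd_text):
    """Return (targetNamespace, included locations, imported (namespace, location) pairs)."""
    root = ET.fromstring(xsd_text)
    includes = [e.get("schemaLocation") for e in root.findall(XS + "include")]
    imports = [
        (e.get("namespace"), e.get("schemaLocation"))
        for e in root.findall(XS + "import")
    ]
    return root.get("targetNamespace"), includes, imports


target_ns, includes, imports = composition(SAMPLE)
print("target namespace:", target_ns)
print("includes:", includes)
print("imports:", imports)
```

A schema built the way the text describes would show only includes and an empty import list.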


UBIF insertion groups

The two SDD groups are used inside the UBIF top-level Datasets/Dataset structure to define the object collections used by SDD.


For all first-class objects in SDD, collections of type set are defined. These form root-level collections in the Dataset object.


TERMINOLOGY START

DescriptiveConcepts, Characters and dependent objects (states, modifiers, statistical measures)

1. a) DescriptiveConcept definitions. Note: relations between concepts may be defined in the operational character tree. Independent ontologies of concepts may be created through Link rel=Subclass etc. A further plan is to allow concept relations to be defined inside characters.

Inner classes of DescriptiveConcept. ModifierSeq, ConceptStateSeq, and RecommendedMeasureSeq are second-class objects embedded in first-class objects.

1. b) Character tree definitions, references (plus internal types)

Inner classes of CharacterTree and CharTree_Node:

2. --- Character definitions (characters = data recording and analysis variables, depending on observed part, property, and observation or measurement methodology)

a) Abstract base type and derived types to be used in instance documents.

Note: The ColorRangeCharacter above is only an example of other derivations expected, like algorithmically described shapes, molecular sequences (genome/proteome), or molecular patterns (RFLP, AFLP, etc.).

b) inner classes, one-time use within character definitions above

c) State definitions within CategoricalCharacter. Abstract base type and derived types to be used in instance documents.

d) Character and state references

e) Modifiers cover expressions of certainty, frequency, manner, degree, etc. that can be added to existing character value or state data in descriptions.

Modifier reference (single, and group with multiple) to be used in coded descriptions:

Modifier reference extended with Text element, used in natural language markup:

(Note on ModifierRef/ModifierRefMarkup: Although semantics for the lower/upper attributes are defined only for frequency and certainty modifiers, the schema allows them in all statement modifications. Additional validation by means other than XML Schema may be provided, and applications should use the lower/upper attributes only in modifiers of the types Certainty and Frequency. In other modifier types, the values may be discarded upon import. XML Schema validation was attempted in SDD up to 1.0 beta 2, but this resulted in a complex system of multiple derived base types and was considered too complicated.)
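The additional validation mentioned in the note could take the form of a small application-level check. The following is a minimal sketch, assuming a simplified in-memory model: the ModifierRef class, its ref field, and the modifier identifiers are hypothetical, and only the lower/upper attributes and the Certainty and Frequency type names come from the note above.

```python
from dataclasses import dataclass
from typing import Optional

# Modifier types for which lower/upper have defined semantics (per the note above).
RANGE_TYPES = {"Certainty", "Frequency"}


@dataclass
class ModifierRef:
    """Hypothetical in-memory form of a modifier reference."""
    ref: str                       # id of the referenced modifier (assumed field)
    lower: Optional[float] = None
    upper: Optional[float] = None


def check_modifier_refs(modifier_types, refs):
    """Return one message per reference that uses lower/upper on an unsupported type."""
    problems = []
    for r in refs:
        has_range = r.lower is not None or r.upper is not None
        mtype = modifier_types.get(r.ref)
        if has_range and mtype not in RANGE_TYPES:
            problems.append(f"{r.ref}: lower/upper used on modifier of type {mtype}")
    return problems


# Illustrative data only.
modifier_types = {"m1": "Frequency", "m2": "Degree"}
refs = [
    ModifierRef("m1", lower=0.2, upper=0.5),  # allowed: Frequency
    ModifierRef("m2", lower=0.1),             # flagged: Degree has no range semantics
    ModifierRef("m2"),                        # allowed: no lower/upper present
]
problems = check_modifier_refs(modifier_types, refs)
print(problems)
```

An importing application could use the same test to decide which values to discard rather than to report.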

f) Statistical measures: The base semantics and labels are already available through UBIF. Further elaboration may occur at concept nodes: (a) wording and value formatting, (b) definition of recommended measure sets.


TERMINOLOGY END


TERMINOLOGY-BASED DATA

The following types are used in descriptions or identification keys to code descriptive data by reference to characters, states, and modifiers defined in the Terminology.

3. --- Character references in coded descriptions: SummaryData

a) abstract and non-abstract derived types used in coded descriptions

Note: The non-abstract derived types are to be used in instance documents. The type names have been shortened to simplify instance documents, especially where xsi:type is used (Char xsi:type='CatSummaryData').

b) types used inside the CharSummaryData-derived types

c) A collection of summary character data, containing a choice of derived character data types (polymorphic structure, choice options are equivalent to use of base type plus xsi:type).
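The equivalence between a choice of specific elements and a base-typed element with xsi:type can be sketched from the consumer's side: either form resolves to the same concrete type. In the example below, the Char element and the CatSummaryData type name are taken from the note above, while the Categorical element name, the ref attribute, and the SummaryData wrapper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical mapping from choice element names to their declared types.
CHOICE_TO_TYPE = {"Categorical": "CatSummaryData"}

# Illustrative instance fragment showing both notations side by side.
SAMPLE = """<SummaryData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Categorical ref="c1"/>
  <Char xsi:type="CatSummaryData" ref="c1"/>
</SummaryData>"""


def concrete_type(elem):
    """Resolve the concrete type from xsi:type if present, else from the element name."""
    declared = elem.get(XSI_TYPE)
    if declared is not None:
        return declared.split(":")[-1]  # drop a namespace prefix if one is used
    return CHOICE_TO_TYPE.get(elem.tag)


types = [concrete_type(e) for e in ET.fromstring(SAMPLE)]
print(types)
```

The choice form keeps instance documents free of xsi:type attributes, at the cost of one element declaration per derived type.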

4. --- Character references in coded descriptions: SampleData

a) abstract and non-abstract derived types used in sample data

5. --- Character references in natural language descriptions: Markup

a) abstract and non-abstract derived types used in natural language descriptions. Because XML Schema lacks multiple inheritance, these Markup versions have been derived independently. They are nevertheless designed to correspond closely to the types used in coded descriptions.

("ColorRangeMarkup" (color polygon measurement data) and "SequenceMarkup" (molecular or other sequences) are not supported at the moment, since the authors do not expect to find them in natural language descriptions. These types will be added if necessary.)

b) The following NLD type refers to concept nodes and has no corresponding types in SummaryData/SampleData:

c) types used inside the CharacterMarkup types


TERMINOLOGY-BASED DATA END


DESCRIPTIONS START

Descriptions are either natural language with optional markup or coded descriptions. Both are derived from the same base type:

Original sampling data are a special subtype of CodedDescription and are organized into referable SamplingEvent containers:


DESCRIPTIONS END


IDENTIFICATION KEYS START

Identification keys (especially manually designed ones, as opposed to automatically generated ones) are stored in a separate section:


IDENTIFICATION KEYS END


Other basic types used by SDD (compare also the types used by UBIF)

Character rating (equivalent to DELTA weight, reliability, etc., but characters are scored per taxon in descriptions rather than once for all taxa)

Special types for natural language wordings:


(Generated on 23. May 2006 by DiversitySchemaTools Version 0.5. Copyright (c) G. Hagedorn 2006.)