Enumerated value concepts for application building

This document is generated by an XSLT script from the enumerations present in the following schema: "Unified Biosciences Information Framework (UBIF) 1.1". Enumerations are converted to SDD descriptive concepts (enumerated values are represented by concept states). The HTML report generated for these values / concept states is intended for documentation and to support discussion and correction of errors (please comment on http://wiki.tdwg.org/twiki/bin/view/UBIF/EnumeratedValues). The XML representation follows the general conventions of UBIF documents and may be easier to import or integrate into user interfaces than the schema itself. This is especially true for SDD documents, where a large part of the terminology is provided in the form of data documents by the experts of a given knowledge domain. The schema-defined terminology in this document can be used side by side with user-defined terminology.

NatLangPhraseRole

Enumerated value expressing the kind of phrase or "wording fragment" used to create natural language reports (especially object descriptions). These are currently highly constrained, but either additional values or free extensibility (by union of this type with xs:anyURI) are expected for future releases of UBIF.

States

Value	Abbrev.	Label	Description
Before	before	Phrase before contained elements; or single phrase	Free-form text output in natural language reports before the natural language phrases for the children (if any). Descriptive terms with children are, e. g., modifiers, statistical measures, or characters/concepts; terms without children are, e. g., character states and status values.
After	after	Phrase after contained elements	Free-form text output in natural language reports for objects with children (= contained objects) after the wording for the children. In the case of a character in an object description this is the wording after all states, or numerical values (including the measurement unit if present).
Delim	delim.	Default delimiter phrase between child obj.	Free-form text output in natural language reports between multiple child objects. Example: ', ' (i. e. a comma followed by a blank). Special delimiters may be defined for the delimiter in front of the last element and for the case of exactly two child objects.
LastDelim	delim.	Delimiter phrase between two last child obj.	Free-form text output in natural language reports before the last child object (i.e. between the second-but-last and the last). Examples: en-US: ', or ', de: ' oder ' (note comma and leading/trailing blanks!). If missing, the default delimiter is used. If defined, the Exactly2 delimiter is preferred over this one when there are only two child objects.
Exactly2	bw. 2	Delimiter phrase between exactly 2 child obj.	Free-form text output in natural language reports between children when there are exactly two children. If missing, the default 'delim' wording definitions will be used. Example: ' or ' in US-English.
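
The fallback rules among the five phrase roles can be sketched in a few lines of Python. This is a minimal illustration, not part of UBIF: the function name `join_children` and its parameter names are hypothetical, and it assumes that for exactly two children the order of preference is Exactly2, then LastDelim, then Delim.

```python
from typing import Optional, Sequence


def join_children(
    children: Sequence[str],
    delim: str = ", ",
    last_delim: Optional[str] = None,
    exactly2: Optional[str] = None,
    before: str = "",
    after: str = "",
) -> str:
    """Assemble one phrase from child wordings using the five phrase roles."""
    # LastDelim falls back to the default delimiter when it is missing.
    final = last_delim if last_delim is not None else delim
    if len(children) == 2 and exactly2 is not None:
        # Exactly2, when defined, takes precedence for exactly two children.
        final = exactly2
    if len(children) <= 1:
        body = "".join(children)
    else:
        body = delim.join(children[:-1]) + final + children[-1]
    # Before and After wrap the wording generated for the children.
    return before + body + after
```

With `last_delim=", or "` three children are joined with a comma between the first two and ', or ' before the last; adding `exactly2=" or "` changes only the two-child case.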

CharacterTreeRole

Defines the intended roles that a designer may assign to a character tree (list of enumerated values to support application interoperability). Note: no value is given for designing the terminology; in that use case all character trees are available.

States

Value	Label	Description
DescriptionEditing	For description editing	Setting this value in a character tree is a recommendation to applications with a user interface to offer this tree for editing the description data set (the application may, however, enable the user to select any character tree).
InteractiveIdentification	For interactive identification	Setting this value in a character tree is a recommendation to applications with a user interface to offer this tree for interactive identification.
TerminologyReporting	For terminology reporting	Setting this value in a character tree is a recommendation to applications to use this for creating a report of the character terminology. (Note that no TerminologyEditing value is defined; all character trees should be available when designing the terminology. However, the tree marked as TerminologyReporting may be used as the initial editing view.)
NaturalLanguageReporting	For natural language reporting	Setting this value in a character tree is a recommendation to applications to offer this tree for natural language reporting.
Filtering	For filtering	Setting this value in a character tree is a recommendation to applications to offer this tree for filtering purposes. Some trees are explicitly (separately) typed as being intended exclusively for filtering/subset definition; but many trees are useful for filtering purposes.

RatingContext

Defines the topic of a concept/character rating.

States

Value	Label	Description
ObservationConvenience	Convenience	How conveniently can the character be observed? This may include a measure of the cost of equipment and expendables (such as chemical reagents). Convenience should be rated relative to other methods required for identifications within a taxonomic group, i. e. if microscopic methods are always necessary in a taxon group, microscopic characters may be considered convenient within this group. Also, a character may be convenient in one group, but inconvenient in another.
Availability	Availability	How available is the character or concept for identification? For example, ratings would be low if a character is available only during a short time in the life of an object, or only expressed with low frequency in populations.
Repeatability	Repeatability	How reliable and consistent are repeated measurements or scorings of the character by different observers and on different objects? This may include both variability of values (frequency of polymorphisms) and variability in how the observations are interpreted. It depends both on precision (quality of being reproducible) and accuracy (nearness to the true value).
CostEffectiveness	Cost-Effectiveness	How reliable and consistent are repeated scorings of the character by different observers and on different objects? This may include both variability of values (frequency of polymorphisms) and variability in how the observations are interpreted. It depends both on precision (quality of being reproducible) and accuracy (nearness to the true value).
PhylogeneticWeighting	Phylogenetic weighting	A weighting factor rating the relative weight of a character for the purpose of phylogenetic analysis.
RequiredExpertise	Required expertise	The user is expected to have at least this level of expertise.

DataOrigin

Defines the origin of data that may have been entered, calculated, mapped, aggregated, or inherited.

States

Value	Label	Description
OriginalData	Original data, directly entered by a machine or human agent	These are the original data on which all other cached data (Origin other than 'OriginalData') are based.
Calculated	Calculated data, based on other data using a calculation rule	Examples: a ratio calculated from other characters, a mean calculated from a sample that is available under SampleData/Sample (if a mean is calculated from data no longer available, it would be recorded as 'OriginalData').
Mapped	Mapped data, based on other data using a mapping definition	Mapping examples are numeric to categorical, or from fine-grained categorical to coarse-grained categorical.
Aggregated	Aggregated data, derived from data further down in the hierarchy	This applies both to aggregating data from specimens or other individuals to classes (taxa) and to aggregating from lower classes/taxa to higher classes/taxa (e.g. 'Compile from below' in BioLink). In the case of descriptive summary data the relevant hierarchy is the taxon hierarchy.
Inherited	Inherited data, derived from data further up in the hierarchy	In the case of descriptive summary data the relevant hierarchy is the taxon hierarchy.
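
The difference between 'Aggregated' and 'Inherited' can be illustrated on a toy taxon hierarchy. The following Python sketch is only an illustration: the taxon names, the `PARENT` and `SCORED` tables, and the function names are hypothetical, and the order of preference (original data, then aggregation from below, then inheritance from above) is an assumption rather than something UBIF prescribes.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy hierarchy: taxon -> parent taxon (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "Genus": None,
    "Species A": "Genus",
    "Species B": "Genus",
    "Subspecies B1": "Species B",
}
# Hypothetical directly scored states (these would be 'OriginalData').
SCORED: Dict[str, Set[str]] = {"Species A": {"red"}, "Species B": {"blue"}}


def aggregated(taxon: str) -> Set[str]:
    """Union of all states scored further down in the hierarchy."""
    result: Set[str] = set()
    for child, parent in PARENT.items():
        if parent == taxon:
            result |= SCORED.get(child, set()) | aggregated(child)
    return result


def inherited(taxon: str) -> Set[str]:
    """States of the nearest ancestor that has directly scored data."""
    parent = PARENT.get(taxon)
    while parent is not None:
        if parent in SCORED:
            return set(SCORED[parent])
        parent = PARENT.get(parent)
    return set()


def resolve(taxon: str) -> Tuple[Set[str], Optional[str]]:
    """Return the states for a taxon together with a DataOrigin value."""
    if taxon in SCORED:
        return set(SCORED[taxon]), "OriginalData"
    below = aggregated(taxon)
    if below:
        return below, "Aggregated"
    above = inherited(taxon)
    if above:
        return above, "Inherited"
    return set(), None  # no data available anywhere in the hierarchy
```

Here `resolve("Genus")` compiles the states of both species from below, while `resolve("Subspecies B1")` takes its states from Species B above it.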

ModifierClass

Defines a subset of possible modifier classes. Used only on those modifiers that need to be typed to achieve application interoperability (especially when modifier specifications add a value-based interpretation for a modifier, like frequency or certainty values). More values may be added to this enumeration in the future.

States

Value	Label	Description
Frequency	Frequency modifier	Proportion values specify a frequency range.
Certainty	Certainty modifier	Proportion values specify a certainty range.
Seasonal	Seasonal modifier	Proportion values specify a season of the year. The proportion value 0 is interpreted as day 1, the proportion value 1 as day 365 of the year.
Diurnal	Diurnal modifier	This refers to parts of the day (24 h clock, i.e. including events that might strictly be called 'nocturnal'). Proportion values specify a time of the day. The proportion values 0 and 1 are both to be interpreted as midnight. Example: A modifier "in the morning" may be specified as '0.25-0.375'.
TreatAsMisinterpretation	Treat as misinterpretation	The current modifier becomes one of a special class of misinterpretation modifiers. States to which such modifiers are added are known to be intentionally scored wrongly to accommodate known misunderstandings of the character under study. Example: dogwood bracts look like petals, so petals may be scored as 'white (by misinterpretation)'. With regard to correctly interpreted data, both frequency and certainty may be interpreted as 0 to 0, i. e. not occurring, certainly false.
OtherModifierClass	Other modifier	All other modifiers for which specifications are not yet defined. Examples are developmental, absolute and relative spatial modifiers, or modifiers of degree.
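
The interpretation of Seasonal and Diurnal proportion values can be sketched as follows. This is a minimal Python illustration with hypothetical function names; it assumes linear interpolation between the stated end points (day 1 to day 365, midnight to midnight) and rounding to whole days or minutes, neither of which is spelled out in the descriptions above.

```python
def seasonal_day(proportion: float) -> int:
    """Map a Seasonal proportion (0..1) to a day of the year (1..365)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    # Assumption: linear scale, with 0 -> day 1 and 1 -> day 365.
    return 1 + round(proportion * 364)


def diurnal_time(proportion: float) -> str:
    """Map a Diurnal proportion (0..1) to a 24 h clock time 'HH:MM'."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    # Both 0 and 1 mean midnight, so wrap around after 24 hours.
    minutes = round(proportion * 24 * 60) % (24 * 60)
    return f"{minutes // 60:02d}:{minutes % 60:02d}"
```

Under these assumptions the "in the morning" example '0.25-0.375' corresponds to the period from 06:00 to 09:00.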

UnivarStatMeasureClass

When mapping numerical ranges to categorical states (as in a histogram), several methods are possible, differing in which statistical measures are used for the mapping. Using the central value compares a point with the mapping range, whereas using ranges or extremes compares two kinds of ranges for overlap. Only the central value method can guarantee an unambiguous partitioning into categories. However, the ranges or extremes methods may be desirable because of their improved error tolerance.

States

Value	Label	Description
CentralMeasure	Central measure	The first central measure encountered (mean, median, mode) is used as the basis of comparison. If none is found, but ranges or extremes are present, a central value is calculated based on these.
Ranges	Ranges	Any range that is not an extreme (quantile, percentile, confidence interval, mean plus/minus s.d., etc.) is used for comparison, if present. If none is found, the extreme values are used.
Extremes	Extremes	The extreme range values (= minimum and maximum) are used as the basis of comparison.
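
The three methods and their fallbacks can be sketched in Python as below. All names are hypothetical: the dictionary keys `mean`, `median`, `mode`, `range`, `min`, and `max` stand in for whatever measures a description provides, and the bins map a state name to a half-open interval. The sketch also assumes that the calculated central value is a simple midpoint, which the description above does not specify.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]


def central_value(measures: dict) -> Optional[float]:
    """Return the first central measure found, else a midpoint fallback."""
    for key in ("mean", "median", "mode"):
        if key in measures:
            return measures[key]
    # Assumption: the calculated central value is the midpoint of a range,
    # or of the extremes when no other range is present.
    if "range" in measures:
        low, high = measures["range"]
        return (low + high) / 2
    if "min" in measures and "max" in measures:
        return (measures["min"] + measures["max"]) / 2
    return None


def comparison_range(measures: dict, method: str) -> Optional[Range]:
    """Return the range to compare for the Ranges or Extremes method."""
    if method == "Ranges" and "range" in measures:
        return measures["range"]
    # Extremes method, or Ranges falling back to the extremes.
    if "min" in measures and "max" in measures:
        return (measures["min"], measures["max"])
    return None


def matching_states(measures: dict, method: str, bins: Dict[str, Range]) -> List[str]:
    """Map numeric measures to the categorical states whose bins they hit."""
    if method == "CentralMeasure":
        point = central_value(measures)
        if point is None:
            return []
        # A point hits at most one bin when the half-open bins do not overlap.
        return [s for s, (low, high) in bins.items() if low <= point < high]
    span = comparison_range(measures, method)
    if span is None:
        return []
    # A range may overlap several bins, so more than one state can match.
    return [s for s, (low, high) in bins.items() if span[0] < high and span[1] >= low]
```

With bins for "small" (0 to 10), "medium" (10 to 20), and "large" (20 to 30), a mean of 12 maps to "medium" only, whereas extremes of 4 and 22 overlap all three bins, which illustrates why only the central value method yields an unambiguous partitioning.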

StateCollectionModel

Used in descriptive data (not in terminology): Collections of states in instance documents may be ordered (sequence) or unordered (set), and may be connected with 'and', 'or', 'with', or 'between'. Since set/sequence and operators are dependent on each other, the two aspects are combined into a single 'model' enumeration.

States

Value	Label	Description
OrSet	Unordered set of states, combined with 'or'	Multiple states scored for a character in a description form a set. The order of states has no special meaning and may be changed. In natural language output the states should be combined with 'or' to express that in individual objects (that belong to the class that is being described), the states may occur together or alone.
OrSeq	Ordered sequence of states, combined with 'or'	Multiple states scored for a character in a description form a sequence, i. e. the state order carries some semantics and should be preserved in output. The sequence semantics is not explicitly defined, but intelligible to human consumers and presumably relates to some concept of relevance or importance. In natural language output the states should be combined with 'or' to express that in individual objects (that belong to the class that is being described), the states may occur together or alone.
AndSet	Unordered set of states, combined with 'and'	Multiple states scored for a character in a description form a set. The order of states has no special meaning and may be changed. In natural language output the states should be combined with 'and' to express that in any individual object (that belongs to the class that is being described), the states will always occur together. Example: two colors that occur together in a pattern.
AndSeq	Ordered sequence of states, combined with 'and'	Multiple states scored for a character in a description form a sequence, i. e. the state order carries some semantics and should be preserved in output. The sequence semantics is not explicitly defined, but intelligible to human consumers and presumably relates to some concept of relevance or importance. In natural language output the states should be combined with 'and' to express that in any individual object (that belongs to the class that is being described), the states will always occur together. Example: a black part with small red markings is more appropriately described as 'black and red' than 'red and black'.
WithSeq	One state occurring together with others of secondary relevance	This is a special case of AndSeq, and in many circumstances (except natural language generation) may be treated as AndSeq. Example: "Green with brown" (often this may be two characters, e. g. base color and dot color).
Between	True value lying between (usually two) states	Example: "Between oval and elliptic" = "Oval to elliptic".
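
How an application might turn the six models into wording can be sketched as follows. This Python example is only an illustration with hypothetical names; the connecting words are hard-coded in English, sets are sorted merely to show that their order may be changed, and the handling of more than two states under WithSeq ('with' after the first state, 'and' between the rest) is an assumption.

```python
from typing import Sequence

# Connecting word for each set/sequence model.
CONNECTOR = {"OrSet": "or", "OrSeq": "or", "AndSet": "and", "AndSeq": "and"}


def render_states(states: Sequence[str], model: str) -> str:
    """Combine state wordings according to a StateCollectionModel value."""
    items = list(states)
    if len(items) < 2:
        return "".join(items)
    if model == "Between":
        return "between " + " and ".join(items)
    if model == "WithSeq":
        # The first state is primary; the others are of secondary relevance.
        return items[0] + " with " + " and ".join(items[1:])
    if model.endswith("Set"):
        # Order has no meaning in a set, so it may be changed (here: sorted).
        items.sort()
    return f" {CONNECTOR[model]} ".join(items)
```

For example, `render_states(["black", "red"], "AndSeq")` keeps the given order, while the same call with `"AndSet"` is free to reorder the states.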

MolecularSequenceType

Currently limited to 'Nucleotide' and 'Protein', but future SDD versions may expand this after appropriate discussion. A distinction between nucleotide type-subtypes RNA/DNA is currently not considered necessary; the symbols U (RNA) and T (DNA) should be considered equal for the purpose of analysis.

States

Value	Label	Description
Nucleotide	Nucleotide sequence	This includes both DNA and RNA sequences.
Protein	Protein sequence
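
Treating U and T as equal for analysis can be sketched in Python as below. The function names are hypothetical, and ignoring letter case is an added assumption that the text above does not state.

```python
def normalize_nucleotides(sequence: str) -> str:
    """Return the sequence with U (RNA) replaced by T (DNA)."""
    # Assumption: comparison ignores letter case.
    return sequence.upper().replace("U", "T")


def same_sequence(a: str, b: str, sequence_type: str = "Nucleotide") -> bool:
    """Compare two sequences of the given MolecularSequenceType."""
    if sequence_type == "Nucleotide":
        return normalize_nucleotides(a) == normalize_nucleotides(b)
    # Protein sequences are compared as they are (apart from letter case).
    return a.upper() == b.upper()
```

With this, the RNA sequence 'AUGC' and the DNA sequence 'ATGC' compare as equal when typed as Nucleotide.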
