This document is generated by an XSLT script from the enumerations present in the following schema: "Unified Biosciences Information Framework (UBIF) 1.1". Enumerations are converted to SDD descriptive concepts (enumerated values are represented by concept states). The HTML report generated for these values / concept states is intended for documentation and to support discussion and correction of errors (please comment on http://wiki.tdwg.org/twiki/bin/view/UBIF/EnumeratedValues). The XML representation follows the general conventions of UBIF documents and may be easier to import or integrate into user interfaces than the schema itself. This is especially true for SDD documents, where a large part of the terminology is provided in the form of data documents by the experts of a given knowledge domain. As schema-defined terminology, this document can be used side by side with user-defined terminology.
Enumerated value expressing the kind of phrase or "wording fragment" used to create natural language reports (especially object descriptions). These are currently highly constrained, but either additional values or free extensibility (by union of this type with xs:anyURI) are expected for future releases of UBIF.
States
Value | Abbrev. | Label | Description | |
---|---|---|---|---|
Before | before | Phrase before contained elements; or single phrase | Free-form text output in natural language reports before the natural language phrases for the children (if any). Descriptive terms with children are, e. g., modifiers, statistical measures, or characters/concepts; terms without children are, e. g., character states and status values. | |
After | after | Phrase after contained elements | Free-form text output in natural language reports for objects with children (= contained objects) after the wording for the children. In the case of a character in an object description this is the wording after all states, or numerical values (including the measurement unit if present). | |
Delim | delim. | Default delimiter phrase between child obj. | Free-form text output in natural language reports between multiple child objects. Example: ', ' (i. e. comma). Special delimiters may be defined for the position in front of the last element and for the case of exactly two child objects. | |
LastDelim | delim. | Delimiter phrase between two last child obj. | Free-form text output in natural language reports before the last child object (i.e. between the second-but-last and the last). Examples: en-US: ', or ', de: ' oder ' (note comma and leading/trailing blanks!). If missing, the default delimiter is used. If defined, the special 2-obj.-delimiter is preferred over this for the case of only two child objects. | |
Exactly2 | bw. 2 | Delimiter phrase between exactly 2 child obj. | Free-form text output in natural language reports between children when there are exactly two children. If missing, the default 'delim' wording definitions will be used. Example: ' or ' in US-English. |
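The interplay of the five phrase kinds above can be illustrated with a minimal Python sketch. The function name `join_children` and its parameters are hypothetical and not part of UBIF or SDD. One assumption is made where the descriptions leave room for interpretation: when exactly two children are present and no Exactly2 phrase is defined, the sketch falls back to LastDelim and only then to Delim.

```python
def join_children(children, delim=", ", last_delim=None, exactly2=None,
                  before="", after=""):
    """Combine child wordings using the Before/After/Delim/LastDelim/Exactly2 roles.

    All names here are illustrative; the fallback order for two children
    (Exactly2, then LastDelim, then Delim) is an assumption.
    """
    if len(children) == 2 and exactly2 is not None:
        # The special two-object delimiter is preferred over LastDelim.
        body = exactly2.join(children)
    elif len(children) >= 2:
        # LastDelim goes before the last child; if missing, Delim is used.
        last = last_delim if last_delim is not None else delim
        body = delim.join(children[:-1]) + last + children[-1]
    else:
        # Zero or one child: no delimiter is needed.
        body = "".join(children)
    return before + body + after


print(join_children(["red", "green", "blue"], ", ", ", or ", " or "))
print(join_children(["red", "green"], ", ", ", or ", " or "))
```

With the en-US examples from the table, three children yield "red, green, or blue" and two children yield "red or green".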
Defines the intended roles that a designer may assign to a character tree (list of enumerated values to support application interoperability). Note: no values for designing the terminology are given; in the use cases all character trees are available.
States
Value | Label | Description | |
---|---|---|---|
DescriptionEditing | For description editing | Setting this value in a character tree is a recommendation to applications with a user interface to offer this tree for editing the description data set (the application may, however, enable the user to select any character tree). | |
InteractiveIdentification | For interactive identification | Setting this value in a character tree is a recommendation to applications with a user interface to offer this tree for interactive identification. | |
TerminologyReporting | For terminology reporting | Setting this value in a character tree is a recommendation to applications to use this for creating a report of the character terminology. (Note that no TerminologyEditing value is defined; all character trees should be available when designing the terminology. However, the tree marked as TerminologyReporting may be used as the initial editing view.) | |
NaturalLanguageReporting | For natural language reporting | Setting this value in a character tree is a recommendation to applications to offer this tree for natural language reporting. | |
Filtering | For filtering | Setting this value in a character tree is a recommendation to applications to offer this tree for filtering purposes. Some trees are explicitly (separately) typed as being intended exclusively for filtering/subset definition; but many trees are useful for filtering purposes. |
Defines the topic of a concept/character rating.
States
Value | Label | Description | |
---|---|---|---|
ObservationConvenience | Convenience | How conveniently can the character be observed? This may include a measure of the cost of equipment and expendables (such as chemical reagents). Convenience should be rated relative to other methods required for identifications within a taxonomic group, i. e. if microscopic methods are always necessary in a taxon group, microscopic characters may be considered convenient within this group. Also, a character may be convenient in one group, but inconvenient in another. | |
Availability | Availability | How available is the character or concept for identification? For example, ratings would be low if a character is available only during a short time in the life of an object, or only expressed with low frequency in populations. | |
Repeatability | Repeatability | How reliable and consistent are repeated measurements or scorings of the character by different observers and on different objects? This may include both variability of values (frequency of polymorphisms) and variability in how the observations are interpreted. It depends both on precision (quality of being reproducible) and accuracy (nearness to the true value). | |
CostEffectiveness | Cost-Effectiveness | How reliable and consistent are repeated scorings of the character by different observers and on different objects? This may include both variability of values (frequency of polymorphisms) and variability in how the observations are interpreted. It depends both on precision (quality of being reproducible) and accuracy (nearness to the true value). | |
PhylogeneticWeighting | Phylogenetic weighting | A weighting factor rating the relative weight of a character for the purpose of phylogenetic analysis. | |
RequiredExpertise | Required expertise | The user is expected to have this expertise level at least. |
Defines the origin of data that may have been entered, calculated, aggregated or inherited
States
Value | Label | Description | |
---|---|---|---|
OriginalData | Original data, directly entered by a machine or human agent | These are the original data on which all other cached data (Origin other than 'OriginalData') are based. | |
Calculated | Calculated data, based on other data using a calculation rule | Examples: a ratio calculated from other characters, a mean calculated from a sample that is available under SampleData/Sample (if a mean is calculated from data no longer available, it would be recorded as 'OriginalData'). | |
Mapped | Mapped data, based on other data using a mapping definition | Mapping examples are numeric to categorical, or from fine-grained categorical to coarse-grained categorical. | |
Aggregated | Aggregated data, derived from data further down in the hierarchy | This applies both to aggregating data from specimens or other individuals to classes (taxa) and to aggregating from lower classes/taxa to higher classes/taxa (e.g. 'Compile from below' in BioLink). In the case of descriptive summary data the relevant hierarchy is the taxon hierarchy. | |
Inherited | Inherited data, derived from data further up in the hierarchy | In the case of descriptive summary data the relevant hierarchy is the taxon hierarchy. |
Defines a subset of possible modifier classes. Used only on those modifiers that need to be typed to achieve application interoperability (especially when modifier specifications add a value-based interpretation for a modifier, like frequency or certainty values). More values may be added to this enumeration in the future.
States
Value | Label | Description | |
---|---|---|---|
Frequency | Frequency modifier | Proportion values specify a frequency range. | |
Certainty | Certainty modifier | Proportion values specify a certainty range. | |
Seasonal | Seasonal modifier | Proportion values specify a season of the year. The proportion value 0 is interpreted as day 1, the proportion value 1 as day 365 of the year. | |
Diurnal | Diurnal modifier | This refers to parts of the day (24 h clock, i.e. including events that might strictly be called 'nocturnal'). Proportion values specify a time of the day. The proportion values 0 and 1 are both to be interpreted as midnight. Example: A modifier "in the morning" may be specified as '0.25-0.375'. | |
TreatAsMisinterpretation | Treat as misinterpretation | The current modifier becomes one of a special class of misinterpretation modifiers. States to which such modifiers are added are known to be intentionally wrongly scored to accommodate known misunderstandings of the character under study. Example: dogwood bracts looking like petals, and petal scored as 'white (by misinterpretation)'. With regard to (not misinterpreted) data, both frequency and certainty may be interpreted as 0 to 0, i. e. not occurring, certainly false. | |
OtherModifierClass | Other modifier | All other modifiers for which specifications are not yet defined. Examples are developmental, absolute and relative spatial modifiers, or modifiers of degree. |
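The proportion values of the Diurnal and Seasonal modifier classes can be converted to clock times and days of the year as in the following Python sketch. The function names are hypothetical. The linear interpolation between day 1 and day 365 for seasonal values is an assumption; the table only fixes the two end points.

```python
def diurnal_to_clock(p: float) -> str:
    """Convert a diurnal proportion (0..1) to a 24 h clock time 'HH:MM'.

    Both 0 and 1 map to midnight, as stated for the Diurnal class.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    minutes = round(p * 24 * 60) % (24 * 60)  # wrap 1.0 back to midnight
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


def seasonal_to_day(p: float) -> int:
    """Convert a seasonal proportion (0..1) to a day of the year (1..365).

    Assumption: values between the end points are interpolated linearly.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return 1 + round(p * 364)


# The "in the morning" example '0.25-0.375' from the table:
print(diurnal_to_clock(0.25), "to", diurnal_to_clock(0.375))
```

Under these assumptions the "in the morning" range 0.25-0.375 corresponds to 06:00 to 09:00.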
When mapping numerical ranges to categorical states (as in a histogram), several methods are possible, differing in which statistical measures are used for the mapping. Using the central value compares a point with the mapping range, whereas using ranges or extremes compares two kinds of ranges for overlap. Only the central value method can guarantee an unambiguous partitioning into categories. However, the ranges or extremes methods may be desirable because of their improved error tolerance.
States
Value | Label | Description | |
---|---|---|---|
CentralMeasure | Central measure | The first central measure encountered (mean, median, mode) is used as the basis of comparison. If none is found, but ranges or extremes are present, a central value is calculated based on these. | |
Ranges | Ranges | Any range that is not the extremes (quantile, percentile, confidence interval, mean plus/minus s.d., etc.) is used for comparison if available. If none is found, the extreme values are used. | |
Extremes | Extremes | The extreme range values (= minimum and maximum) are used as the basis of comparison. |
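The three mapping methods can be sketched in Python as follows. Everything in this example is illustrative: the dictionary keys (`mean`, `median`, `mode`, `range`, `min`, `max`), the function names, the half-open category intervals, and the use of the midpoint as the calculated central value are assumptions, not definitions from UBIF or SDD.

```python
def comparison_basis(measures, method):
    """Return a point (float) or an interval (tuple) to compare with categories."""
    extremes = None
    if "min" in measures and "max" in measures:
        extremes = (measures["min"], measures["max"])
    rng = measures.get("range")  # any non-extreme range, e.g. a percentile range

    if method == "CentralMeasure":
        for key in ("mean", "median", "mode"):  # assumed order of lookup
            if key in measures:
                return measures[key]
        interval = rng or extremes
        if interval is None:
            raise ValueError("no usable measure")
        return (interval[0] + interval[1]) / 2  # assumption: midpoint
    if method == "Ranges":
        interval = rng or extremes  # fall back to extremes if no range found
    elif method == "Extremes":
        interval = extremes
    else:
        raise ValueError(f"unknown method: {method}")
    if interval is None:
        raise ValueError("no usable measure")
    return interval


def map_to_states(basis, categories):
    """Map a point or interval to category names; categories are [low, high)."""
    if isinstance(basis, tuple):
        lo, hi = basis
        # Interval comparison: every overlapping category matches.
        return [n for n, (c_lo, c_hi) in categories.items() if lo < c_hi and hi >= c_lo]
    # Point comparison: at most one category matches.
    return [n for n, (c_lo, c_hi) in categories.items() if c_lo <= basis < c_hi]


categories = {"short": (0, 10), "medium": (10, 20), "long": (20, 30)}
measures = {"mean": 12.0, "range": (8.0, 14.0), "min": 5.0, "max": 22.0}
for method in ("CentralMeasure", "Ranges", "Extremes"):
    print(method, map_to_states(comparison_basis(measures, method), categories))
```

In this example the central measure selects only "medium", whereas the range also selects "short" and the extremes select all three categories, which illustrates why only the central value method partitions unambiguously.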
Used in descriptive data (not in terminology): Collections of states in instance documents may be ordered (sequence) or unordered (set), and may be connected with 'and', 'or', 'with', or 'between'. Since set/sequence and operators are dependent on each other, the two aspects are combined into a 'model' enumeration.
States
Value | Label | Description | |
---|---|---|---|
OrSet | Unordered set of states, combined with 'or' | Multiple states scored for a character in a description form a set. The order of states has no special meaning and may be changed. In natural language output the states should be combined with 'or' to express that in individual objects (that belong to the class that is being described), the states may occur together or alone. | |
OrSeq | Ordered sequence of states, combined with 'or' | Multiple states scored for a character in a description form a sequence, i. e. the state order carries some semantics and should be preserved in output. The sequence semantics is not explicitly defined, but intelligible to human consumers and presumably relates to some concept of relevance or importance. In natural language output the states should be combined with 'or' to express that in individual objects (that belong to the class that is being described), the states may occur together or alone. | |
AndSet | Unordered set of states, combined with 'and' | Multiple states scored for a character in a description form a set. The order of states has no special meaning and may be changed. In natural language output the states should be combined with 'and' to express that in any individual object (that belongs to the class that is being described), the states will always occur together. Example: two colors that occur together in a pattern. | |
AndSeq | Ordered sequence of states, combined with 'and' | Multiple states scored for a character in a description form a sequence, i. e. the state order carries some semantics and should be preserved in output. The sequence semantics is not explicitly defined, but intelligible to human consumers and presumably relates to some concept of relevance or importance. In natural language output the states should be combined with 'and' to express that in any individual object (that belongs to the class that is being described), the states will always occur together. Example: a black part with small red markings is more appropriately described as 'black and red' than 'red and black'. | |
WithSeq | One state occurring together with others of secondary relevance. | This is a special case of AndSeq, and in many circumstances (except natural language generation) may be treated as AndSeq. Example: "Green with brown" (often this may be two characters, e. g. base color and dot color). | |
Between | True value lying between (usually two) states | Example: "Between oval and elliptic" = "Oval to elliptic". |
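How the model values might drive natural language output is sketched below in Python. The function `render_states` is hypothetical. Several choices are assumptions made for illustration only: sets are sorted alphabetically (any order would be permitted), the connector is simply repeated between all states, secondary WithSeq states are joined with 'and', and Between is rendered in its "to" form.

```python
CONNECTORS = {"OrSet": " or ", "OrSeq": " or ", "AndSet": " and ", "AndSeq": " and "}


def render_states(states, model):
    """Render a list of state labels according to the collection model."""
    if not states:
        return ""
    if model in ("OrSet", "AndSet"):
        # Order carries no meaning for sets; sorting is one permitted choice.
        return CONNECTORS[model].join(sorted(states))
    if model in ("OrSeq", "AndSeq"):
        # Order carries meaning for sequences and is preserved.
        return CONNECTORS[model].join(states)
    if model == "WithSeq":
        head, rest = states[0], states[1:]
        return head + (" with " + " and ".join(rest) if rest else "")
    if model == "Between":
        return " to ".join(states)
    raise ValueError(f"unknown model: {model}")


print(render_states(["black", "red"], "AndSeq"))
print(render_states(["green", "brown"], "WithSeq"))
print(render_states(["oval", "elliptic"], "Between"))
```

The three calls reproduce the wordings used as examples in the table: "black and red", "green with brown", and "oval to elliptic".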
Currently limited to 'Nucleotide' and 'Protein', but future SDD versions may expand this after appropriate discussion. A distinction between nucleotide type-subtypes RNA/DNA is currently not considered necessary; the symbols U (RNA) and T (DNA) should be considered equal for the purpose of analysis.
States
Value | Label | Description | |
---|---|---|---|
Nucleotide | Nucleotide sequence | This includes both DNA and RNA sequences. | |
Protein | Protein sequence |
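The statement that U (RNA) and T (DNA) should be considered equal for analysis can be applied by normalizing sequences before comparison. The following Python sketch uses hypothetical function names; ignoring letter case is an additional assumption not stated above.

```python
def normalize_nucleotides(seq: str) -> str:
    """Return the sequence in upper case with U (RNA) replaced by T (DNA)."""
    return seq.upper().replace("U", "T")


def same_sequence(a: str, b: str) -> bool:
    """Compare two nucleotide sequences, treating U and T as equal."""
    return normalize_nucleotides(a) == normalize_nucleotides(b)


print(same_sequence("AUGGCU", "ATGGCT"))
```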