Chemical Markup Language

Namespace Contents                          Schema Source
http://www.w3.org/2001/XMLSchema            http://www.w3.org/2001/XMLSchema.xsd
http://www.w3.org/XML/1998/namespace        http://www.w3.org/2001/xml.xsd
http://www.xml-cml.org/schema/cml2/core     http://www.xml-cml.org/dtdschema/cmlCore.xsd
http://www.xml-cml.org/schema/stmml         http://www.xml-cml.org/dtdschema/STMML.xsd

Chemical Markup Language Namespaces
http://www.w3.org/2001/XMLSchema
Part 1 version: Id: structures.xsd,v 1.2 2004/01/15 11:34:25 ht Exp
Part 2 version: Id: datatypes.xsd,v 1.3 2004/01/23 18:11:13 ht Exp

See also: http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/structures.html
The schema corresponding to this document is normative,
with respect to the syntactic constraints it expresses in the
XML Schema language.  The documentation (within <documentation> elements)
below is not normative, but rather highlights important aspects of
the W3C Recommendation of which this is a part.

The simpleType element and all of its members are defined
   towards the end of this schema document

simple type for the value of the 'namespace' attr of
'any' and 'anyAttribute'

Value is
   ##any      - any non-conflicting WFXML/attribute at all
   ##other    - any non-conflicting WFXML/attribute from a
                namespace other than targetNS
   ##local    - any unqualified non-conflicting WFXML/attribute
   one or more URI references (space separated)
              - any non-conflicting WFXML/attribute from
                the listed namespaces
##targetNamespace or ##local may appear in the above list, to
refer to the targetNamespace of the enclosing
schema or an absent targetNamespace respectively.
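
The matching rule for the 'namespace' attribute can be sketched in Python as follows. This is an illustration only, not part of the schema: the function name wildcard_allows and its parameters are hypothetical, None stands for an absent namespace, and the handling of ##other for unqualified names follows XML Schema 1.0, where an absent namespace is not admitted.

```python
from typing import Optional


def wildcard_allows(spec: str, namespace: Optional[str],
                    target_ns: Optional[str]) -> bool:
    """Return True if `namespace` is admitted by the 'namespace' value `spec`.

    `namespace` is None for an unqualified element or attribute;
    `target_ns` is None when the enclosing schema has no targetNamespace.
    """
    if spec == "##any":
        return True
    if spec == "##other":
        # Assumption (XML Schema 1.0): unqualified names are excluded too.
        return namespace is not None and namespace != target_ns
    # Otherwise the value is a space-separated list of URI references,
    # in which ##targetNamespace and ##local may also appear.
    for token in spec.split():
        if token == "##targetNamespace":
            if namespace == target_ns:
                return True
        elif token == "##local":
            if namespace is None:
                return True
        elif namespace == token:
            return True
    return False
```

For example, with the CML core namespace as targetNamespace, "##other" admits an STMML element but rejects a CML core element.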

notations for use within XML Schema schemas

First the built-in primitive datatypes.  These definitions are for
information only; the real built-in definitions are magic.

Each built-in datatype in this schema (both primitive and
derived) can be uniquely addressed via a URI constructed
as follows:
  1) the base URI is the URI of the XML Schema namespace
  2) the fragment identifier is the name of the datatype
For example, to address the int datatype, the URI is:
  http://www.w3.org/2001/XMLSchema#int
Additionally, each facet definition element can be uniquely
addressed via a URI constructed as follows:
  1) the base URI is the URI of the XML Schema namespace
  2) the fragment identifier is the name of the facet
For example, to address the maxInclusive facet, the URI is:
  http://www.w3.org/2001/XMLSchema#maxInclusive
Additionally, each facet usage in a built-in datatype definition
can be uniquely addressed via a URI constructed as follows:
  1) the base URI is the URI of the XML Schema namespace
  2) the fragment identifier is the name of the datatype, followed
     by a period (".") followed by the name of the facet
For example, to address the usage of the maxInclusive facet in
the definition of int, the URI is:
  http://www.w3.org/2001/XMLSchema#int.maxInclusive
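
The three addressing rules above can be expressed as small helper functions. This is a minimal sketch; the function names are hypothetical and no check is made that the datatype or facet actually exists.

```python
XSD_NS = "http://www.w3.org/2001/XMLSchema"


def datatype_uri(datatype: str) -> str:
    """URI of a built-in datatype: namespace URI + '#' + datatype name."""
    return f"{XSD_NS}#{datatype}"


def facet_uri(facet: str) -> str:
    """URI of a facet definition: namespace URI + '#' + facet name."""
    return f"{XSD_NS}#{facet}"


def facet_usage_uri(datatype: str, facet: str) -> str:
    """URI of a facet usage: namespace URI + '#' + datatype + '.' + facet."""
    return f"{XSD_NS}#{datatype}.{facet}"
```

Calling facet_usage_uri("int", "maxInclusive") reproduces the last URI shown above.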

Now the derived primitive types
http://www.w3.org/XML/1998/namespace
See http://www.w3.org/XML/1998/namespace.html and
http://www.w3.org/TR/REC-xml for information about this namespace.
 This schema document describes the XML namespace, in a form
 suitable for import by other schema documents.
 Note that local names in this namespace are intended to be defined
 only by the World Wide Web Consortium or its subgroups.  The
 following names are currently defined in this namespace and should
 not be used with conflicting semantics by any Working Group,
 specification, or document instance:
 base (as an attribute name): denotes an attribute whose value
      provides a URI to be used as the base for interpreting any
      relative URIs in the scope of the element on which it
      appears; its value is inherited.  This name is reserved
      by virtue of its definition in the XML Base specification.
 id   (as an attribute name): denotes an attribute whose value
      should be interpreted as if declared to be of type ID.
      The xml:id specification is not yet a W3C Recommendation,
      but this attribute is included here to facilitate experimentation
      with the mechanisms it proposes.  Note that it is _not_ included
      in the specialAttrs attribute group.
 lang (as an attribute name): denotes an attribute whose value
      is a language code for the natural language of the content of
      any element; its value is inherited.  This name is reserved
      by virtue of its definition in the XML specification.
 space (as an attribute name): denotes an attribute whose
      value is a keyword indicating what whitespace processing
      discipline is intended for the content of the element; its
      value is inherited.  This name is reserved by virtue of its
      definition in the XML specification.
 Father (in any context at all): denotes Jon Bosak, the chair of
      the original XML Working Group.  This name is reserved by
      the following decision of the W3C XML Plenary and
      XML Coordination groups:
          In appreciation for his vision, leadership and dedication
          the W3C XML Plenary on this 10th day of February, 2000
          reserves for Jon Bosak in perpetuity the XML name
          xml:Father

This schema defines attributes and an attribute group
        suitable for use by
        schemas wishing to allow xml:base, xml:lang, xml:space or xml:id
        attributes on elements they define.
        To enable this, such a schema must import this schema
        for the XML namespace, e.g. as follows:
        <schema . . .>
         . . .
         <import namespace="http://www.w3.org/XML/1998/namespace"
                    schemaLocation="http://www.w3.org/2001/xml.xsd"/>
        Subsequently, qualified reference to any of the attributes
        or the group defined below will have the desired effect, e.g.
        <type . . .>
         . . .
         <attributeGroup ref="xml:specialAttrs"/>
         will define a type which will schema-validate an instance
         element with any of those attributes
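
As an illustration of how a consumer might read one of these attributes, the sketch below resolves the inherited value of xml:lang with Python's standard xml.etree.ElementTree module. The sample document and the function name effective_lang are hypothetical; only the namespace URI and the inheritance behaviour come from the description above.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
LANG = f"{{{XML_NS}}}lang"  # ElementTree spells xml:lang as {namespace}lang


def effective_lang(root, target):
    """Return the xml:lang value in scope for `target`, or None if unset."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if LANG in node.attrib:
            return node.attrib[LANG]
        node = parents.get(node)  # walk up; the root has no entry
    return None


# Hypothetical instance: <term> inherits "de" from <para>, <title> "en" from <doc>.
DOC = ET.fromstring(
    '<doc xml:lang="en"><title/><para xml:lang="de"><term/></para></doc>'
)
```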

In keeping with the XML Schema WG's standard versioning
   policy, this schema document will persist at
   http://www.w3.org/2005/08/xml.xsd.
   At the date of issue it can also be found at
   http://www.w3.org/2001/xml.xsd.
   The schema document at that URI may however change in the future,
   in order to remain compatible with the latest version of XML Schema
   itself, or with the XML namespace itself.  In other words, if the XML
   Schema or XML namespaces change, the version of this document at
   http://www.w3.org/2001/xml.xsd will change
   accordingly; the version at
   http://www.w3.org/2005/08/xml.xsd will not change.
http://www.xml-cml.org/schema/cml2/core
WARNING.

This document has been automatically generated from the XSD Schema, using XSLT stylesheets. Schemas are complex and it is not easy to produce the "best" view. It is possible that some information is included twice and (possibly) some is omitted. The Schema itself should always be taken as definitive.


Curation.
  • Created by hand, starting from the output of dtd2xsd, then edited to enhance datatypes, content models, etc. 2001-09-21
  • First draft 2001-10-02
  • Next draft 2002-09-20 (sic)
  • Next draft 2002-11-20 (CML 2.0, on website)
  • Submitted to JCICS 2002-12-01
  • changed #REQUIRED to #IMPLIED on various id attributes 2002-12-21
  • added property and propertyList 2002-12-29
  • added identifier Element 2002-12-29
  • added name Element 2002-12-29
  • revision submitted to JCICS 2003-01-31
  • role attribute added to molecule and substanceList
  • number attribute added to symmetry
  • final revision submitted to JCICS 2003-03-31 (CML V2.1)
  • bugfix (fixed '-' in pattern) 2003-07-10 (CML V2.1.1)
  • bugfix (removed appinfo child of element@ref) 2003-07-10 (CML V2.1.1)

This schema represents a fundamental core for future CML. Some of the earlier elements may be obsolete, and some will be moved into new CML schemaspaces. The vocabulary is essentially unaltered but the syntax is simpler and the validation is more powerful.

CML2.1 is the reference release for the JCICS publication and can be used with confidence that it will not be altered (other than essential bugfixes and additional documentation). Further versions will proceed via the CML2.2 branch, and are primarily driven by the need to support the extended CML family of schemas.


XSL validation.

There is a prototypic validation procedure based on XSLT stylesheets with namespace prefix val. The syntax is XSL. The only example occurs in bond at present. Some global val resources will be defined in this section.

  <xsd:appinfo>
    <val:key names="atoms" match="atom" use="@id"/>
    <val:key names="bonds" match="bond" use="@id"/>
    <val:key names="molecules" match="molecule" use="@id"/>
    <val:template name="error">
      <val:param name="error"/>
      <val:message>XSLT validation error: <val:value-of select="$error"/></val:message>
      <val:element name="error">
XSLT validation error: <val:value-of select="$error"/>
      </val:element>
    </val:template>
  </xsd:appinfo>
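
For readers less familiar with XSL keys, the following Python sketch shows a rough analogue of the three keys and the error template above: each key maps an id value to the elements carrying it. The sample instance is simplified and hypothetical (no namespaces, no intermediate container elements), and the duplicate-id check is an assumed example of a rule, not one defined by the schema.

```python
import xml.etree.ElementTree as ET

# Key name -> element name, mirroring the three val:key declarations.
KEY_FOR_TAG = {"atom": "atoms", "bond": "bonds", "molecule": "molecules"}


def build_keys(root):
    """Index atom, bond and molecule elements by their id attribute."""
    keys = {name: {} for name in KEY_FOR_TAG.values()}
    for el in root.iter():
        key_name = KEY_FOR_TAG.get(el.tag)
        el_id = el.get("id")
        if key_name is not None and el_id is not None:
            # Like an XSL key, one value may select several nodes.
            keys[key_name].setdefault(el_id, []).append(el)
    return keys


def error(message):
    """Analogue of the 'error' template: build an <error> element."""
    el = ET.Element("error")
    el.text = "XSLT validation error: " + message
    return el


def duplicate_id_errors(keys):
    """Hypothetical rule: report every id that selects more than one element."""
    return [
        error(f"duplicate id '{el_id}' in key '{key_name}'")
        for key_name, table in keys.items()
        for el_id, elements in table.items()
        if len(elements) > 1
    ]


# Simplified, hypothetical instance with a deliberately repeated atom id.
SAMPLE = '<molecule id="m1"><atom id="a1"/><atom id="a2"/><atom id="a1"/><bond id="b1"/></molecule>'
```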
    
http://www.xml-cml.org/schema/stmml
WARNING

This document has been automatically generated from the XSD Schema, using XSLT stylesheets. Schemas are complex and it is not easy to produce the "best" view. It is possible that some information is included twice and (possibly) some is omitted. The Schema itself should always be taken as definitive.


Curation
  • created by hand 2001-11-20
  • First draft 2001-11-20
  • Revised documentation 2002-11-20 (sic)

STMML supports domain-independent STM information components.


Data Types and Data Structure

Overview

STMML defines a number of data types suited to STM. It also defines a number of complex data structures such as arrays, matrices and tables. The constraints are sometimes created through elements and sometimes through attributes. We classify the general components as follows:

Abstract Data Structures

  • scalar. A scalar quantity, expressible as a string, but with many optional facets such as errors, units, ranges, etc. Most elements may have a countType attribute to indicate more than one instance.
  • array. An array of homogeneous scalars whose size is described by sizeType. Delimiters in string representations can be varied.
  • matrix. A rectangular (often square) matrix of homogeneous scalars. Many matrices have special functions (see matrixType) such as geometric transformations.
  • table. A table where the columns are homogeneous arrays.
  • list. A list of heterogeneous components from any namespace.
  • sizeType. Size of arrays
  • delimiterType. A lexical delimiter
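
To make the array description concrete, here is a minimal Python sketch of reading the string form of an array, with a declared size and an optional delimiter. The function name parse_array and its parameters are hypothetical, whitespace is assumed as the default separator, and the exact lexical rules are those of the schema itself, not of this sketch.

```python
def parse_array(text, size, delimiter=None, convert=float):
    """Split the string form of an array and check it against its size.

    `delimiter=None` splits on whitespace (an assumed default);
    `convert` is applied to every item, which keeps the array homogeneous.
    """
    if delimiter is None:
        tokens = text.split()
    else:
        tokens = [token.strip() for token in text.split(delimiter)]
    values = [convert(token) for token in tokens]
    if len(values) != size:
        raise ValueError(f"expected {size} items, found {len(values)}")
    return values
```

A non-whitespace delimiter is useful when the items themselves contain spaces, for instance parse_array("a|b c|d", 3, delimiter="|", convert=str).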

Links and References

  • link. Support for simple hyperlinks and link structures
  • refType. A reference to an element
  • namespaceRefType. A reference to an element, including namespace-like prefixes

Data-based simpleTypes

Common attribute types

  • idType. Specifies lexical patterns for IDs
  • id. ID attribute (highly encouraged)
  • title. Title attribute (highly encouraged)
  • conv. Convention attribute

General components

STMML provides a very small number of abstract elements to capture frequently encountered concepts in STM documents. There are no predetermined semantics or ontology; it is expected that descriptive metadata will be added through dictionaries.

All elements can contain any element children and can carry the common STM attributes. Currently there are the following:

  • object. Almost anything - concrete, abstract, representable by a noun. Objects can have properties added through scalar, etc.
  • action. Represents an action performed during a scientific narrative. It has attributes describing a time-line and conditions so that a procedure could be replayed. It has a container actionList which shares these attributes and which can describe sets of actions.
  • observation. Contains narrative or other elements describing an observation, planned or unplanned.

Dictionary components

Dictionaries are a major part of STMML and supported as follows:

The dictionary itself:

  • dictionary. This element defines a dictionary and is often the root element (though a data instance might also be combined with a dictionary). The dictionary plays a role similar to that of a simple schema, by defining data types and other constraints (such as enumerations). By transforming a dictionary to schema format, schema-based tools can be used for validation. A dictionary is normally composed of entry elements.
  • entry. An entry contains information which describes or constrains elements in a data instance. The link is made through a dictRef attribute on the data element. Descriptive information can apply to any type of element (not necessarily part of or derived from the STM Schema). Constraints are similar to those in XML Schemas and use the same vocabulary (dataTypes, value ranges, enumerations, patterns, etc.). They normally apply to elements from the STM Schema or derived from it.

    In addition, entry elements can constrain elements to have the same higher-level structures and constraints defined by STM Schema. Thus an entry can require a data element to be a matrix, of a given type, with a fixed number of rows and columns. These constraints are usually attributes on the entry element, which therefore maps directly onto the instance. Every entry has a mandatory term attribute which is the formal text string representing the concept. This string can contain any allowed XML characters (e.g. Greek characters) but not markup (e.g. MathML or CML).
  • definition. An almost mandatory child element of entry, giving a formal definition of the term.
  • description. Additional descriptive information for an entry. This can contain any content, often HTML, but also MathML or CML for the description of equations, chemical formulae, etc.
  • alternative. Alternative strings for describing the concept. These can be any of the standard lexical and terminological data categories such as synonyms, abbreviations, homonyms, etc. (see ISO 12620 for a full range).
  • enumeration. A list of allowed values for the data element (or elements in arrays, matrices).
  • relatedEntry. A related entry. Sometimes this is descriptive (e.g. "seeAlso" provides additional information on related concepts). It can also be used for constraints, and there is a small controlled vocabulary of relationships, but no universal syntax. We support parentage (e.g. through "partitiveParent" = "partOf"). In principle this can be used with appinfo to provide algorithmically constructed relationships.
  • attributes. A wide range of constraints is provided through attributes, several being similar to facets on XML Schema datatypes:
    • rows and columns, the structure of the data element.
    • recommendedUnits, units and unitType, the units of the data element.
    • minExclusive, minInclusive, maxExclusive and maxInclusive, the value of the data element.
    • totalDigits, fractionDigits, length, maxLength, minLength and pattern. The lexical form of the data element.
  • annotation. Similar to XML Schema, this has two kinds of child element: documentation, for information about the entry (normally curatorial), and appinfo, to describe entries and constraints in machine-processable fashion.
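
The link between a data element and its dictionary entry can be sketched in Python as follows. This is an illustration under several assumptions: namespaces are omitted, the entry is located by an id attribute matching the part of dictRef after the prefix, and the prefix "prop" and the sample entry are hypothetical. Only the names dictionary, entry, term, dictRef, scalar, minInclusive and maxInclusive come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary with a single entry carrying range constraints.
DICTIONARY = ET.fromstring("""
<dictionary>
  <entry id="mpt" term="melting point" minInclusive="0" maxInclusive="5000"/>
</dictionary>
""")


def check_scalar(scalar, dictionary):
    """Return a list of problems found when checking `scalar` against its entry."""
    ref = scalar.get("dictRef", "")
    entry_id = ref.split(":")[-1]  # assumption: dictRef looks like prefix:id
    entry = next(
        (e for e in dictionary.iter("entry") if e.get("id") == entry_id), None
    )
    if entry is None:
        return [f"no entry found for dictRef '{ref}'"]
    value = float(scalar.text)
    problems = []
    low = entry.get("minInclusive")
    high = entry.get("maxInclusive")
    if low is not None and value < float(low):
        problems.append(f"{value} is below minInclusive {low}")
    if high is not None and value > float(high):
        problems.append(f"{value} is above maxInclusive {high}")
    return problems
```

The same pattern would extend to the other constraint attributes listed above (pattern, length, enumeration and so on), each adding one more check against the entry.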

