DCMI Abstract Model
Creator: | Andy Powell, Eduserv Foundation, UK |
---|---|
Creator: | Mikael Nilsson, KMR Group, CID, NADA, KTH (Royal Institute of Technology), Sweden |
Creator: | Ambjörn Naeve, KMR Group, CID, NADA, KTH (Royal Institute of Technology), Sweden |
Creator: | Pete Johnston, Eduserv Foundation, UK |
Creator: | Thomas Baker, DCMI |
Date Issued: | 2007-06-04 |
Identifier: | http://dublincore.org/specifications/dublin-core/abstract-model/2007-06-04/ |
Replaces: | http://dublincore.org/specifications/dublin-core/abstract-model/2005-03-07/ |
Replaces: | http://dublincore.org/specifications/dublin-core/abstract-model/2007-04-02/ |
Is Replaced By: | Not applicable |
Latest Version: | http://dublincore.org/specifications/dublin-core/abstract-model/ |
Status of Document: | This is a DCMI Recommendation |
Description of Document: | This document describes an abstract model for Dublin Core™ metadata. |
Table of contents
- Introduction
- DCMI Abstract Model
- Descriptions, description sets and records
- Values
- DCMI Abstract Model semantics
- Encoding guidelines
- Terminology
Appendix A - Relationship to legacy DCMI Grammatical Principles
References
Acknowledgements
1. Introduction
This document specifies an abstract model for Dublin Core™ metadata. The primary purpose of this document is to specify the components and constructs used in Dublin Core™ metadata. It defines the nature of the components used and describes how those components are combined to create information structures. It provides an information model which is independent of any particular encoding syntax. Such an information model allows us to gain a better understanding of the kinds of descriptions that we are encoding and facilitates the development of better mappings and cross-syntax translations.
This document is primarily aimed at the developers of software applications that support Dublin Core™ metadata, people involved in developing new syntax encoding guidelines for Dublin Core™ metadata and people developing metadata application profiles based on DCMI vocabularies or on other compatible vocabularies.
The DCMI Abstract Model builds on work undertaken by the World Wide Web Consortium (W3C) on the Resource Description Framework (RDF) [RDF, RDFS]. The use of concepts from RDF is summarized below in Section 5.
The DCMI Abstract Model is represented here using UML class diagrams [UML]. Readers who are not familiar with UML class diagrams should note that lines ending in a block-arrow should be read as 'is' or 'is a' (for example, "a value is a resource") and that lines starting with a block-diamond should be read as 'contains a' or 'has a' (for example, "a statement contains a property URI"). Other relationships are labeled appropriately. In this document, words and phrases in italics are defined in Section 7, Terminology.
2. DCMI Abstract Model
2.1 The DCMI Resource Model
The abstract model of the resources described by descriptions is as follows:
-
Each described resource is described using one or more property-value pairs.
-
Each property-value pair is made up of one property and one value.
-
Each value is a resource - the physical, digital or conceptual entity or literal that is associated with a property when a property-value pair is used to describe a resource. Therefore, each value is either a literal value or a non-literal value:
-
A literal value is a value which is a literal.
-
A non-literal value is a value which is a physical, digital or conceptual entity.
-
A literal is an entity which uses a Unicode string as a lexical form, together with an optional language tag or datatype, to denote a resource (i.e. "literal" as defined by RDF [RDF]).
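The resource model above can be illustrated with a small, non-normative sketch. The class and field names (`Literal`, `Entity`, `PropertyValuePair`, `label`, `property_name`) are assumptions made for illustration only and are not defined by this document. The sketch reads "optional language tag or datatype" as allowing at most one of the two.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Literal:
    """A Unicode lexical form with an optional language tag or datatype."""
    lexical_form: str
    language: Optional[str] = None   # for example "en-GB"
    datatype: Optional[str] = None   # a datatype URI

    def __post_init__(self):
        # Assumption: a literal carries a language tag or a datatype, not both.
        if self.language is not None and self.datatype is not None:
            raise ValueError("a literal has a language tag or a datatype, not both")


@dataclass(frozen=True)
class Entity:
    """A physical, digital or conceptual entity (a non-literal value)."""
    label: str  # hypothetical handle, used only to tell entities apart here


# Each value is either a literal value or a non-literal value.
Value = Union[Literal, Entity]


@dataclass(frozen=True)
class PropertyValuePair:
    """Exactly one property and exactly one value."""
    property_name: str
    value: Value


def is_literal_value(value: Value) -> bool:
    """True for a literal value, False for a non-literal value."""
    return isinstance(value, Literal)


pair = PropertyValuePair("title", Literal("An Example Title", language="en"))
```

Here `is_literal_value(pair.value)` holds, while a pair whose value is an `Entity` would carry a non-literal value.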
2.2 The DCMI Description Set Model
The abstract model of DC metadata description sets is as follows:
-
A description set is a set of one or more descriptions, each of which describes a single resource.
-
A description is made up of one or more statements (about one, and only one, resource) and zero or one described resource URI (a URI that identifies the described resource).
-
Each statement instantiates a property-value pair, and is made up of a property URI (a URI that identifies a property) and a value surrogate.
-
A value surrogate is either a literal value surrogate or a non-literal value surrogate:
-
A literal value surrogate is a value surrogate for a literal value, and is made up of exactly one value string. The value string is a literal which encodes the literal value.
-
A non-literal value surrogate is a value surrogate for a non-literal value, and is made up of zero or one value URI (a URI that identifies the non-literal value associated with the property), zero or one vocabulary encoding scheme URI (a URI that identifies the vocabulary encoding scheme of which the non-literal value is a member), and zero or more value strings. Each value string is a literal which represents the non-literal value.
-
A value string is either a plain value string or a typed value string:
-
A plain value string may have an associated value string language that is an ISO language tag (for example en-GB). Plain value strings are intended to be human-readable.
-
A typed value string has an associated syntax encoding scheme URI that identifies a syntax encoding scheme.
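As a non-normative illustration, the description set model can be sketched as a handful of Python data classes that enforce the stated cardinalities. All class and attribute names are assumptions chosen to mirror the prose, and the `http://example.org/` URIs are placeholders. The rule that a value string cannot be both plain with a language and typed is this sketch's reading of the plain/typed distinction.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class ValueString:
    """A plain value string (optional language) or a typed value string."""
    text: str
    language: Optional[str] = None
    syntax_encoding_scheme_uri: Optional[str] = None

    def __post_init__(self):
        if self.language is not None and self.syntax_encoding_scheme_uri is not None:
            raise ValueError("a value string is either plain or typed")

    @property
    def is_typed(self) -> bool:
        return self.syntax_encoding_scheme_uri is not None


@dataclass
class LiteralValueSurrogate:
    value_string: ValueString  # exactly one value string


@dataclass
class NonLiteralValueSurrogate:
    value_uri: Optional[str] = None                        # zero or one
    vocabulary_encoding_scheme_uri: Optional[str] = None   # zero or one
    value_strings: List[ValueString] = field(default_factory=list)  # zero or more


@dataclass
class Statement:
    property_uri: str
    value_surrogate: Union[LiteralValueSurrogate, NonLiteralValueSurrogate]


@dataclass
class Description:
    statements: List[Statement]                    # one or more
    described_resource_uri: Optional[str] = None   # zero or one

    def __post_init__(self):
        if not self.statements:
            raise ValueError("a description needs at least one statement")


@dataclass
class DescriptionSet:
    descriptions: List[Description]                # one or more

    def __post_init__(self):
        if not self.descriptions:
            raise ValueError("a description set needs at least one description")


painting = Description(
    described_resource_uri="http://example.org/paintings/1",
    statements=[
        Statement("http://purl.org/dc/terms/title",
                  LiteralValueSurrogate(ValueString("A Painting", language="en"))),
        Statement("http://purl.org/dc/terms/creator",
                  NonLiteralValueSurrogate(
                      value_uri="http://example.org/people/artist",
                      value_strings=[ValueString("An Artist")])),
    ],
)
description_set = DescriptionSet([painting])
```

The constructors reject an empty description or description set, which is how the sketch expresses the "one or more" constraints.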
2.3 The DCMI Vocabulary Model
The abstract model of the vocabularies used in DC metadata descriptions is as follows:
-
A vocabulary is a set of one or more terms. Each term is a member of one or more vocabularies.
-
A term is a property (element), class, vocabulary encoding scheme, or syntax encoding scheme.
-
Each property may be related to one or more classes by a has domain relationship. Where it is stated that a property has such a relationship with a class and the property is part of a property/value pair, it follows that the described resource is an instance of that class.
-
Each property may be related to one or more classes by a has range relationship. Where it is stated that a property has such a relationship with a class and the property is part of a property/value pair, it follows that the value is an instance of that class.
-
Each resource may be an instance of one or more classes.
-
Each resource may be a member of one or more vocabulary encoding schemes.
-
Each class may be related to one or more other classes by a sub-class of relationship (where the two classes are defined such that all resources that are instances of the sub-class are also instances of the related class).
-
Each property may be related to one or more other properties by a sub-property of relationship. Where it is stated that such a relationship exists, the two properties are defined such that whenever the sub-property is part of a property/value pair describing a resource, it follows that the resource is also described using a second property/value pair made up of the property and the value.
-
Each syntax encoding scheme is a class (of literals).
Note that the word "vocabulary" is used here to refer specifically to a set of terms, a set in which the members are properties (elements), classes, vocabulary encoding schemes, and/or syntax encoding schemes.
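The inferences licensed by the has domain, has range, sub-property of and sub-class of relationships can be sketched as follows. This is a minimal, non-normative illustration: the `http://example.org/terms/` vocabulary and the names `closure` and `infer` are hypothetical. The sketch applies each rule repeatedly, so an inferred property/value pair also contributes its own domains and ranges.

```python
from typing import Dict, Set, Tuple

EX = "http://example.org/terms/"  # hypothetical vocabulary namespace

# Hypothetical declarations, keyed by property URI or class URI.
HAS_DOMAIN: Dict[str, Set[str]] = {EX + "illustrator": {EX + "Book"}}
HAS_RANGE: Dict[str, Set[str]] = {EX + "contributor": {EX + "Agent"}}
SUB_PROPERTY_OF: Dict[str, Set[str]] = {EX + "illustrator": {EX + "contributor"}}
SUB_CLASS_OF: Dict[str, Set[str]] = {EX + "Book": {EX + "Document"}}


def closure(start: str, relation: Dict[str, Set[str]]) -> Set[str]:
    """Everything reachable from start by following relation repeatedly."""
    seen: Set[str] = set()
    stack = list(relation.get(start, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(relation.get(current, ()))
    return seen


def infer(prop: str) -> Tuple[Set[str], Set[str], Set[str]]:
    """For a pair using prop, return (implied properties,
    classes of the described resource, classes of the value)."""
    implied_properties = closure(prop, SUB_PROPERTY_OF)
    resource_classes: Set[str] = set()
    value_classes: Set[str] = set()
    for p in {prop} | implied_properties:
        for cls in HAS_DOMAIN.get(p, ()):
            resource_classes |= {cls} | closure(cls, SUB_CLASS_OF)
        for cls in HAS_RANGE.get(p, ()):
            value_classes |= {cls} | closure(cls, SUB_CLASS_OF)
    return implied_properties, resource_classes, value_classes


implied, resource_classes, value_classes = infer(EX + "illustrator")
```

With these declarations, a pair using the hypothetical `illustrator` property implies a second pair using `contributor`, marks the described resource as a `Book` (and therefore a `Document`), and marks the value as an `Agent`.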
**Figure 3 - the DCMI vocabulary model**
2.4 Notes
A number of things about the model are worth noting:
-
Each non-literal value may be the described resource in a separate description within the same description set - for example, a separate description may provide metadata about the person that is the creator of the described resource. A literal value cannot be the described resource in a separate description.
-
The DCMI description set model does not provide an explicit mechanism for indicating the classes of the described resource. Classes of the described resource can either be indicated explicitly using one or more statements in the description or be inferred from the domains of the properties used in the description.
-
The DCMI description set model indicates the distinction between literal values and non-literal values through the presence in a statement of a literal value surrogate or a non-literal value surrogate. For a non-literal value, the DCMI description set model does not provide an explicit mechanism for indicating further the classes of the value. Classes of any given non-literal value can either be indicated explicitly using one or more statements in a separate description about that value or be inferred from the range of the property. For a literal value, the classes of the value can either be indicated explicitly using a syntax encoding scheme of the value string or be inferred from the range of the property.
-
XML content in a value string is indicated using a typed value string with a syntax encoding scheme URI of http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral.
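The first note can be illustrated with a short, non-normative sketch that looks up the separate description of a non-literal value within the same description set. The dictionary layout, the `kind` key and the function name `description_of_value` are assumptions made for this example, as are the `http://example.org/` URIs. Matching is done on the value URI only; how to link a value that has no value URI is left to encoding guidelines.

```python
from typing import Any, Dict, List, Optional

ARTIST = "http://example.org/people/artist"  # placeholder URI

creator_statement = {
    "property_uri": "http://purl.org/dc/terms/creator",
    "value_surrogate": {"kind": "non-literal", "value_uri": ARTIST, "value_strings": []},
}
title_statement = {
    "property_uri": "http://purl.org/dc/terms/title",
    "value_surrogate": {"kind": "literal", "value_string": "A Painting"},
}
name_statement = {
    "property_uri": "http://xmlns.com/foaf/0.1/name",
    "value_surrogate": {"kind": "literal", "value_string": "An Artist"},
}

description_set: List[Dict[str, Any]] = [
    {"described_resource_uri": "http://example.org/paintings/1",
     "statements": [title_statement, creator_statement]},
    {"described_resource_uri": ARTIST,
     "statements": [name_statement]},
]


def description_of_value(descriptions: List[Dict[str, Any]],
                         statement: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the description whose described resource is the statement's value."""
    surrogate = statement["value_surrogate"]
    if surrogate["kind"] != "non-literal":
        return None  # a literal value cannot be a described resource
    value_uri = surrogate.get("value_uri")
    if value_uri is None:
        return None  # no value URI: linking is a matter for the encoding
    for description in descriptions:
        if description.get("described_resource_uri") == value_uri:
            return description
    return None


artist_description = description_of_value(description_set, creator_statement)
```

The lookup for `title_statement` returns `None`, reflecting that a literal value has no description of its own.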
3. Descriptions, description sets and records
The abstract model presented above indicates that each DC metadata description describes one, and only one, resource. This is commonly referred to as the one-to-one principle.
However, real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way), known here as description sets. For example, a description set might comprise descriptions of both a painting and the artist. Furthermore, it is often the case that a description set will also contain a description about the description set itself (sometimes referred to as 'admin metadata' or 'meta-metadata').
Description sets are instantiated, for the purposes of exchange between software applications, in the form of metadata records, according to one of the DCMI encoding guidelines (for example, XHTML meta tags, XML and RDF/XML) [DCMI-ENCODINGS].
4. Values
A DC metadata value is the physical, digital, or conceptual entity or literal that is associated with a property when a property-value pair is used to describe a resource. For example, a value associated with the Dublin Core™ Creator property is a person, organization or service - a physical entity. A value associated with the Dublin Core™ Date property is a point (or range) in time - a conceptual entity. A value associated with the Dublin Core™ Coverage property is a geographic region or country - a physical entity. A value associated with the Dublin Core™ Subject property is a concept (a conceptual entity) or a physical object or person (a physical entity). A value associated with the FOAF name property is a literal. Each of these entities is a resource.
5. DCMI Abstract Model semantics
Note that this Recommendation does not explicitly define a formal semantics for the DCMI Abstract Model. The intention is that the formal semantics can be defined by reference to the RDF and RDF Schema semantics, as defined in [RDFMT]. The equivalence between some of the notions in the DCMI Abstract Model and the corresponding RDF notions is given in the following table:
DCMI Abstract Model | RDF/RDFS |
---|---|
resource | Class: http://www.w3.org/2000/01/rdf-schema#Resource |
property or element | Class: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property |
class | Class: http://www.w3.org/2000/01/rdf-schema#Class |
syntax encoding scheme | Class: http://www.w3.org/2000/01/rdf-schema#Datatype |
has domain relationship | Property: http://www.w3.org/2000/01/rdf-schema#domain |
has range relationship | Property: http://www.w3.org/2000/01/rdf-schema#range |
sub-property of relationship | Property: http://www.w3.org/2000/01/rdf-schema#subPropertyOf |
sub-class of relationship | Property: http://www.w3.org/2000/01/rdf-schema#subClassOf |
plain value string | Plain literal. See: http://www.w3.org/TR/rdf-concepts/#dfn-plain-literal |
typed value string | Typed literal. See: http://www.w3.org/TR/rdf-concepts/#dfn-typed-literal |
Table 1 - DCMI Abstract Model semantics
Together with the DCMI Recommendation "Expressing Dublin Core™ metadata using the Resource Description Framework (RDF)" [DCRDF], these equivalences form the basis of the formal semantics of the DCMI Abstract Model. However, the details of such a semantics are outside the scope of this Recommendation.
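As a non-normative illustration of how the equivalences in Table 1 might be used, the sketch below turns a vocabulary-level relationship into a subject/predicate/object triple. The RDFS property URIs come from Table 1; the function name `to_triple`, the plain-tuple representation and the `http://example.org/terms/` URIs are assumptions made for this example.

```python
from typing import Tuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Relationship names from the DCMI Abstract Model mapped to RDFS properties.
RELATIONSHIP_TO_RDFS = {
    "has domain": RDFS + "domain",
    "has range": RDFS + "range",
    "sub-property of": RDFS + "subPropertyOf",
    "sub-class of": RDFS + "subClassOf",
}


def to_triple(subject_uri: str, relationship: str, object_uri: str) -> Tuple[str, str, str]:
    """Express a DCMI Abstract Model relationship as an RDF-style triple."""
    return (subject_uri, RELATIONSHIP_TO_RDFS[relationship], object_uri)


triple = to_triple(
    "http://example.org/terms/illustrator",   # hypothetical property
    "sub-property of",
    "http://example.org/terms/contributor",   # hypothetical property
)
```

The middle element of `triple` is the `rdfs:subPropertyOf` URI listed in Table 1; an unknown relationship name raises `KeyError`.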
6. Encoding guidelines
Particular encoding guidelines (HTML meta tags, XML, RDF/XML, etc.) [DCMI-ENCODINGS] do not need to encode all aspects of the abstract model described above. However, they should refer to the DCMI Abstract Model and indicate which parts of the model are encoded and which are not.
Encoding guidelines should indicate how a non-literal value can be treated as a described resource in a separate description in those cases where a non-literal value surrogate does not include a value URI.
7. Terminology
This document uses the following terms:
- class (http://www.w3.org/2000/01/rdf-schema#Class)
- A group containing members that have attributes, behaviours, relationships or semantics in common; a kind of category.
- described resource
- A resource that is described by a description.
- described resource URI
- A URI that identifies the described resource.
- description
- One or more statements about one, and only one, resource.
- description set
- A set of one or more descriptions, each of which describes a single resource.
- element (http://www.w3.org/1999/02/22-rdf-syntax-ns#Property)
- A synonym for property. It should be noted that the word element is also commonly used to refer to a structural markup component within an XML document.
- has domain (http://www.w3.org/2000/01/rdf-schema#domain)
- A relationship between a property and a class which indicates that if the property is part of a property/value pair, then it follows that the described resource is an instance of that class.
- has range (http://www.w3.org/2000/01/rdf-schema#range)
- A relationship between a property and a class which indicates that if the property is part of a property/value pair, then it follows that the value is an instance of that class.
- instance of
- A relationship between a resource and a class which indicates a class of which the resource is an instance.
- literal
- An entity which uses a Unicode string as a lexical form, together with an optional language tag or datatype, to denote a resource (i.e. "literal" as defined by RDF [RDF]).
- literal value
- A value which is a literal.
- literal value surrogate
- A value surrogate for a literal value, made up of exactly one value string (a literal that encodes the value).
- member of (http://purl.org/dc/dcam/memberOf)
- A relationship between a resource and a vocabulary encoding scheme which indicates that the resource is a member of a set.
- non-literal value
- A value which is a physical, digital or conceptual entity.
- non-literal value surrogate
- A value surrogate for a non-literal value, made up of zero or one value URI (a URI that identifies the non-literal value associated with the property), zero or one vocabulary encoding scheme URI (a URI that identifies the vocabulary encoding scheme of which the value is a member), and zero or more value strings (literals that represent the value).
- plain value string
- A value string without an associated syntax encoding scheme URI.
- property (http://www.w3.org/1999/02/22-rdf-syntax-ns#Property)
- A specific aspect, characteristic, attribute, or relation used to describe resources.
- property URI
- A URI that identifies a single property.
- property/value pair
- The combination of a property and a value, used to describe a characteristic of a resource.
- record
- An instantiation of a description set, created according to one of the DCMI encoding guidelines (for example, XHTML meta tags, XML and RDF/XML).
- resource (http://www.w3.org/2000/01/rdf-schema#Resource)
- Anything that might be identified. Familiar examples include an electronic document, an image, a service (for example, "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; for example, human beings, corporations, concepts and bound books in a library can also be considered resources.
- statement
- An instantiation of a property-value pair made up of a property URI (a URI that identifies a property) and a value surrogate.
- sub-class of (http://www.w3.org/2000/01/rdf-schema#subClassOf)
- A relationship between two classes which indicates that the two classes are defined such that all resources that are instances of the sub-class are also instances of the related class.
- sub-property of (http://www.w3.org/2000/01/rdf-schema#subPropertyOf)
- A relationship between two properties which indicates that the two properties are defined such that whenever the sub-property is part of a property/value pair describing a resource, it follows that the resource is also described using a second property/value pair made up of the property and the value.
- syntax encoding scheme (http://www.w3.org/2000/01/rdf-schema#Datatype)
- A set of strings and an associated set of rules that describe a mapping between that set of strings and a set of resources. The mapping rules may define how the string is structured (for example DCMI Box) or they may simply enumerate all the strings and the corresponding resources (for example ISO 3166).
- syntax encoding scheme URI
- A URI that identifies a syntax encoding scheme.
- term
- A property (element), class, vocabulary encoding scheme, or syntax encoding scheme.
- typed value string
- A value string with an associated syntax encoding scheme URI.
- URI
- A Uniform Resource Identifier [URI] or Internationalized Resource Identifier [IRI]. From the perspective of the DCMI Abstract Model, equivalence of URIs is defined as in RDF [RDF].
- value
- The physical entity, conceptual entity or literal (a resource) that is associated with a property when a property-value pair is used to describe a resource.
- value URI
- A URI that identifies the value.
- value string
- A literal, optionally associated with either a syntax encoding scheme URI or a value string language. In a literal value surrogate a value string encodes the value; in a non-literal value surrogate a value string represents the value.
- value string language
- An ISO language tag that indicates the language of the value string.
- value surrogate
- A literal value surrogate or a non-literal value surrogate.
- vocabulary
- A set of one or more terms.
- vocabulary encoding scheme (http://purl.org/dc/dcam/VocabularyEncodingScheme)
- An enumerated set of resources.
- vocabulary encoding scheme URI
- A URI that identifies a vocabulary encoding scheme.
Appendix A - Relationship to legacy DCMI Grammatical Principles
The underlying model for Dublin Core™ metadata has evolved since the first formalisms were proposed in the late 1990s. The following table presents rough terminological equivalences between earlier versions of DCMI grammatical principles [DCMI-GRAM-PRIN] and the current DCMI Abstract Model.
DCMI Grammatical Principles | DCMI Abstract Model |
---|---|
vocabulary term | resource |
element | property or element |
element refinement | property with sub-property of relation |
encoding scheme | syntax encoding scheme or vocabulary encoding scheme |
syntax encoding scheme | syntax encoding scheme |
qualifier | property with sub-property of relation, syntax encoding scheme, or vocabulary encoding scheme |
vocabulary encoding scheme | vocabulary encoding scheme |
Table 2 - DCMI Grammatical Principles and DCMI Abstract Model
References
[DCMI]
Dublin Core™ Metadata Initiative
<http://dublincore.org/>
[DCMI-GRAM-PRIN]
DCMI Usage Board. DCMI Grammatical Principles. November 2003.
<http://dublincore.org/specifications/dublin-core/grammatical-principles/>
[DCMI-ENCODINGS]
DCMI Encoding Guidelines
<http://dublincore.org/schemas/>
[DCRDF]
Nilsson, Mikael, Andy Powell, Pete Johnston, and Ambjörn Naeve. Expressing Dublin Core™ metadata using the Resource Description Framework (RDF). DCMI Proposed Recommendation. April 2007.
<http://dublincore.org/specifications/dublin-core/dc-rdf/>
[IRI]
Duerst, M., M. Suignard. RFC 3987: Internationalized Resource Identifiers (IRIs). Internet Engineering Task Force (IETF). January 2005.
<http://www.ietf.org/rfc/rfc3987.txt>
[RDF]
Klyne, Graham and Jeremy Carroll, editors. Resource Description Framework: Concepts and Abstract Syntax. W3C Recommendation. 10 February 2004.
<http://www.w3.org/TR/rdf-concepts/>
[RDFMT]
Hayes, Patrick, editor. RDF Semantics. W3C Recommendation. 10 February 2004.
<http://www.w3.org/TR/rdf-mt/>
[RDFS]
Brickley, Dan and R.V. Guha, editors. RDF Vocabulary Description Language 1.0: RDF Schema. W3C Recommendation. 10 February 2004.
<http://www.w3.org/TR/rdf-schema/>
[UML]
Booch, Grady, James Rumbaugh and Ivar Jacobson. The Unified Modeling Language User Guide. Addison-Wesley, 1998.
[URI]
Berners-Lee, T., R. Fielding, L. Masinter. RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. Internet Engineering Task Force (IETF). January 2005.
<http://www.ietf.org/rfc/rfc3986.txt>
Acknowledgements
Thanks to Dan Brickley, Rachel Heery, Alistair Miles, Sarah Pulis, the members of the DC Usage Board and the members of the DCMI Architecture Community for their comments on previous versions of this document.
Errata 2007-09-24: Fixed typographical error -- deleted extra "is" in two instances of "which is is".
Errata 2013-02-11: Fixed URL for DCMI-GRAM-PRIN.