2012-01-05. Frozen archive - links may not resolve - see directory of files at MoinMoin wiki archive

DC-Text: A Text Syntax for Dublin Core Metadata

This document is part of the DC Architecture Wiki.

IMPORTANT: Do not cite materials in this Wiki other than for the purposes of collaborating on document creation. This Wiki is intended to be used to work on draft copies of documents. Finished documents will be published, in a persistent and citable form, on the dublincore.org Web site (or elsewhere in some cases).

This is a draft document, currently being worked on by the Joint DCMI/IEEE LTSC Taskforce and the DC RDF Taskforce. Comments should be sent to the DC-ARCHITECTURE@jiscmail.ac.uk mailing list or directly to the authors.

Title: DC-Text: A Text Syntax for Dublin Core Metadata
Creator: Pete Johnston, Eduserv Foundation <pete.johnston@eduserv.org.uk>
Date Issued: 2006-05-21
Identifier: http://dublincore.org/architecturewiki/DCText/2006-05-21
Replaces: Not applicable
Is Replaced By: Not applicable
Latest Version: http://dublincore.org/architecturewiki/DCText
Description of Document: This document specifies a simple text format for representing a Dublin Core metadata description set. The format is known as "DC-Text".

Contents

  1. Introduction

  2. The DCMI Abstract Model (Summary)

  3. The Syntax

  4. Examples

  5. Appendix A. Grammar

  6. Notes

  7. References

1. Introduction

The DCMI Abstract Model [DCAM] describes the components which make up DC metadata description sets and the relationships between them. This document specifies a syntax for serialising, or representing, a DC metadata description set in plain text. The format is referred to as "DC-Text". A plain text format for serialising such description sets is useful as a means of presenting examples in a way which highlights the constructs of the DCMI Abstract Model, and also as a means of comparing the information represented in other formats such as DC-XML, RDF/XML and XHTML/HTML.

2. The DCMI Abstract Model (Summary)

According to the DCMI Abstract Model [DCAM]:

3. The Syntax

A formal description of the DC-Text syntax is presented in Appendix A. This section presents an overview of the syntax and a set of examples illustrating how the various constructs of the DCMI Abstract Model are represented.

3.1 The Structure of a DC-Text Document

The general structure of a DC-Text document is as follows:

namespace declaration
label (
  label ( content )
  label (
    label ( [...] )
    [ ... ]
  )
)

Each of the primary components of a DC metadata description set defined by the DCMI Abstract Model is represented in DC-Text by a syntactic structure of the form:

label ( content )

where label is replaced by one of the following strings:

DescriptionSet, Description, DescriptionId, ResourceURI,
Statement, PropertyURI, DescriptionRef, VocabularyEncodingSchemeURI,
ValueURI, ValueString, Language, SyntaxEncodingSchemeURI,
RichRepresentation, Base64, MIME

and content is either:

For each label value in the list above, the permitted form of content is determined by the syntax rules specified in Appendix A. These are explained through examples below.

The DC-Text syntax supports the representation of a single DC description set, so a DC-Text document consists of zero or more namespace declarations followed by a single label ( content ) syntactic structure with a label of DescriptionSet and, as content, one or more nested label ( content ) structures with a label of Description. That is, a DC-Text document has the following outline form:

@prefix prefix: <uri> .

DescriptionSet (
  Description (
    Statement ( ... )
    Statement ( ... )
  )
  Description (
    Statement ( ... )
    Statement ( ... )
  )
)
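
The nesting described above can be generated mechanically. The following Python sketch is illustrative only and is not part of DC-Text: the render function and the (label, content) tuple model are assumptions made for this example. The model covers only structures whose content is either a single string or a list of nested structures, so it does not handle a ValueString that carries a nested Language structure.

```python
def render(label, content, indent=0):
    """Render one label ( content ) structure as indented DC-Text lines.

    `content` is either a string, written inline, or a list of
    (label, content) pairs, written as nested structures.
    """
    pad = "  " * indent
    if isinstance(content, str):
        return [f"{pad}{label} ( {content} )"]
    lines = [f"{pad}{label} ("]
    for child_label, child_content in content:
        lines.extend(render(child_label, child_content, indent + 1))
    lines.append(f"{pad})")
    return lines


# A description set with one description holding one statement.
description_set = ("DescriptionSet", [
    ("Description", [
        ("Statement", [
            ("PropertyURI", "<http://purl.org/dc/elements/1.1/title>"),
            ("ValueString", '"DCMI Home Page"'),
        ]),
    ]),
])
text = "\n".join(render(*description_set))
print(text)
```

Each nesting level is indented by two spaces, following the layout used in the outline above.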

3.2 URIs, Qualified Names, and Namespace Declarations

The DCMI Abstract Model uses URIs to refer to resources and to metadata terms (properties, vocabulary encoding schemes and syntax encoding schemes). In the DC-Text syntax, URIs may be written in full or may be represented as "qualified names". A qualified name is made up of two parts, a prefix and a name, separated by a colon (:). In DC-Text, wherever a qualified name is used, it is used to represent a URI. The URI represented by the qualified name is determined by appending the name part of the qualified name to the URI with which the prefix is associated in a namespace declaration (sometimes called the namespace URI).

Namespace declarations occur at the start of a DC-Text document, and have the following form:

@prefix prefix: <uri> .

For example, the following declarations associate the prefix dc with the URI http://purl.org/dc/elements/1.1/ and the prefix ex with the URI http://example.org/resources/:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ex: <http://example.org/resources/> .

Note that the limitations on the characters which can occur in the name part of a qualified name mean that there are URIs that cannot be expressed as qualified names. For example, the URIs http://example.org/resources/12345 and http://example.org/resources#12345 cannot be represented as qualified names, because the name part cannot include the "/" or "#" characters and cannot begin with a numeric character.
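
The expansion rule for qualified names can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: expand_qname and NAME_RE are names invented here, and the name check is an ASCII-only approximation of the name production in Appendix A, which also admits many non-ASCII ranges.

```python
import re

# ASCII-only approximation of the `name` production (nameStartChar nameChar*).
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_\-]*")


def expand_qname(qname, namespaces):
    """Return the URI for a qualified name such as 'dc:title'.

    `namespaces` maps each declared prefix to its namespace URI.
    """
    prefix, sep, name = qname.partition(":")
    if not sep:
        raise ValueError(f"not a qualified name: {qname!r}")
    if prefix not in namespaces:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    if name and not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid name part: {name!r}")
    # The URI is the namespace URI with the name part appended.
    return namespaces[prefix] + name


namespaces = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/resources/",
}
title_uri = expand_qname("dc:title", namespaces)
```

With these declarations, dc:title expands to http://purl.org/dc/elements/1.1/title, whereas ex:12345 is rejected because the name part begins with a digit.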

3.3 Comments

Comments can be inserted anywhere in a DC-Text document. A comment starts with a # and ends with a newline.

# A comment at the start of the document
@prefix prefix: <uri> .
DescriptionSet (
  Description (
    # A comment at the start of a description
    Statement ( ... )
    # A comment following a statement
    Statement ( ... )
  )
  Description (
    Statement ( ... )
    Statement ( ... )
  )
)
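
Because a # can also occur inside a URI reference or a quoted string, a processor cannot simply discard everything from the first # on a line. The Python sketch below shows one way to skip comments while tokenising. It is a simplified illustration under stated assumptions: TOKEN_RE and tokenize are hypothetical names, long strings (triple-quoted) and the \> escape in URI references are not handled, and the value "Issue #42" is invented for the example.

```python
import re

# Token patterns, tried in order. Quoted strings and <...> URI references are
# matched before comments so that a '#' inside them is not taken as a comment.
TOKEN_RE = re.compile(r"""
    (?P<string>"(?:[^"\\]|\\.)*")     # double-quoted string with backslash escapes
  | (?P<uriref><[^>]*>)               # URI reference in angle brackets
  | (?P<comment>\#[^\n\r]*)           # comment: '#' up to the end of the line
  | (?P<paren>[()])                   # structure delimiters
  | (?P<word>[^\s()"<>#]+)            # labels, qualified names, '@prefix', '.'
  | (?P<ws>\s+)                       # whitespace
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs, dropping whitespace and comments."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        pos = match.end()
        if match.lastgroup not in ("ws", "comment"):
            yield match.lastgroup, match.group()


sample = '''# A comment at the start of the document
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .
DescriptionSet (
  Description (
    # A comment at the start of a description
    Statement (
      PropertyURI ( <http://purl.org/dc/terms/modified> )
      ValueString ( "Issue #42" )
    )
  )
)
'''
tokens = list(tokenize(sample))
```

The two comment lines are dropped, while the # in the namespace URI and the # in the value string survive as part of their tokens.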

3.4 String Escapes

To be provided.

4. Examples

This section provides examples of how the DC-Text syntax represents all the constructs of the DCMI Abstract Model.

4.1 Statements Using Value Strings and Value URIs

The first example is of a description set containing a single description with a single simple statement with a property URI and a value string to represent the value:

DescriptionSet (
  Description (
    Statement (
      PropertyURI ( <http://purl.org/dc/elements/1.1/title> )
      ValueString ( "DCMI Home Page" )
    )
  )
)

Example 1: Value Strings

The second example introduces a resource URI which identifies the subject of the description, using the ResourceURI ( <uri> ) syntactic structure:

DescriptionSet (
  Description (
    ResourceURI ( <http://dublincore.org/pages/home> )
    Statement (
      PropertyURI ( <http://purl.org/dc/elements/1.1/title> )
      ValueString ( "DCMI Home Page" )
    )
  )
)

Example 2: Resource URI

By introducing namespace declarations, the qualified name mechanism can be used to abbreviate both the resource URI and the property URI. The same description set as in the previous example might be encoded as follows.

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
  )
)

Example 3: Qualified Names

The value string may have an associated language tag, represented using the Language( tag ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
      )
    )
  )
)

Example 4: Language Tags

A single statement may include multiple value strings to represent the value. In DC-Text this is represented by repeating the ValueString ( "literal" ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
      )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
      )
    )
  )
)

Example 5: Multiple Value Strings

A statement may include a value URI to identify the value, using the ValueURI ( <uri> ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
      )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
      )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
  )
)

Example 6: Value URIs

4.2 Vocabulary and Syntax Encoding Scheme URIs

A statement may include a vocabulary encoding scheme URI to specify the type of the value, that is, a class of which the value is an instance. In DC-Text this is represented using the VocabularyEncodingSchemeURI ( <uri> ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
      )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
      )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
    Statement (
      PropertyURI ( dc:subject )
      VocabularyEncodingSchemeURI ( dcterms:LCSH )
      ValueString ( "Information technology" )
    )
  )
)

Example 7: Vocabulary Encoding Scheme URIs

A value string may have an associated syntax encoding scheme URI, using the SyntaxEncodingSchemeURI( <uri> ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xs: <http://www.w3.org/2001/XMLSchema#> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page"
        Language ( en-GB )
      )
      ValueString ( "El Home Page de DCMI"
        Language ( es-ES )
      )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
    Statement (
      PropertyURI ( dc:subject )
      VocabularyEncodingSchemeURI ( dcterms:LCSH )
      ValueString ( "Information technology" )
    )
    Statement (
      PropertyURI ( dcterms:modified )
      ValueString ( "2006-02-14"
        SyntaxEncodingSchemeURI ( xs:date )
      )
    )
  )
)

Example 8: Syntax Encoding Scheme URIs

4.3 Multiple Descriptions in Description Set

A description set may contain multiple descriptions, represented by a list of Description( content ) syntactic structures. The order has no significance.

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
  )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
    )
  )
)

Example 9: Multiple Descriptions

A description may be about a resource which is a value in a statement in another description within the description set. If the resource has been assigned a URI, then that URI appears as a value URI in the statement where the resource is the value and as a resource URI in the description of that resource.

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
  )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      ValueURI ( agent:DCMI )
    )
  )
  Description (
    ResourceURI ( agent:DCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
    )
  )
)

Example 10: Multiple Related Descriptions

In some cases it may be that a resource does not have a URI assigned to it. Such a resource may still be a value in a statement, and the subject of another description. In DC-Text, the association between the value of one statement and the description of that resource is made by labelling the description using a DescriptionId ( id ) syntactic structure. The id may then be cited using the DescriptionRef ( id ) syntactic structure in one or more statements elsewhere in the same description set.

Note that this is a syntactic mechanism for linking references to values to their descriptions: the id itself does not appear in the DCMI Abstract Model.

@prefix page: <http://dublincore.org/pages/> .
@prefix agent: <http://example.org/agents/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
    )
  )
  Description (
    ResourceURI ( page:althome )
    Statement (
      PropertyURI ( dc:title )
      ValueString ( "DCMI Alternative Home Page" )
    )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
    )
  )
  Description (
    DescriptionId ( descDCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
    )
  )
)

Example 11: Multiple Related Descriptions
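
Since an id is only meaningful within a single description set, a processor may want to confirm that every DescriptionRef cites an id that some DescriptionId declares. The Python sketch below is one rough way to do that and is not prescribed by this document: unresolved_refs and the two regular expressions are hypothetical, and the scan works on raw text, so it does not skip comments or quoted strings. The sample is an abbreviated form of Example 11.

```python
import re

# Match DescriptionId ( id ) and DescriptionRef ( id ), with optional whitespace.
ID_RE = re.compile(r"DescriptionId\s*\(\s*([^\s()]+)\s*\)")
REF_RE = re.compile(r"DescriptionRef\s*\(\s*([^\s()]+)\s*\)")


def unresolved_refs(dc_text):
    """Return the ids cited by DescriptionRef that no DescriptionId declares."""
    declared = set(ID_RE.findall(dc_text))
    cited = set(REF_RE.findall(dc_text))
    return sorted(cited - declared)


doc = """
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:creator )
      DescriptionRef ( descDCMI )
    )
  )
  Description (
    DescriptionId ( descDCMI )
    Statement (
      PropertyURI ( foaf:name )
      ValueString ( "Dublin Core Metadata Initiative" )
    )
  )
)
"""
missing = unresolved_refs(doc)
```

For this sample the list of unresolved ids is empty, because descDCMI is both cited and declared.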

4.4 Rich Representations

A value may be represented not simply by a value string, but also by a rich representation: an XML fragment or a piece of binary data.

In DC-Text, an XML fragment is provided using the RichRepresentation ( "literal" ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:description )
      RichRepresentation ( "<p xmlns=\"http://www.w3.org/1999/xhtml\">This is the DCMI<br />Home Page</p>" )
    )
  )
)

Example 12: Rich Representations - XML

In DC-Text, a binary data object is encoded as a Base64-encoded literal and represented using the Base64 ( "literal" MIME ( "MIME-type" ) ) syntactic structure:

@prefix page: <http://dublincore.org/pages/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
DescriptionSet (
  Description (
    ResourceURI ( page:home )
    Statement (
      PropertyURI ( dc:description )
      RichRepresentation ( 
        Base64 ( "abcdefghij" MIME ( "image/png" ) )
      )
    )
  )
)

Example 13: Rich Representations - Binary Data
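
As a rough illustration of how such a literal might be produced, the Python sketch below encodes a few bytes with the standard base64 module and wraps the result in the structure shown above. The helper name base64_structure is an assumption of this example, and the eight-byte PNG file signature merely stands in for real image data.

```python
import base64


def base64_structure(data, mime_type):
    """Return a Base64 ( "literal" MIME ( "MIME-type" ) ) structure for `data`."""
    literal = base64.b64encode(data).decode("ascii")
    return f'Base64 ( "{literal}" MIME ( "{mime_type}" ) )'


# The eight-byte PNG file signature, used here as placeholder binary data.
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
structure = base64_structure(png_signature, "image/png")
print(structure)
```

The resulting string can be placed inside a RichRepresentation ( ... ) structure in the same way as in Example 13.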

Appendix A. Grammar

A DC-Text document is a sequence of Unicode characters encoded in UTF-8 defined by the grammar below. It is specified by means of the version of Extended BNF used in XML 1.0 (Third Edition) [XML].

DC-Text - EBNF

[1] dcTextDoc ::= comment* ws* namespaceDeclaration* ws* comment* ws* descriptionSet ws* comment* ws*
[2] namespaceDeclaration ::= '@prefix' ws+ prefixName? ':' ws+ uriref ws+ '.'
[3] descriptionSet ::= 'DescriptionSet(' ws* comment* ws* (description ws* comment* ws*)+ ')'
[4] description ::= 'Description(' ws* comment* ws* (descriptionId ws* comment* ws*)? (resourceURI ws* comment* ws*)? (statement ws* comment* ws*)+ ')'
[5] statement ::= 'Statement(' ws* comment* ws* propertyURI ws* comment* ws* (vocabEncSchemeURI ws* comment* ws*)? (valueURI ws* comment* ws*)? (valueRepresentation ws* comment* ws*)* (descriptionReference ws* comment* ws*)? ')'
[6] valueRepresentation ::= valueString | richRepresentation
[7] valueString ::= 'ValueString(' ws* comment* ws* quotedString ws* comment* ws* (languageTag ws* comment* ws*)? (valueURI ws* comment* ws*)? (syntaxEncSchemeURI ws* comment* ws*)? ')'
[8] languageTag ::= 'Language(' ws* language ws* ')'
[9] richRepresentation ::= 'RichRepresentation(' ws* comment* ws* ( quotedString | base64 ) ws* comment* ws* ')'
[10] base64 ::= 'Base64 (' ws* comment* ws* quotedString ws* mime comment* ws* ')'
[11] mime ::= 'MIME (' ws* quotedString ws* ')'
[12] descriptionReference ::= 'DescriptionRef(' ws* name ws* ')'
[13] descriptionId ::= 'DescriptionId(' ws* name ws* ')'
[14] resourceURI ::= 'ResourceURI(' ws* resourceRef ws* ')'
[15] propertyURI ::= 'PropertyURI(' ws* resourceRef ws* ')'
[16] valueURI ::= 'ValueURI(' ws* resourceRef ws* ')'
[17] vocabEncSchemeURI ::= 'VocabularyEncodingSchemeURI(' ws* resourceRef ws* ')'
[18] syntaxEncSchemeURI ::= 'SyntaxEncodingSchemeURI(' ws* resourceRef ws* ')'
[19] resourceRef ::= uriref | qualifiedName
[20] qualifiedName ::= prefixName? ':' name?

The following productions are as defined by Turtle [TURTLE]:

DC-Text - EBNF (continued)

[21] comment ::= '#' ( [^#xA#xD] )*
[22] ws ::= #x9 | #xA | #xD | #x20
[23] uriref ::= '<' relativeURI '>'
[24] language ::= [a-z]+ ('-' [a-z0-9]+ )*
[25] nameStartChar ::= [A-Z] | "_" | [a-z] | #x00C0-#x00D6 | #x00D8-#x00F6 | #x00F8-#x02FF | #x0370-#x037D | #x037F-#x1FFF | #x200C-#x200D | #x2070-#x218F | #x2C00-#x2FEF | #x3001-#xD7FF | #xF900-#xFDCF | #xFDF0-#xFFFD | #x10000-#xEFFFF
[26] nameChar ::= nameStartChar | '-' | [0-9] | #x00B7 | #x0300-#x036F | #x203F-#x2040
[27] name ::= nameStartChar nameChar*
[28] prefixName ::= ( nameStartChar - '_' ) nameChar*
[29] relativeURI ::= ucharacter*
[30] quotedString ::= string | longString
[31] string ::= #x22 scharacter* #x22
[32] longString ::= #x22 #x22 #x22 lcharacter* #x22 #x22 #x22
[33] character ::= '\u' hex hex hex hex |
'\U' hex hex hex hex hex hex hex hex |
'\\' |
#x20-#x5B | #x5D-#x10FFFF
See String Escapes for full details.
[34] echaracter ::= character | '\t' | '\n' | '\r'
See String Escapes for full details.
[35] hex ::= #x30-#x39 | #x41-#x46
hexadecimal digit (0-9, uppercase A-F)
[36] ucharacter ::= ( character - #x3E ) | '\>'
[37] scharacter ::= ( echaracter - #x22 ) | '\"'
[38] lcharacter ::= echaracter | '\"' | #x9 | #xA | #xD

Notes

References

[DCAM]
DCMI Abstract Model
http://dublincore.org/documents/abstract-model/

[XML]
Extensible Markup Language (XML) 1.0 (Third Edition). W3C Recommendation 04 February 2004.
http://www.w3.org/TR/REC-xml

[XMLS]
XML Schema Part 0: Primer Second Edition. W3C Recommendation 28 October 2004.
http://www.w3.org/TR/xmlschema-0/

[TURTLE]
Turtle - Terse RDF Triple Language
http://www.dajobe.org/2004/01/turtle/