Dave Beckett
D.J.Beckett@ukc.ac.uk
Computing Laboratory
University of Kent at Canterbury
Dublin Core Element Substructure
(Why The Dublin Core needs Qualifiers)
The DC has 15 elements that can be used for simple
resource descriptions by unskilled cataloguers. The meanings
of these elements and the values are, however, only broadly
defined. We propose that a DC with Qualifiers is not
just useful but is required for DC to be
taken up in the large.
Qualifiers are:
An unordered set of unique Attribute: Value pairs that are
attached to each element. There are core qualifiers for all
elements and per-element ones.[1]
Early experiences in embedding DC in HTML (hand crafted DC,
Nordic Metadata Project[2,3], ADS[4], ...) show that:
- Qualifiers are already heavily used in most embedded DC in
WWW/HTML pages -- qualifiers for the Creator
/ Author elements for ``white-pages''-like
information; format schemes; language schemes etc.
- There is a need for precision -- the best example is in the
Date element where there must be precise
knowledge of the format so that entries like
01/02/03 have a meaning. This implies a compulsory
standard for the default free text (unlikely to be
enforceable) or a qualifier to specify the date
format being used (which should support four digit years and be
unambiguous).
As an example of the above two points, the DC reference page[5] uses:
<META name = "DC.date" type = "creation" scheme = "ISO" content = "1996-06-02">
This is not valid HTML 2, since the attributes type and scheme do
not exist there, but it shows the need for qualification and
standardisation -- i.e. which ISO standard is implied here?
- In the Subject and other elements, there
is a strong desire to be able to declare the particular
scheme or subject thesaurus so that
the field can have a precise meaning for subject specialists
(MeSH, LCSH, ...).
- Qualifiers are necessary to define the Language, Encoding,
and Character Repertoire/Set for Internationalization (I18N)
when not using defaults. Others will speak to I18N issues so
I won't expand them here.
- At DC III[6], DLOs were found to
include (static) images / visual resources and for these more
complex qualification of the Format element
will be needed, for example, to describe multiple image
formats, sizes, ... etc.
- Even if qualifiers are not added now, an
escape mechanism must be added to future-proof the
current DC so that when qualifiers are added, existing
non-qualified DC can still work. We do not
recommend deferring qualifiers in this way: if one RFC appears,
we expect people will use only that as a basis and ignore later
incompatible ones.
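The date ambiguity described above can be illustrated with a minimal Python sketch. The three candidate formats and their labels are assumptions chosen for illustration; they are not taken from any DC document. The point is only that the same free text yields different dates unless a qualifier states the format.

```python
from datetime import datetime

# Three plausible readings of the same free-text date.
# The labels and the choice of formats are illustrative assumptions.
CANDIDATE_FORMATS = {
    "month/day/year": "%m/%d/%y",
    "day/month/year": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

def interpretations(text):
    """Return every date the text could mean under the candidate formats."""
    results = {}
    for label, fmt in CANDIDATE_FORMATS.items():
        try:
            results[label] = datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            pass  # the text is not a valid date under this format
    return results

ambiguous = interpretations("01/02/03")
for label, iso_date in ambiguous.items():
    print(f"{label}: {iso_date}")
```

Without a format qualifier, a consumer of the metadata cannot choose between these readings. The two-digit year adds a further guess about the century, which is why the qualifier should name a format with four digit years.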
Messy Syntax Issues on Embedding DC in HTML <META> tags
Issue: Where do the qualifiers go in the <META> tag?
The current practice is a mixture of three places, which
is unsatisfactory:
- The NAME attribute
- This needs some syntax defined, has a limited character set
(and size in HTML 2) and is thus restrictive for qualifier
names and values (and unsafe for HTML 2, but do we care if
HTML 3.2 is the standard?)
- The CONTENT attribute
- No problems with validation, but it needs an escaping mechanism
and may look messy or be difficult to explain to unskilled users.
However, do we expect most people to be using programs to
put DC in HTML?
- New attributes
- Easy to explain but breaks HTML validation and it is
unrealistic to expect we can change HTML to fix this.
Using the CONTENT attribute is currently the most popular
experimental method.
Issue: Syntax for qualifiers
For the CONTENT attribute, current suggestions for syntax are
a list of prefixes of the form:
(Qualifier Name=Qualifier Value)
This is easy to write and flexible but has some problems with
encoding when an element value begins with a (. There
are four suggested ways to fix this:
- Use white space to separate qualifiers and element value
This is simple but has the problem that white space tends
to be added/removed from data by programs silently.
- Encode the ( by using the HTML character
entity for ( - &#40;
This may not work since an SGML parser could legitimately
replace it with ( silently.
- Encode the ( using another encoding such as the
%hex encoding for URLs
This needs more thought - do we just encode the 1st character
of the value in % format if it is '(' or '%' or all
of the value?
- Encode a leading ( by duplicating it ((
This has the advantage that it is unambiguous -- no qualifier
can be in this format -- but it may be difficult to explain.
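As a minimal sketch of the last option, the following Python shows one way the (Qualifier Name=Qualifier Value) prefixes and the (( escape could be written and read back. The function names encode_content and decode_content are hypothetical, and the sketch assumes that qualifier names and values contain neither ( nor ) and that leading white space in the element value is not significant.

```python
def encode_content(qualifiers, value):
    """Build a CONTENT string: qualifier prefixes followed by the element value."""
    prefix = "".join(f"({name}={qval})" for name, qval in qualifiers.items())
    value = value.strip()  # assumption: surrounding white space is not significant
    if value.startswith("("):
        value = "(" + value  # escape a leading ( by duplicating it
    return prefix + value

def decode_content(content):
    """Split a CONTENT string into (qualifiers, element value)."""
    qualifiers = {}
    rest = content
    while True:
        rest = rest.lstrip()
        if rest.startswith("(("):
            # A doubled ( marks the start of a value that begins with (.
            return qualifiers, rest[1:]
        if not rest.startswith("("):
            return qualifiers, rest
        end = rest.find(")")
        if end == -1:
            raise ValueError("unterminated qualifier")
        name, sep, qval = rest[1:end].partition("=")
        if not sep:
            raise ValueError("qualifier without '='")
        qualifiers[name.strip()] = qval.strip()
        rest = rest[end + 1:]

encoded = encode_content({"type": "creation", "scheme": "ISO"}, "1996-06-02")
escaped = encode_content({}, "(untitled) draft")
```

Here encoded is the string "(type=creation)(scheme=ISO)1996-06-02", and escaped begins with (( so that the value is not mistaken for a qualifier. The decoder tolerates white space between the prefixes, since programs may add or remove it silently.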
For the NAME attribute, current suggestions for syntax are:
- Element.Special Qualifier Value.Qualifier Name.Qualifier Value....
Where the special qualifiers such as Role for
Contributor have reserved positions (can be
omitted with ..). Disadvantages: each element has
different special qualifiers so the result will be a
confusing mix of qualifiers with some having the qualifier
names and others just the value. Advantages: Simple.
- Element.Qualifier Name.Qualifier Value....
Disadvantages: Rather long. Advantages: Orthogonal.
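The second, orthogonal form can be read mechanically, as the following Python sketch suggests. The function name parse_name and the optional DC prefix handling are assumptions for illustration. The sketch also shows the restriction noted earlier: because . is the separator, qualifier names and values cannot themselves contain a dot.

```python
def parse_name(name, prefix="DC"):
    """Split 'Element.QualName.QualValue...' into (element, qualifiers)."""
    parts = name.split(".")
    if parts and parts[0] == prefix:
        parts = parts[1:]  # drop the optional leading 'DC' (an assumption)
    if not parts or not parts[0]:
        raise ValueError("missing element name")
    element, rest = parts[0], parts[1:]
    if len(rest) % 2:
        raise ValueError("qualifier name without a value")
    # Pair up alternating qualifier names and values.
    return element, dict(zip(rest[0::2], rest[1::2]))

element, qualifiers = parse_name("DC.date.type.creation.scheme.ISO")
```

For this input, element is "date" and qualifiers maps type to creation and scheme to ISO. A name with an odd number of trailing parts is rejected, since every qualifier name must be followed by its value in this form.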
Acknowledgements
This work draws heavily on [7]
and discussions with Jon Knight of ROADS, Paul Miller of ADS
and Misha Wolf of Reuters.
References
- [1] Dublin Core Qualifiers
- by Jon Knight and Martin Hamilton, ROADS Project, Department of Computer Studies, Loughborough University at <URL:http://www.roads.lut.ac.uk/Metadata/DC-Qualifiers.html>
- [2] The Nordic Metadata Project
- at <URL:http://linnea.helsinki.fi/meta/index.html>
- [3] Nordic Metadata Project - Dublin Core Metadata Template
- at <URL:http://www.ub2.lu.se/~traugott/DC_creator.html>
- [4] An application of Dublin Core from the Archaeology Data Service
- by Paul Miller, University Computing Service, University of Newcastle, UK at <URL:http://intarch.ac.uk/ahds/project/metadata/dublin.html>
- [5] Dublin Core Metadata Element Set: Reference Description
- at <URL:http://purl.org/metadata/dublin_core_elements>
- [6] CNI/OCLC Metadata Workshop Workshop on Metadata for Networked Images
- <URL:http://purl.oclc.org/metadata/image>
- [7] Strawman Proposal for Defining Qualifiers for Dublin Core Elements
- by Jon Knight, ROADS Project, Department of Computer Studies, Loughborough University at <URL:http://www.roads.lut.ac.uk/Metadata/DC-QualProposal.html>