Title: DCMI property domains and ranges
Identifier: /admin/www/usage/meetings/2006/04/seattle/domains-ranges/
Created: 2006-03-30

Shepherd: Andy

[1] /architecturewiki/DCPropertyDomainsRanges
[2] /architecturewiki/DCRDFTaskforce/DCRDFExecutiveSummary3
[3] /usage/meetings/2006/04/seattle/domains-ranges/2006-03-24.domain-range-rationale.html
[4] /usage/meetings/2006/04/seattle/domains-ranges/2006-03-28.domain-range-comments.html
[5] /usage/meetings/2006/04/seattle/domains-ranges/2006-03-24.dcPropertiesRanges.pdf
[6] /usage/meetings/2006/04/seattle/domains-ranges/2005-02-26.Encoding-scheme-types.html

See also:
[7] /usage/meetings/2006/04/seattle/domains-ranges/2006-03-11.educationLevel.txt
[8] /usage/meetings/2006/04/seattle/dcmitype/
[9] /usage/public-comment/2006/03/type-vocabulary-comments/
[10] /usage/public-comment/2005/12/type-vocabulary-changes/

In his draft DCPropertyDomainsRanges [1], Andy has proposed
domains and ranges for all DCMI terms. In Seattle, our goal
should be to decide whether this is a reasonable thing to do.
We should weigh whether we want to undertake this exercise
at all and assess its implications.

Many of the domain and range proposals are probably
uncontroversial. Some notable exceptions:

-- dcterms:educationLevel
-- dc:creator and dc:creator (see discussion DCRDFExecutiveSummary3 [2])

One important implication is that many of the 30-40
"possible classes" proposed as domains and ranges would
need to be defined, approved, given URIs, formally declared,
and maintained. Should DCMI do this? Giving them URIs would
involve either expanding the DCMI Type Vocabulary or creating
a new Vocabulary. Creating a new Vocabulary would involve
revising the DCMI Namespace Policy.

In a conference call, Diane voiced concern about legacy
implementations. Adding domains and ranges would seem to
introduce more complexity in terms of what is considered
"appropriate data". This suggests that we should not assume
that, just because it is a good thing technically, we should
actually go ahead and do it.
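
To make this concern concrete, the following is a minimal sketch
in Python (no RDF library) of what a range declaration entails
for existing data under RDFS semantics. The class URI
http://example.org/classes/Agent and the sample description are
hypothetical, used only for illustration; no such class URI has
been assigned, and the range of dc:creator is itself an open
question. The point shown is that a declared domain or range
does not reject data. Instead, it licenses the inference that
subjects and values are members of the declared classes.

```python
# Minimal sketch of RDFS domain/range entailment (rules rdfs2 and rdfs3).
# Assumption: the Agent class URI and the sample data are hypothetical.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
EX_AGENT = "http://example.org/classes/Agent"  # hypothetical class URI


def infer_types(schema, data):
    """Return the rdf:type triples entailed by domain and range declarations."""
    domains = {(p, c) for (p, q, c) in schema if q == RDFS_DOMAIN}
    ranges = {(p, c) for (p, q, c) in schema if q == RDFS_RANGE}
    inferred = set()
    for (s, p, o) in data:
        for (prop, cls) in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))  # subject is in the domain
        for (prop, cls) in ranges:
            if p == prop:
                inferred.add((o, RDF_TYPE, cls))  # value is in the range
    return inferred


# One hypothetical range declaration and one hypothetical legacy statement.
schema = {(DC_CREATOR, RDFS_RANGE, EX_AGENT)}
legacy_data = {
    ("http://example.org/doc/1", DC_CREATOR, "http://example.org/people/jane"),
}

entailed = infer_types(schema, legacy_data)
for triple in sorted(entailed):
    print(triple)
```

By the same rule, a legacy description whose dc:creator value is
a plain string would be read as saying that the value denotes a
member of the range class, which is where the question of
"appropriate data" arises.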

Giving definitions to these classes would also involve
deciding on a style for definitions. Ideally, this style
would be consistent between this new Domain-Range Vocabulary
and the existing DCMI Type Vocabulary. Since we happen to be 
finalizing a set of changes to the DCMI Type Vocabulary right
now [8], we have an opportunity to ensure that a common style is
adopted. Possibilities include:

   -- The style currently proposed for the DCMI Type
      Vocabulary [10], e.g.:

      -- Collection: An aggregation of items.
      -- Dataset: Information encoded in a defined structure.
      -- Image: A primarily symbolic visual representation 
         other than text.

   -- DCMI Type Style, Renaud style [9], e.g.:

      -- Collection: A resource which is an aggregation of items.
      -- Dataset: A resource in which data is encoded in a
         defined structure.
      -- Image: A resource which is a visual, non-textual

   -- Domain-Range Vocabulary style [1], e.g.:

      -- DigitalResource: The class of all digital resources.

      -- Collection: The class of all aggregations of items.
      -- Dataset: The class of all data encoded in a defined
         structure.
      -- Image: The class of all visual, non-textual

These alternatives are, of course, not just "stylistic"
changes. Some of them involve making explicit what is
currently implicit in a definition.

Mikael has obtained statistical indexing data from Swoogle
[5], included in the packet as input to this discussion.

It should also be noted that currently, the Web page at
/documents/dcmi-terms/ asserts itself
to be "an up-to-date, authoritative specification of all
metadata terms maintained by DCMI".

DCMI currently makes no such claims for the RDF schema
representation of its terms. Indeed, the only policy statement
on the subject, at /schemas/rdfs/, says
that "users of RDF guidelines and schemas posted on the DCMI
Web site need to be aware that these resources may be subject
to change based on the results of further discussions within
DCMI and W3C" -- a situation that can hopefully be remedied
by the work of the DC RDF Taskforce.

If in addition to the "natural-language" definitions
currently provided in the Web documents, DCMI were also
to provide "definitive" RDF schemas, then DCMI would be
saying, in effect, that its terms are defined not just by
natural-language definitions, but also by the sum of formal
assertions and relations, within which the terms are embedded,
as expressed in the RDF schema.

We would need to consider whether it would be realistic for
DCMI to claim that both the Web document and the RDF schema are
"authoritative" -- raising the bar not only for keeping the
documents in sync, but also for expressing formal assertions
adequately in the Web documents -- or whether one should be
definitive while the other is considered to be derived.
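
One way to picture the "one definitive, one derived" option is a
single term record from which both renderings are generated. The
Python sketch below is an illustration under assumptions, not a
description of DCMI practice: the record layout and the function
names to_text and to_ntriples are hypothetical. The definition
used is the one currently proposed for Collection in the DCMI
Type Vocabulary.

```python
# Sketch: derive a Web-document entry and RDF schema triples from one record.
# Assumption: the record layout and function names are hypothetical.

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

term = {
    "uri": "http://purl.org/dc/dcmitype/Collection",
    "label": "Collection",
    "definition": "An aggregation of items.",
}


def to_text(t):
    """Render the natural-language entry for the Web document."""
    return "\n".join([
        "Label: " + t["label"],
        "URI: " + t["uri"],
        "Definition: " + t["definition"],
    ])


def literal(s):
    """Quote a string as an N-Triples literal (escapes backslash and quote)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_ntriples(t):
    """Render the same record as N-Triples statements for the RDF schema."""
    subject = "<" + t["uri"] + ">"
    return [
        subject + " <" + RDF + "type> <" + RDFS + "Class> .",
        subject + " <" + RDFS + "label> " + literal(t["label"]) + " .",
        subject + " <" + RDFS + "comment> " + literal(t["definition"]) + " .",
    ]


print(to_text(term))
print("\n".join(to_ntriples(term)))
```

Because both outputs come from the same record, they cannot drift
apart. The open question of which representation is definitive
then becomes a question about the status of the source record.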

The finalization of authoritative domain and range declarations
has implications for DCMI process, as they would presumably
be subject to review and maintenance by the Usage Board.

As of the 2006-03-23 telecon, this document now belongs to
the Usage Board.

As one piece of unfinished business in this regard, we
should recall that the existing encoding schemes are still
designated as "encoding schemes" in a generic sense and are
not yet differentiated into "vocabulary encoding schemes"
and "syntax encoding schemes" [6].

ACTION 2006-03-23. Andy: Consider removing the FRBR-related
classes (see discussion in [4]).