A quick-and-dirty analysis of files ending in ".rdf" (i.e.,
the metadata files) on http://dublincore.org makes it immediately
clear why searches on the metadata yield such terrible results:
-- A total of 744 items on http://dublincore.org/ have metadata.
-- Of these 744 items, 212 are in the snapshot of the dublincore.org Web site that
was archived in February 2001 (see
The entire tree http://dublincore.org/archives/ should
probably be excluded from indexing. This step alone would remove
nearly 30% of the items (212 of 744) from the metadata search results!
-- Of the remaining 532 items, 323 are historical versions of documents and/or
historical materials for particular workshops or conferences (see
These should not be indexed.
-- Of the remaining 209 items, 74 should definitely not be indexed, for various
reasons (see http://dublincore.org/usage/meetings/2004/03/ISSUES/WEBSITE/no.html):
-- They are at a level of granularity too fine for indexing
(e.g., the biographies of BoT members, which can be
discovered from the http://dublincore.org/about/ page).
-- They are in the http://dublincore.org/advisoryboard/ or
trees, which should not be discoverable through a public
metadata search of the DCMI Web site.
-- There is a large category of things that should definitely
not be indexed because they are obsolete or superseded, but
which nonetheless form part of the DCMI historical record
and therefore should be discoverable by other means (see
For at least some of these resources there may not be a
citation path to public Web pages. I am not sure what
to do about these. One idea would be to create a page
linked to http://dublincore.org/archives/, where one could
simply link these documents without any sort of explanation
or maintenance required. Perhaps Harry or Lance could use my
file as a basis for doing this.
-- This leaves just 79 resources that should be indexed today (see
I noted the following problems:
http://dublincore.org/documents/dcmi-ieee-mou/ -- does not have a date!
http://dublincore.org/documents/dcmi-structure/ -- is actively unhelpful
http://dublincore.org/groups/agents/ -- is this group still active, and is Stu still the chair?
http://dublincore.org/groups/biz/ -- is this still active?
http://dublincore.org/groups/kernel/ -- is this still active?
http://dublincore.org/resources/bibliography/ -- is this being maintained?
http://dublincore.org/links/ -- why does Admin Core imply it is a "DCMI" document?
http://dublincore.org/meetings/ -- is this being maintained?
http://dublincore.org/news/communications/deliverables.shtml -- is this still being maintained?
http://dublincore.org/news/newsletter/ -- looks like this is no longer being maintained?
http://dublincore.org/news/projects.shtml -- looks like this is no longer being maintained
http://dublincore.org/sitemap.shtml -- is this generated (and updated) automatically?
http://dublincore.org/templates/rdf/example.shtml -- bad link!
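
The exclusion rules described above could be applied mechanically before
indexing. The following is a minimal sketch in Python, not a description of
how the DCMI metadata search is actually built. It assumes the metadata URLs
are available as a plain list. The function names (is_indexable, partition)
and the sample URLs are made up for illustration, and only the two excluded
trees named in this note are listed. The closing lines simply re-check the
arithmetic on the counts quoted above.

```python
from urllib.parse import urlparse

# Trees named in the note as unsuitable for public indexing.
# Further rules (historical versions, obsolete documents) would have
# to be added by hand; they are not guessed here.
EXCLUDED_PREFIXES = (
    "/archives/",       # snapshot of the site archived in February 2001
    "/advisoryboard/",  # not meant to be publicly discoverable
)


def is_indexable(url: str) -> bool:
    """Return True unless the URL path falls under an excluded tree."""
    return not urlparse(url).path.startswith(EXCLUDED_PREFIXES)


def partition(urls):
    """Split URLs into (keep, skip) lists, preserving input order."""
    keep, skip = [], []
    for url in urls:
        (keep if is_indexable(url) else skip).append(url)
    return keep, skip


# Counts quoted in the note, used only to re-check the arithmetic.
TOTAL, ARCHIVED, HISTORICAL = 744, 212, 323
after_archive = TOTAL - ARCHIVED                # items outside /archives/
after_historical = after_archive - HISTORICAL   # items left after historical ones
archive_share = round(100 * ARCHIVED / TOTAL)   # archived share, in percent

if __name__ == "__main__":
    # Hypothetical sample; the file names are illustrative only.
    sample = [
        "http://dublincore.org/archives/2001/02/index.rdf",
        "http://dublincore.org/advisoryboard/index.rdf",
        "http://dublincore.org/documents/dcmi-ieee-mou/index.rdf",
    ]
    keep, skip = partition(sample)
    print(len(keep), "to index;", len(skip), "excluded")
```

A plain prefix list keeps the rules easy to review and edit by hand, which
probably matters more here than speed, given that only a few hundred items
are involved.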