
Makx, Stu,

A quick-and-dirty analysis of files ending in ".rdf" (i.e.,
the metadata files) on makes it immediately
clear why searches on the metadata yield such terrible results:

-- A total of 744 items on have metadata.

-- Of these 744 items, 212 are in the snapshot of the Website that
   was archived in February 2001 (see
   The entire tree should
   probably be excluded from indexing -- this step alone would
   probably improve the quality of the metadata search by 30%!

-- Of the remaining 532 items, 323 are historical versions of, and/or
   historical materials for, particular workshops or conferences (see
   These should not be indexed.

-- Of the remaining 209 items, 74 should for various reasons definitely not be
   indexed (see,
   for example:

   -- They are at a level of granularity too fine for indexing
      (e.g., the biographies of BoT members, which can be
      discovered from the page).

   -- They are in the or
      trees, which should not be discoverable through a public
      metadata search of the DCMI Web site.

-- There is a large category of things that should definitely
   not be indexed because they are obsolete or superseded, but
   which nonetheless form part of the DCMI historical record
   and therefore should be discoverable by other means (see
   For at least some of these resources there may not be a
   citation path to public Web pages. I am not sure what
   to do about these. One idea would be to create a page,
   linked to, where one could
   simply link these documents without any sort of explanation
   or maintenance required. Perhaps Harry or Lance could use my
   file as a basis for doing this.

-- This leaves just 79 resources that should be indexed today (see
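
The counts above can be checked with a short tally. This is only a rough
sketch: the category labels are shorthand invented here, and the size of
the "obsolete or superseded" group is inferred by subtraction, since the
message gives no explicit count for it.

```python
# Counts quoted in the message; labels are shorthand, not official names.
TOTAL_WITH_METADATA = 744

EXCLUSIONS = [
    ("archived February 2001 snapshot", 212),
    ("historical workshop/conference material", 323),
    ("too fine-grained or in non-public trees", 74),
]

INDEX_TODAY = 79


def funnel(total, exclusions):
    """Return (label, excluded, remaining) rows after each exclusion step."""
    rows = []
    remaining = total
    for label, count in exclusions:
        remaining -= count
        rows.append((label, count, remaining))
    return rows


rows = funnel(TOTAL_WITH_METADATA, EXCLUSIONS)

# Assumption: whatever is neither excluded above nor kept for indexing
# belongs to the "obsolete or superseded" category.
obsolete_or_superseded = rows[-1][2] - INDEX_TODAY

# Share of all items that sit in the archived snapshot.
snapshot_share = EXCLUSIONS[0][1] / TOTAL_WITH_METADATA

if __name__ == "__main__":
    for label, count, remaining in rows:
        print(f"{label}: -{count}, {remaining} left")
    print(f"obsolete or superseded (inferred): {obsolete_or_superseded}")
    print(f"snapshot share: {snapshot_share:.1%}")
    print(f"indexable share: {INDEX_TODAY / TOTAL_WITH_METADATA:.1%}")
```

The snapshot alone accounts for a little under 30% of the items, which
matches the estimate given for excluding that tree, and only about one
item in ten is left as worth indexing.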

I noted the following problems:

-- does not have a date!
-- is actively unhelpful
-- Is this group still active and is Stu still the chair?
-- Is this still active??
-- is this still active??
-- is this being maintained??
-- why does Admin Core imply it is a "DCMI" document?
-- is this being maintained?
-- is this still being maintained?
-- looks like this is no longer being maintained?
-- looks like this is no longer being maintained
-- is this generated (and updated) automatically?
-- bad link!