Title: "Vocabulary Encoding Scheme Registration" issue
Modified: 2004-03-22 09:41, Monday
Maintainer: Tom Baker
Latest version: http://dublincore.org/usage/meetings/2004/03/ISSUES/registration/
See also: http://dublincore.org/usage/meetings/2004/03/ISSUES/
Description: Evolving summary of the VESR issue from a UB point
                   of view. Past meeting actions are summarized in 
                   Appendix A, followed by a Bibliography.

Shepherd: Traugott Koch

SUMMARY (Tom)

In Bath, we will first discuss the possible implications of
InfoURI for the very idea of setting up a DCMI registry. The
following two documents have therefore been put into the
main meeting packet:

    http://www2.elsevier.co.uk/~tony/info/info.html
    http://dublincore.org/usage/meetings/2004/03/Weibel.InfoURI.Registry.pdf

Only if we conclude that InfoURI will not meet the need we
have long recognized should we proceed to discuss the steps
to be undertaken and the processes by which a DCMI
registration service would function. The main sources for
that discussion are the "Overview of Next Steps" (below)
and the following two documents in the supplementary packet:

    http://www.lub.lu.se/~traugott/drafts/vocab-scheme-Jan04.html
    http://www.lub.lu.se/~traugott/drafts/vocab-guide6.html

------------------------------------------------------------------------
OVERVIEW OF NEXT STEPS
------------------------------------------------------------------------

In Seattle, the UB agreed to proceed with the fast-track system
to "register" controlled vocabularies as Vocabulary Encoding
Schemes (VESes) (see Appendix point A.8). This entails:

1. Finalize the Web tool
   1.1 Work with Harry to improve interface and functionality
2. Put into place the necessary documentation
   2.1 Update "Guidelines for Registration"
   2.2 DCMI namespace policy
       2.2.1 Modify policy, adding "http://purl.org/dc/schemes"
       2.2.2 Duplication of legacy URIs in the new namespace
   2.3 Process for merging of "registry" output into raw UB data
   2.4 Formulate and document a "good-neighbor" policy
   2.5 Reflect the "good-neighbor" policy in documentation and schemas
   2.6 Set up and clarify use of JISCMAIL archive for audit trail
   2.7 Clarify criteria and processes for vetting proposals
3. Approve an initial set of encoding schemes
4. Manage the above tasks and represent the project to the public
    4.1 Create a one-stop Web page for the Registration project
    4.2 Clarify who will actually do what
    4.3 Assume overall project-management responsibility
5. Plan a workshop (out of UB scope per se)

NEXT STEPS IN DETAIL

1. Finalize the Web tool

   1.1 TASK (Traugott): As of 2004-01-04, Traugott has updated
       his summary of development work needed on the Web-based
       registration tool and will follow up with Harry
       and Stu to (hopefully) complete in January-February.
       See http://www.lub.lu.se/~traugott/drafts/vocab-scheme-Jan04.html
       and http://wip.dublincore.org/schemes/index.html.
       (See also Appendix points A.6 and A.14.)

2. Put into place the necessary documentation

   2.1 "Guidelines for registration of Vocabulary Encoding Schemes"

       TASK (Traugott): As of 2004-01-10, Traugott
       has updated [GUIDELINES] and will post
       a draft to DC-USAGE for comment. See
       http://www.lub.lu.se/~traugott/drafts/vocab-guide6.html.

       NOTE: We should look closely at the guidelines
       it provides on forming and proposing a "Name" for
       a vocabulary -- for example, appending a language
       suffix such as "-fr", etcetera. I believe there are
       some unresolved issues here with regard to the use
       of "date-stamped" URIs or of reflecting the numbers
       of specific versions of vocabularies. After talking
       with Traugott, I believe this was the intention behind
       the Seattle Action Item 13 for Traugott to draft "a
       document setting out guidelines for the creation of
       URIs for encoding schemes" (see A.12 below).
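
       NOTE: As a rough sketch only, the naming rule discussed
       above might be expressed as follows. The function name,
       the validation pattern, the scheme name "XYZ", and the
       idea of appending a version after the language suffix
       are all assumptions made for illustration; none of this
       has been agreed, and the namespace is the one proposed
       in 2.2.

```python
import re

# Assumption: the new namespace proposed in section 2.2.
SCHEMES_NAMESPACE = "http://purl.org/dc/schemes/"


def scheme_uri(name, language=None, version=None):
    """Build a candidate URI for a proposed vocabulary "Name".

    The "-fr" style language suffix follows the example in the
    guidelines note; appending a version is only one conceivable
    answer to the open versioning question, not an agreed rule.
    """
    # Hypothetical validation rule: letters and digits, starting
    # with a letter.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", name):
        raise ValueError(f"unsuitable scheme name: {name!r}")
    parts = [name]
    if language:
        parts.append(language.lower())
    if version:
        parts.append(str(version))
    return SCHEMES_NAMESPACE + "-".join(parts)


# "XYZ" is a made-up scheme name.
print(scheme_uri("XYZ", language="fr"))
```

       Keeping the rule in one small function would make it
       easy to change once the questions about date-stamped
       and versioned URIs are settled.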

   2.2 DCMI Namespace Policy

      According to the policy as it currently stands, all new
      Encoding Schemes go into the http://purl.org/dc/terms/
      namespace. However, now that we expect many new encoding
      schemes to be created rather quickly under a new
      fast-track procedure, we need a new namespace for
      encoding schemes (i.e., http://purl.org/dc/schemes/).
      This change and addition to the namespace policy is on
      the critical path -- not just for going into production
      to register encoding schemes but even before we can
      finalize the related texts and guidelines.

      This breaks down into the following tasks:

      2.2.1 TASK (who?): Modify the Namespace Policy
            [NAMESPACE-POLICY] as follows:

            -- add a name for the new namespace;
            -- clarify whether any aspect of the existing policy
               needs to be modified with respect to encoding 
               schemes that are approved according to the fast-track 
               procedure;
            -- shepherd the draft through online DC-USAGE discussion;
            -- liaise with Makx for Directorate approval;
            -- present the results (hopefully completed?) in Bath.

     2.2.2 TASK (who?): Duplication of legacy URIs under
           "http://purl.org/dc/terms" in the new namespace
           "http://purl.org/dc/schemes". This decision
           (and its implications) needs to be implemented
           and documented:

           -- gather from the Directorate, Tom, Roland, and mailing 
              lists any existing documentation of past discussions
              and edit it into a one-page clarification;
           -- include documentation of how the equivalence would
              be declared in the formal RDF term declarations;
           -- liaise with Tom (for the raw UB data in XML) and 
              Harry (for the generated RDF schemas and Web pages) 
              about declaring and documenting the equivalence;
           -- after consulting with the Directorate on process, 
              shepherd the one-pager through list discussion, approval,
              and posting on the Web, ideally in time for the Bath
              meeting.
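
           NOTE: A minimal sketch of how the duplication might
           be generated, assuming the equivalence is written
           out as N-Triples. The use of owl:sameAs is only a
           stand-in (the property to be used in the formal RDF
           term declarations is still to be decided), and the
           function name is hypothetical.

```python
# Namespaces named in section 2.2; the trailing slash is assumed.
LEGACY_NS = "http://purl.org/dc/terms/"
SCHEMES_NS = "http://purl.org/dc/schemes/"

# Assumption: owl:sameAs is a placeholder for whatever property is
# finally chosen to declare the equivalence.
EQUIVALENCE_PROPERTY = "http://www.w3.org/2002/07/owl#sameAs"


def equivalence_triples(local_names):
    """Return one N-Triples line per legacy encoding scheme."""
    lines = []
    for name in sorted(local_names):
        lines.append(
            f"<{SCHEMES_NS}{name}> <{EQUIVALENCE_PROPERTY}> "
            f"<{LEGACY_NS}{name}> ."
        )
    return lines


# A few encoding schemes already declared under the legacy namespace,
# used here purely as examples.
for line in equivalence_triples(["LCSH", "MESH", "DDC"]):
    print(line)
```

           Generating the statements from a plain list of names
           would keep the legacy and new declarations in step
           without hand-editing either.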

   2.3 TASK (Tom): Merging of "registry" output into raw
       UB data (see also A.7).

       According to our current model, information about
       VESes will be recorded in two places:

       -- in the back-end database to the Web tool.
          This database will include administrative information
          -- e.g., who submitted a proposal and when.
          The database will periodically output a listing
          of new VESes in a form that can _automatically_
          be merged into the raw UB data (currently in XML).

          NOTE: This functionality is on the critical path
          to using the Web tool -- if we cannot merge data
          from the Web tool into the system of XSLT scripts
          currently used to generate updated Web pages
          and RDF schemas of the DCMI terms automatically,
          I do not currently see a way to manage updates to
          our documentation with a reasonable and sustainable
          level of effort. If we cannot automate the workflow
          from registration through to final publication so as
          to sustain a reasonable and efficient throughput,
          we should not really embark on this adventure to
          begin with.

       -- in the formal RDF term declarations and related
          Web pages generated from the raw UB data (which,
          in turn, is automatically generated from the
          database above).

       This entails the following:

       -- liaison with Traugott to clarify whether any
          attributes need to be exported beyond those already
          used to describe existing encoding schemes;
       -- decide in discussion on DC-USAGE or in Bath at what
          frequency terms documents should be updated to show
          new VESes (this entails liaison with maintainers of 
          the DCMI Registry about the availability of new terms
          in the registry database);
       -- liaise with Harry and the Web Team to clarify how
          often and by what workflow descriptions of VESes
          will be exported from the Web-tool database and
          incorporated into the raw UB term data;
       -- verify that the workflow is completed and functions
          as intended for generating updated term documentation;
       -- update the "schema" of attributes used to describe 
          encoding schemes.
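
       NOTE: The merge step might look roughly like the
       following sketch. The element and attribute names, the
       sample records, and the function name are all invented
       for illustration; the actual export format of the Web
       tool and the layout of the raw UB data still need to be
       clarified as described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical formats: these element and attribute names are
# assumptions; the real raw UB data may look quite different.
RAW_UB_DATA = """<terms>
  <term name="LCSH" type="encoding-scheme"/>
</terms>"""

REGISTRY_EXPORT = """<export>
  <scheme name="LCSH" submitter="a@example.org" submitted="2004-02-01"/>
  <scheme name="XYZ" submitter="b@example.org" submitted="2004-02-03"/>
</export>"""


def merge_export(raw_xml, export_xml):
    """Append exported schemes that are not yet in the raw term data.

    Administrative attributes (submitter, date) stay behind in the
    Web-tool database; only the scheme name is carried over here.
    """
    terms = ET.fromstring(raw_xml)
    known = {term.get("name") for term in terms.findall("term")}
    added = []
    for scheme in ET.fromstring(export_xml).findall("scheme"):
        name = scheme.get("name")
        if name in known:
            continue  # already present, so re-running adds nothing
        ET.SubElement(terms, "term", name=name, type="encoding-scheme")
        known.add(name)
        added.append(name)
    return terms, added


merged, added = merge_export(RAW_UB_DATA, REGISTRY_EXPORT)
print(added)
print(ET.tostring(merged, encoding="unicode"))
```

       Because schemes already present are skipped, the merge
       could be run at whatever frequency is agreed without
       creating duplicate entries.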

   2.4 TASK (who???): formulate a policy for pointing to
       non-DCMI URIs created for vocabularies to which DCMI
       URIs have already been assigned (sometimes called a
       "good-neighbor" policy) (see also A.11). This entails
       the following:

        -- describing a "DCMI philosophy" (or etiquette)
           for pointing to non-DCMI URIs (e.g., do we 
           "recommend" one over the other or simply point?);
        -- clarifying exactly where the non-DCMI URI will be
           recorded, and how that URI will be reflected in the
           DCMI Registry and exported for merging into the 
           raw UB data used for generating RDF schemas and 
           Web pages (see also 2.3).

    2.5 TASK (who???): clarify and document exactly how the 
        "good-neighbor" policy (2.4) will be reflected in the 
        RDF schema and in the terms Web pages. (See also A.4
        and A.5). This entails:

        -- clarifying with Roland -- who I believe had a solution
           for this that was discussed and for which notes exist
           somewhere... -- exactly how the cross-reference would
           be expressed in RDF;
        -- clarifying with Tom exactly what relevant field or fields
           will be automatically exported as a basis for generating
           Web pages and RDF schemas;
        -- clarifying with Tom exactly how that additional information
           should appear in the Web documents;
        -- liaising with Harry to ensure that the additional RDF
           assertions will be generated from the raw UB data.

    2.6 TASK (Traugott): set up a JISCMAIL list to use as an
        archive of (all?) actions taken in the fast-track
        procedure (see also A.3 below). This entails:

        -- clarifying exactly who needs to do what to verify
           and evaluate a submission, what follow-up actions
           they need to take, what needs to be documented
           and where;
        -- providing a user-friendly list of actions and
           responsibilities for inclusion in the DCMI Usage
           Board process [UB-PROCESS] or in another appropriate
           document.

    2.7 TASK (Diane and Stuart?): Documentation of process
        for reviewers of fast-track proposals. This should
        expand on Section 5 of [UB-PROCESS], clarifying
        exactly who is expected to do what in order to reach a
        fast-track decision about a proposed VES. This could
        take a list of such actions and responsibilities
        from 2.6. (See also point A.1 below.)

3. Approve an initial set of encoding schemes

   When all of the above is in place, we somehow need to move
   this forward to the actual creation of VESes, especially
   for the initial set of known important vocabularies in
   the pipeline (see also point A.10).

   Note that the restriction that proposals be accepted only
   from the owners and maintainers of vocabularies is slightly
   at odds with the notion that we would begin with "known
   important vocabularies" that have long been in the pipeline.
   In other words, we should decide whether to go ahead with
   the registration of some vocabularies on our own initiative,
   and if so, who will take that initiative and to what extent
   those volunteers will be expected to obtain permission
   from the owners/maintainers of the vocabularies in question.

   Deciding how to proceed on this would be the responsibility
   of the overall project manager for Vocabulary Registration,
   in consultation with the Usage Board.

4. Manage the above tasks and represent the project to the public

    4.1 TASK (Traugott?): Create a Web page describing DCMI's
        project for registering VESes along with a one-stop
        annotated set of pointers to all relevant resources
        (such as [GUIDELINES], [UB-PROCESS],
        [NAMESPACE-POLICY], DCMI terms documentation, and the
        DCMI Registry). This entails:

        -- taking as input from 2.4 a "good-neighbor policy";
        -- explaining overall philosophy, policy, and
           intentions (perhaps this should be where we explain
           that in the first instance, registration will be
           on the initiative of scheme owners -- i.e., the
           maintainer of the vocabulary in question does the
           registering by proposing an acronym for use in a
           DCMI-maintained URI and optionally supplying an
           owner-maintained URI for the same);
        -- creating short versions of the above for posting
           as announcements to DC-GENERAL.

        Such explanatory text could be folded into the start
        page for the Web tool [WEB-TOOL], which currently
        provides guidance on using the tool; this would
        require coordination with Harry on editing a single
        Web page. Or the information could be split out into
        a separate document -- whatever seems friendliest
        for users. Either way, we should ensure that all
        Web pages of relevance to the project of registering
        encoding schemes be fully cross-referenced with all
        of the other relevant Web pages.

    4.2 TASK (Traugott?): Clarify who will actually do
        what. This task involves defining what needs to
        be done (see also 2.7), but also who is going to
        actually do it and the process for managing the
        people who are doing it. Note that there has been
        considerable discussion between Ithaca and Seattle
        of a "reasonable" role for Usage Board members in
        processing proposals. However, experience in last
        year's trial run and with AskDCMI this year strongly
        suggests that asking UB members to claim proposals
        to vet will be problematic.

    4.3 TASK (Traugott): In Seattle, Traugott volunteered to
        assume overall responsibility for the Registration
        issue. From my point of view, I see this as involving,
        among other things:

        -- coordinating and motivating the people who will
           check, evaluate, and approve the proposals for VESes;
        -- making announcements to DC-GENERAL;
        -- coordinating with Tom about responsibilities to avoid 
           unnecessary redundancies in tracking issues.

5. Plan a workshop

    Note: As of December 2003, Tom, Traugott, Stu Weibel, Diane,
    and Stuart Sutton are discussing the possibility of holding
    a workshop to coordinate better between DCMI and other
    communities interested in the general problem of identifying
    and citing controlled vocabularies. While it is related to the
    process of Vocabulary Encoding Scheme Registration discussed
    here, the workshop issue is not further covered below. Note
    that this venue would be the appropriate place to discuss a
    possible use of IETF's InfoURI (Appendix point A.9).

------------------------------------------------------------------------
APPENDIX A: Recent decisions and action items
------------------------------------------------------------------------

2003-06-17: Ithaca meeting
2003-09-28: Seattle meeting

A.1 ITHACA ACTION ITEM (Diane and Stuart): Make necessary updates
    to the UB Process document.

A.2 ITHACA ACTION ITEM (Tom): Ask Directorate to advise UB on
    their position in regard to the legal issues surrounding
    encoding scheme registration. [Note: this has been done, and
    the opinion is that we can go ahead as long as we articulate
    our policies clearly.]

A.3 ITHACA ACTION ITEM (Traugott): In the interest of
    maintaining an audit trail, all emails between UB/DCMI
    and the scheme owner/maintainer are to be sent to a closed
    JISCMAIL DC list for permanent retention; Traugott to ask
    Paul Miller.

A.4 ITHACA DECISION (according to Traugott): DCMI assigns a URI
    and lists URIs created by vocabulary owners. If this is the
    case, a "same as" relationship between the two is declared.

A.5 ITHACA ACTION ITEM (Tom and Diane): Draft new document that
    explains such things as 'good neighbour' policy, what the
    process involves, the aims of the registration service,
    registration help, etc.

A.6 ITHACA ACTION ITEM (Traugott): List of priorities for
    enhancements/changes to the scheme registration tool to be
    submitted to Makx.

A.7 ITHACA ACTION ITEM (Tom): Document the XML output formats
    that are wanted from the registration tool.

A.8 SEATTLE DECISION: UB agrees that it must proceed with
    Vocabulary Encoding Scheme registration.

A.9 SEATTLE DECISION: UB will consider adopting IETF's InfoURI
    if and when this is finalized.

A.10 SEATTLE DECISION: The UB is to aim for implementation of
     the registry by January or February 2004.

A.11 SEATTLE DECISION: Where schemes have existing URIs, such
     schemes should be registered at the request of implementers.

A.12 SEATTLE ACTION ITEM 13 (Traugott): Draft a document setting
     out guidelines for the creation of URIs for encoding
     schemes.

A.13 SEATTLE ACTION ITEM 14 (Tom): Tom to remind Stu to seek advice
     from OCLC lawyers regarding legal issues surrounding encoding
     scheme registration.

A.14 SEATTLE ACTION ITEM 15 (Traugott): Questions to be posed to
     Harry, as part of request for updates and changes, regarding
     authentication and whether existing authentication facility
     is robust enough.

------------------------------------------------------------------------
BIBLIOGRAPHY
------------------------------------------------------------------------

[GUIDELINES] Guidelines for Vocabulary and Encoding Scheme Qualifiers,
   http://dublincore.org/usage/documents/vocabulary-guidelines/.

[NAMESPACE-POLICY] DCMI Namespace Policy,
   http://dublincore.org/documents/dcmi-namespace/.
       
[DCMI-PRINCIPLES] DCMI Grammatical Principles,
   http://dublincore.org/usage/documents/principles/.

[UB-PROCESS] DCMI Usage Board Process,
   http://dublincore.org/usage/documents/process/.

[WEB-TOOL] Vocabulary Scheme Registration [a Web-based tool],
   http://wip.dublincore.org/schemes/index.html.