Review of the SWAP Description Set Profile

Description Templates for the entities of the Domain Model

In draft DC-DSP spec                       | Caption in SWAP
5.5. Resource Class Membership Constraint  | See explanation below
5.1. Identifier                            | Description

According to the review criteria, "the header or introduction of a Description Template" should provide one piece of mandatory information: "the class (or classes) of which resources described in this description may be an instance".

In the Scholarly Works Application Profile, the description templates are clearly marked with section headings such as "Description of the eprint as a Scholarly Work". As explained in the section "Entity typing" (near the end of the document), the Descriptions are explicitly "typed" using dc:type statements with one of the value URIs taken from the Eprints Entity Type Vocabulary Encoding Scheme:

This constraint corresponds to "5.5. Resource Class Membership Constraint" in the draft "Description Set Profiles" specification. In the XML expression of SWAP, the constraint is expressed with the XML element "ResourceClass".

The constraint "5.1. Identifier" -- "A string that can be used in a Value Constraint to reference a description template that applies to the value resource." -- is used in several statement templates. For example, in the description template for the property dc:creator, the Identifier constraint is labelled "Description: agent". The nature and function of this constraint is not clear unless one consults the XML expression of SWAP, where the "Description" constraint is expressed with the XML element "descriptionTemplateID", which is itself not explicitly defined in the draft DSP specification.

Statement Templates within a Description Template

In draft DC-DSP spec                | Caption in SWAP
6.1. Minimum occurrence constraint  | Min occurrence
6.2. Maximum occurrence constraint  | Max occurrence
6.3. Type constraint                | Literal?
6.4.1. Property List Constraint     | Property

The two mandatory constraints (6.3. Type Constraint and 6.4.1. Property List Constraint) are provided in all cases. In some cases, the minimum and maximum number of times that the given kind of Statement may appear in the enclosing Description are also provided.
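
As a rough illustration of what the occurrence constraints express, the following minimal sketch counts statements per property and reports violations. It assumes a description is reduced to a list of property URIs (one per statement) and that each statement template names a single property; the names StatementTemplate and check_occurrences, and the occurrence values used for dc:title, are hypothetical and are not taken from the DSP draft or from SWAP.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StatementTemplate:
    """Hypothetical, simplified stand-in for a DSP Statement Template."""
    property_uri: str                 # single-member Property List Constraint
    min_occurs: int = 0               # 6.1. Minimum occurrence constraint
    max_occurs: Optional[int] = None  # 6.2. Maximum occurrence; None = unbounded


def check_occurrences(statement_properties: List[str],
                      templates: List[StatementTemplate]) -> List[str]:
    """Return one message per violated occurrence constraint.

    Statements whose property matches no template are ignored here.
    """
    counts = Counter(statement_properties)
    problems = []
    for st in templates:
        n = counts[st.property_uri]
        if n < st.min_occurs:
            problems.append(f"{st.property_uri}: {n} < min {st.min_occurs}")
        if st.max_occurs is not None and n > st.max_occurs:
            problems.append(f"{st.property_uri}: {n} > max {st.max_occurs}")
    return problems


# Assumed example values: exactly one dc:title per description.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
templates = [StatementTemplate(DC_TITLE, min_occurs=1, max_occurs=1)]
```

Calling check_occurrences([DC_TITLE, DC_TITLE], templates) would report that the maximum is exceeded, while a description with a single dc:title statement yields an empty list.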

The following property constraints are given:

http://purl.org/dc/elements/1.1/type
http://purl.org/dc/elements/1.1/title
http://purl.org/dc/elements/1.1/subject
http://purl.org/dc/terms/abstract
http://purl.org/dc/elements/1.1/identifier
http://purl.org/dc/elements/1.1/creator
http://www.loc.gov/loc.terms/relators/FND
http://purl.org/eprint/terms/grantNumber
http://www.loc.gov/loc.terms/relators/THS
http://purl.org/eprint/terms/affiliatedInstitution
http://purl.org/eprint/terms/hasAdaptation
http://purl.org/eprint/terms/isExpressedAs
http://purl.org/dc/elements/1.1/type
http://purl.org/dc/elements/1.1/title
http://purl.org/dc/elements/1.1/description
http://purl.org/dc/elements/1.1/identifier
http://purl.org/dc/terms/available
http://purl.org/eprint/terms/status
http://purl.org/eprint/terms/version
http://purl.org/dc/elements/1.1/language
http://purl.org/dc/elements/1.1/type
http://purl.org/eprint/terms/copyrightHolder
http://purl.org/dc/terms/hasVersion
http://purl.org/eprint/terms/hasTranslation
http://purl.org/dc/terms/bibliographicCitation
http://purl.org/dc/terms/references
http://www.loc.gov/loc.terms/relators/EDT
http://purl.org/eprint/terms/isManifestedAs
http://purl.org/dc/elements/1.1/type
http://purl.org/dc/elements/1.1/format
http://purl.org/dc/terms/modified
http://purl.org/dc/elements/1.1/publisher
http://purl.org/eprint/terms/isAvailableAs
http://purl.org/dc/elements/1.1/type
http://purl.org/dc/terms/accessRights
http://purl.org/dc/terms/license
http://purl.org/dc/terms/available
http://purl.org/dc/terms/isPartOf
http://purl.org/dc/elements/1.1/type
http://xmlns.com/foaf/0.1/name
http://xmlns.com/foaf/0.1/family_name
http://xmlns.com/foaf/0.1/givenname
http://xmlns.com/foaf/0.1/workplaceHomepage
http://xmlns.com/foaf/0.1/mbox
http://xmlns.com/foaf/0.1/homepage

Statement Templates defining Literal Value Constraints

In draft DC-DSP spec                | Caption in SWAP
6.5.2. Literal language constraint  | Language constraint - Occurrence
6.5.4. SES constraint               | SES constraint - Occurrence
6.5.5. SES list constraint          | SES constraint - Choose from

Three of the five optional constraints defined for Literal Value Surrogates are used -- correctly and consistently, as far as the reviewer can see.

Statement Templates defining Non-Literal Value Constraints

In draft DC-DSP spec                    | Caption in SWAP
6.6.3.1. Value URI constraint           | Value URI constraint - Occurrence
6.6.3.2. Value URI List Constraint      | Value URI constraint - Choose from
6.6.4.2. VES list constraint            | VES constraint - Choose from
6.6.4.1. VES occurrence constraint      | VES constraint - Occurrence
6.6.5.2. Maximum occurrence constraint  | Value string constraint - Max occur

Five of the possible constraints defined for Non-Literal Value Surrogates are used -- correctly and consistently, as far as the reviewer can see.

Comments

----------------------------------------------------------------------
2008-09-11 Pete comments

> I have written up my review results on a new wiki page [1].
> Please note in particular the comments at the end.  Please
> feel free to fold your points directly into this wiki page.

On description templates,

===
The constraint "5.1. Identifier" -- "A string that can be used in a
Value Constraint to reference a description template that applies to the
value resource." -- is used in several statement templates.
===

I think this needs to be a bit clearer.

5.1 applies to Description Templates - and isn't itself really a
constraint, I don't think: providing a DT identifier in a DSP doesn't in
itself provide any sort of constraint on a description set; it's the
subsequent reference to this identifier in a constraint within a
Statement Template which creates the constraint.

In the Statement Template, the constraint being used is 6.6.1
Description template reference (and this is a constraint). So I think
this text needs to be clearer what is being referred to, and if it's the
latter, then it belongs in the discussion of Statement Templates.
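
To make the distinction concrete, here is a minimal sketch, assuming a DSP is reduced to plain Python records: the 5.1 identifier is only a label on a Description Template, and it is the 6.6.1 reference held by a Statement Template that does the constraining. The class names, the function unresolved_references, and the identifier "scholarlyWork" are hypothetical; only "agent" and the property URIs come from the review above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StatementTemplate:
    property_uri: str
    # 6.6.1 Description template reference, if the ST has one
    description_template_ref: Optional[str] = None


@dataclass
class DescriptionTemplate:
    identifier: str  # 5.1 Identifier: a label only, not a constraint by itself
    statement_templates: List[StatementTemplate] = field(default_factory=list)


def unresolved_references(profile: List[DescriptionTemplate]) -> List[str]:
    """List 6.6.1 references that match no 5.1 identifier in the profile."""
    known = {dt.identifier for dt in profile}
    return [
        f"{dt.identifier} / {st.property_uri} -> {st.description_template_ref}"
        for dt in profile
        for st in dt.statement_templates
        if st.description_template_ref is not None
        and st.description_template_ref not in known
    ]


# Hypothetical two-template profile: the dc:creator statement template
# points at the description template identified as "agent".
profile = [
    DescriptionTemplate("scholarlyWork", [
        StatementTemplate("http://purl.org/dc/elements/1.1/creator", "agent"),
    ]),
    DescriptionTemplate("agent", [
        StatementTemplate("http://xmlns.com/foaf/0.1/name"),
    ]),
]
```

Removing the "agent" Description Template from this profile would leave the dc:creator reference dangling, which is what unresolved_references is meant to surface.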

===
For example, in the description template for the property dc:creator,
the Identifier constraint is labelled "Description: agent".
===

This is definitely 6.6.1 Description template reference.

===
The nature and function of this constraint is not clear unless one
consults the XML expression of SWAP, where the "Description"
constraint is expressed with the XML element "descriptionTemplateID",
which is itself not explicitly defined in the draft DSP
specification.
===

OK, I know we can only review what is in front of us, but to be fair on
the authors, this XML error is an error in the Wiki macro/plugin thing
(which generates the XML).

It's also the result of the Wiki macro/plugin behaviour that - AFAICT -
none of the Description Template data is made visible in the
human-readable text.

I think we kinda need to find some way of acknowledging that in this
case some of the presentation of the document is out of the control of
the author and is determined by the Wiki plugin.

And the Wiki plugin might be improved to e.g. HTML hyperlink a 6.6.1
Description template reference to the referenced DT.

On statement templates....

As I think I mentioned on the telecon, I think the current UB criterion
that a "6.4.1. Property List Constraint" is mandatory is too strong. A
DSP can provide a "6.4.2. Sub-property constraint" (and I'm probably
going to do that in one I'm working on now).

My main point is on

===
The property templates for "Type" (under "Expression") and "Licence" say
that "recommended best practice is to provide a value URI for a class
from the Eprints Type Vocabulary Encoding Scheme". However, the formal
Value URI constraints force the user to choose one of the defined types,
and Value Strings are disallowed. In this case, the formal constraints
are stronger than the constraints expressed in natural language.
===

I think there is a more fundamental problem with the STs which reference
the dc:type property in the Expression DT. There are two STs
(Expression/Entity Type and Expression/Type) with the same property list
constraint (list of one member, dc:type).

The first ST (Expression/Entity Type) says a statement must use a value
URI and that URI must be http://purl.org/eprint/entityType/Expression/
(which actually shouldn't have a terminal slash, but that isn't the
point I'm making!).

The second ST (Expression/Type) says a statement must use a value URI
and that URI must be one of a list (not including the URI above)

But this creates a problem for the matching algorithm. I'm fairly sure
the intent is that after matching up the description template, then it
uses the property constraint to select a statement template. And there
must be a match on exactly one statement template i.e. the DSP draft
says

===
Binding of statements to statement templates
    For each description, each statement is bound to a Statement
Template in the corresponding Description Template by evaluating the
Property Constraint. Each statement must be bound to exactly one
===

But here, within one DT, there are two different STs using the _same_
property list constraint (list of one member, dc:type). So in a
description of an Expression, any statement using dc:type is going to
match up on two STs, which I don't think is permitted.
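
A minimal sketch of the check this implies, assuming binding is decided purely by the property list constraint and that a Description Template is reduced to (label, property list) pairs. The function name ambiguous_properties and the "Expression/Title" entry are hypothetical; the two dc:type entries mirror the STs discussed above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

DC_TYPE = "http://purl.org/dc/elements/1.1/type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Simplified Expression DT: each ST is (label, property list constraint).
expression_dt: List[Tuple[str, List[str]]] = [
    ("Expression/Entity Type", [DC_TYPE]),
    ("Expression/Type", [DC_TYPE]),
    ("Expression/Title", [DC_TITLE]),  # hypothetical label, for contrast
]


def ambiguous_properties(
    statement_templates: List[Tuple[str, List[str]]]
) -> Dict[str, List[str]]:
    """Return the properties claimed by more than one ST, with the ST labels.

    A statement using such a property could be bound to several STs,
    which conflicts with "bound to exactly one".
    """
    claimed: Dict[str, List[str]] = defaultdict(list)
    for label, properties in statement_templates:
        for prop in properties:
            claimed[prop].append(label)
    return {prop: labels for prop, labels in claimed.items() if len(labels) > 1}
```

Run over expression_dt, this flags dc:type as claimed by both "Expression/Entity Type" and "Expression/Type", and leaves dc:title alone.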

To be sure, we probably need to check again with Mikael how the matching
algorithm is supposed to work, but I think I'm right in saying there's
an issue here, because we discussed it for the case of "how do we allow
dc:subject with a literal "tag" value and with a specified VES?"

(The solution is to use two different properties e.g. dc:type in one
case and rdf:type in the other, or coin a new subproperty.)

And there prob needs to be another point in the criteria to say "Are the
STs and DTs defined so as to be matchable?" (in a better form of words
than that!)

----------------------------------------------------------------------
2008-09-11 Joe comments

Do the constraints presented in the Description Templates and
Statement Templates reflect the content of the domain model?

Is Creator an agent?  In other words how does it relate to
agent in the model?

Is an Editor a type of Creator?  Is Editor an agent?

Status says that when used a Value is mandatory.  Why is it
"recommended best practice" if it is required?

Are the constraints presented in the Statement Templates
consistent with the definition of property provided by
its owner?

In the "Eprint-specific recommendation" for Creator you state
that implementers are to provide name or URI and/or a link to a
related description - is this consistent with the definition?
Is a related description the Creator?  Should it be "and if
available a link to a related description about the author"

Property constraint declared with a literal or non-literal range?
Did not check

For "annotations": Is the recommended use of the term consistent
with the definition provided by the term owner? See question
about Creator.

For "annotations": Is the usage of these terms in the
description set profile consistent with the declared semantics?
See comment on related description

----------------------------------------------------------------------
2008-09-15 Pete

I'm looking at SWAP for other reasons, and just noticed one thing which
I think is an error...

In the Expression DT, there's an ST for bibliographic citation, which is
specified to take literal values, which is consistent with the range of
the property.

But the DC-Text example provided has a single statement with a literal
value surrogate with two value strings, one plain text, one an XML
literal. Which you can't do with a literal value surrogate. It should
use repeated statements each with a single value string.
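
The suggested fix can be sketched as follows, assuming a statement is modelled as a small dict and a literal value is just its value string. The function name split_literal_statement and the sample citation strings are hypothetical placeholders, not the text of the SWAP example.

```python
from typing import Dict, List

BIB_CITATION = "http://purl.org/dc/terms/bibliographicCitation"


def split_literal_statement(property_uri: str,
                            value_strings: List[str]) -> List[Dict[str, str]]:
    """Turn one statement carrying several value strings into repeated
    statements, each with a single literal value string."""
    return [{"property": property_uri, "literal": vs} for vs in value_strings]


# Placeholder values standing in for the plain-text and XML-literal forms.
statements = split_literal_statement(
    BIB_CITATION,
    ["Example citation, plain text", "<cite>Example citation</cite>"],
)
```

Each resulting statement has the same property and exactly one value string, which is the shape a literal value surrogate allows.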

I think this is a result of SWAP being designed on the basis of old DCAM
and then partly updated to new DCAM and various things being missed.