The 4th Dublin Core Workshop

(DC Down Under) 3-5 March 1997, Canberra, Australia

Notes from UK participants

Rachel Heery, Paul Miller, Tony Gill, Dave Beckett

These are our personal views of the workshop and its outcomes and are in no way meant to represent or pre-empt the official workshop report, which will be drawn up by Stu Weibel (OCLC). Discussion of the workshop and information on the official report will be on the meta2 mailing list.

1. Overview

Main items on agenda:

Extensibility

Qualifiers and element sub-structure

Element set refinements: coverage, relation, rights

Related developments: PICS

Attendees:

22 North Americans

20 Australians

16 Europeans

7 Asians

Made up of:

25 Librarians

25 Networking

15 Content specialists

2. Extensibility

2.1 Presentation on extensibility

by John Kunze and Carl Lagoze

Goals for DC:

Enables higher precision indexing than full text
Easy to create
Simple to index
Interoperability
Extensibility

Extensibility means using extra semantics with existing element sets. The motivation is that local communities need extra elements, but these communities should be able to build on the DC effort rather than be forced to re-invent a record structure. They should be able to add extensions of local significance.

Sets of elements are defined by:

data structure

vocabulary

vocabulary plus rules

Sets will interact and different sets will evolve separately. Issues include:

overlap

inheritance

conflict

Two models for extensibility were identified:

containment: DC exists as part of a wider element set

co-existing element sets: DC exists as a separate set

Three kinds of extensions were identified:

local elements (create new elements for local use)

qualifiers (use qualifier mechanism)

alternate element definitions (have alternate definitions for the same element)

2.2 Discussion of extensibility

Required for:

experiments and local extensions
complementary data content standards (e.g. rights management, archival)
richer data content standards (e.g. MARC, FGDC)

2.3 Agreement

Extensions should be allowed, following the W3C agreed syntax. So if, for example, Harvard wanted to introduce new elements they would be called HVD.element name, and any UK-specific names would be UK.element name.
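The prefix convention can be sketched in a few lines of Python. This is only an illustration of the naming idea, assuming that the prefix is everything before the first dot; the helper names and the example element names (HVD.curator, UK.region) are hypothetical and were not discussed at the workshop.

```python
# Hypothetical helpers illustrating prefixed element names such as
# "DC.title" or "HVD.curator". The element names used here are invented.
DC_PREFIX = "DC"


def split_element_name(name):
    """Split 'PREFIX.element' at the first dot.

    Names without a dot are returned with a prefix of None.
    """
    prefix, sep, element = name.partition(".")
    if not sep:
        return None, name
    return prefix, element


def is_local_extension(name):
    """True when the name carries a prefix other than DC."""
    prefix, _ = split_element_name(name)
    return prefix is not None and prefix.upper() != DC_PREFIX
```

An indexer that only understands DC could use a check like is_local_extension to skip, rather than reject, elements added by a local community.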

3. Qualifiers and element sub-structure

3.1 Break out groups

Aim was to identify major issues of contention in areas of element sub-structure. List of issues included:

Should there be a core set of qualifiers with further community specific ones?
Should there be a central registry or distributed registries (how will communities control their local elements/qualifiers?)
How much control is possible/desirable?
How to discourage proliferation?
What will be the syntax for identifying community wide set of elements/qualifiers?

3.2 Discussion of qualifiers and element sub-structures

It is important to distinguish the two sorts of qualifier:

scheme=name of scheme, and

type= refinement of element definition

There was general agreement that using qualifiers (in particular type=) might all too easily result in the creation of new elements. Although local practice might need new elements, there is a clear tension between use of qualifiers and interoperability. It was accepted that qualifiers had been introduced to reach consensus, but that we need to be upfront about the detrimental effect on interoperability. One of the main divisions among participants was between the minimalists, who wanted no qualifiers, and the 'extenders', who were happy to use qualifiers. There were not many implementors among the minimalists, and it was accepted that people would use some qualifiers in real life.

There was a suggestion that sub-elements be permitted only for data upon which a researcher "may wish to search", such as date created or date modified. Telephone numbers and the like are not normally searched upon and would be better dealt with as non-DC elements (extensions).

3.3 Agreements on qualifiers and element sub-structures

There was general agreement that the DC record must be meaningful without inclusion of qualifiers. This principle would seem in general to permit use of scheme= qualification, as this helps consistency and the automated interpretation of values. For example, if one had the following

Subject scheme=DDC content=310.1

then one would interpret the numbers in this field as some sort of classification even without the qualifier, although one would not be able to say which scheme. Similarly in the case of

Subject scheme=LCSH content=Australian mammals

one would find the terms useful even if one were unaware that they belonged to a controlled vocabulary.

(Although this was accepted, the full impact of this principle was contradicted by many suggestions and proposed practices from the floor.)
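A small Python sketch may make the "meaningful without qualifiers" principle concrete. It reuses the two Subject examples above; the record layout (a list of dictionaries) and the function names are our own assumptions for illustration, not an agreed DC encoding.

```python
# Illustrative record: the two Subject examples from the notes, held in a
# hypothetical list-of-dictionaries layout (not an agreed DC syntax).
record = [
    {"element": "Subject", "scheme": "DDC", "content": "310.1"},
    {"element": "Subject", "scheme": "LCSH", "content": "Australian mammals"},
]


def plain_values(entries, element):
    """Values for an element as seen by a consumer that ignores scheme=."""
    return [e["content"] for e in entries if e["element"] == element]


def values_by_scheme(entries, element, scheme):
    """Values for an element as seen by a consumer that understands scheme=."""
    return [
        e["content"]
        for e in entries
        if e["element"] == element and e.get("scheme") == scheme
    ]
```

A scheme-unaware consumer still gets both subject values from plain_values, while a scheme-aware one can restrict itself to, say, the LCSH terms.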

It was agreed that the qualifier ROLE would be dropped: it was introduced mistakenly and is in effect synonymous with TYPE.

It was agreed that each element could (and some would say should) be qualified by language.

It was agreed that schemes must help in interpretation of values and be based on external standards.

It was agreed as a principle that types should narrow the semantics of an element. There was some feeling that this would help to ensure the element could still be usefully interpreted without the qualifier. It was acknowledged that it was difficult to agree whether a particular qualifier narrows or extends semantics, but there was agreement that it would be useful to keep to the spirit of this principle.

4. Syntax

The aim was to pin down the DC syntax. The discussion took into account a proposal for changing HTML guidelines which is on the table at W3C: the Web Collections syntax.

A flavour of this proposal was outlined. It is designed to allow the relation between different web pages to be identified. It was presented as a mechanism supported by large vendors and was roughly as follows:

<DATA PROFILE=http://www.DC.definition

<INFO NAME= LANG= SCHEME= CONTENT=

<DATA

<INFO NAME= CONTENT=

</DATA>

< INFO NAME= CONTENT=

</DATA>

This would enable cleaner expression of DC and would allow for simpler statements of types/schemes/language. The DATA wrapper links to the DC definition, and allows the dropping of DC. from each element name. How ordering would be handled is not clear...

Also, how are external schemes LINKed? The <DATA> </DATA> tags may be nested, so an element using IMT, for example, would sit inside a second <DATA> </DATA> set, with the PROFILE being the IMT RFC. The 'higher' DC profile is simply inherited.

The general feeling was that this proposal or similar was likely to be accepted by W3C, and that the DC community should lobby W3C to ensure that our requirements were taken into account. Caveat: there was an undercurrent of opinion that this proposal was being hyped and that in reality it might not be endorsed by W3C as quickly as suggested.

Other discussion on syntax assumed that any encoding recommended now should be viewed as an interim measure. Some people suggested it wasn't worth creating metadata until the W3C pronouncement came to pass... but this caused near apoplexy among Northern Europeans!

Misha Wolf (with Liam Quinn, Eric Miller and Dave Beckett) presented two proposals:

4.1 Element name and type qualifier

a) <META NAME=DC.author CONTENT="(TYPE=email)foo@bar.co.uk">

This is current practice amongst early implementors.

b) <META NAME=DC.author.email CONTENT="foo@bar.co.uk">

This was labelled the dot mechanism. People thought this syntax reflected the fact that adding a type qualifier was in effect creating a new element.

The consensus was to accept b): the type should be appended to the element name.
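For implementors with existing records in form a), the move to the dot mechanism is mechanical. The following Python sketch shows one way it might be done; the function names are hypothetical, and it assumes the type qualifier always appears as a "(TYPE=...)" prefix at the very start of CONTENT, as in example a).

```python
import re

# Matches a leading "(TYPE=xxx)" qualifier, as in form a) above.
_TYPE_PREFIX = re.compile(r"^\(TYPE=([^)]+)\)(.*)$", re.DOTALL)


def to_dot_form(name, content):
    """Move a leading (TYPE=...) qualifier from CONTENT into NAME.

    Content without such a prefix is returned unchanged.
    """
    match = _TYPE_PREFIX.match(content)
    if match is None:
        return name, content
    type_name, value = match.groups()
    return f"{name}.{type_name}", value


def render_meta(name, content):
    """Render a META tag in the style used in these notes.

    No escaping is attempted, so content containing a double quote
    would need extra handling.
    """
    return f'<META NAME={name} CONTENT="{content}">'
```

Applied to example a), to_dot_form("DC.author", "(TYPE=email)foo@bar.co.uk") gives the name and content used in example b).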

4.3 Where do scheme and language qualifiers go?

This proposal (b) was thought to be moving towards the Info/Data Web Collections syntax and was considered cleaner. But it is not compliant with current HTML.

There was strong disagreement, so the recommendation is that both a) and b) are options that can be used, with the type (if present) appended to the element name in the NAME attribute (as shown in 4.2).

5. Break out groups on key areas

5.1 Recommended defaults for minimalist DC

5.2 How COVERAGE will be used

This group took the newly modified dot notation (mentioned above) and used it to greatly extend the power of COVERAGE. The previous TYPEs of 'spatial' and 'temporal' have been dropped altogether, and replaced by:

DC.coverage.x

DC.coverage.y

DC.coverage.z

DC.coverage.t

DC.coverage.poly

DC.coverage.line

x, y, z, and t are further qualifiable by .min or .max and then a SINGLE value.

SCHEMEs are used as normal to define various datum values etc., and include OSGB (of course!). The (fairly senseless, really) 'LATLONG' is replaced by DMS (lat and long, expressed as degrees, mins and secs) and DD (decimal degrees).
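As an illustration of how the new COVERAGE names compose, the Python sketch below expands a set of ranges into element names with .min and .max, each carrying a single value. The function name and the input layout are our own assumptions. The scheme (e.g. DD) is deliberately left out of the output, since its placement depends on the syntax options discussed in section 4.

```python
def coverage_elements(bounds):
    """Expand per-axis (min, max) ranges into DC.coverage names.

    bounds maps an axis letter ('x', 'y', 'z' or 't') to a (min, max)
    pair. Each resulting element carries a SINGLE value, as agreed.
    Axes not present in bounds are skipped.
    """
    pairs = []
    for axis in ("x", "y", "z", "t"):
        if axis not in bounds:
            continue
        low, high = bounds[axis]
        pairs.append((f"DC.coverage.{axis}.min", str(low)))
        pairs.append((f"DC.coverage.{axis}.max", str(high)))
    return pairs


# Illustrative bounding box in decimal degrees (values are made up).
example = coverage_elements({"x": (148.9, 149.4), "y": (-35.5, -35.1)})
```

Each (name, value) pair would then become one META element, so a simple bounding box needs four elements.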

5.3 Multilingual DC

(Draft Paper available from Tom Baker)

5.4 Registries

6. Future actions agreed

Submit the present draft RFC, and submit the syntax appendix as a separate RFC so that it can evolve on its own path. (Tony Gill, Dave Beckett, and Paul Miller will contribute to this work.)

Draft further RFCs on:

minimalist set
coverage
multilingual stuff
user guide

Working groups on

registries
relation
coverage
dates
non-English DC
Implementors group (re-open HTML DC implementors mailing list)

7. National Metadata Seminar, 6 March, National Library of Australia

This was attended by 280+ people from all over Australia. There were a lot of government people (it was in Canberra!) and higher education people. This was the largest number who had ever attended a meeting at the National Library; metadata is just that popular...

7.1 PICS

Presentation by Philip DesAutels (W3C) at the National Library seminar

Philip suggested PICS as the future infrastructure for associating metadata with internet content. He promoted it as a simple architecture for describing and transporting metadata. It would support various metadata schemes with distributed schema registration. This distribution could be at the level of user, server, service provider, firewall, search service etc. This might move filtering from the browser to search engines, proxies or firewalls. On questioning, Philip stated that where 'the registry' sits in relation to the user/server could significantly affect performance.

7.2 Metadata developments in eLib and European projects

Rachel Heery gave a summary of some of the themes related to metadata emerging in the projects with which UKOLN is involved: ROADS, NewsAgent, DESIRE and BIBLINK.

7.3 Other presentations

Stu Weibel on DC, Carl Lagoze on re-thinking metadata, John Perkins on CIMI, Eliot Christian on GILS, Renato Iannella on Australian metadata projects, Rebecca Guenther on USMARC.

Last updated: 20-Mar-1997