MARC.pm - MARC Processing Module for Perl
=========================================

by J.P.Knight (J.P.Knight@lut.ac.uk)

1. Introduction
---------------

MARC stands for MAchine Readable Cataloguing and is a metadata format that
is widely used in the library community.  Most commercial library
automation systems can read and/or write this format.  This Perl module is
intended to allow Perl programmers to quickly and easily write scripts
that handle MARC record input and output.  Possible uses for this
module include converting to and from other metadata formats and
converting database output into MARC format for use with library OPACs or
Z39.50 front ends.

Note that this is currently an alpha release of the Module; see the Future
Plans section for details of other things that are intended to be added.

2. Installation
---------------

To install this module, unpack the tar archive, change into the Metadata
directory and then type:

   perl Makefile.PL

Once this has run, type:

   make install

to install the module into your machine's Perl distribution.  Note that you
need write access to the Perl installation directory.  If you don't have it,
you can still use this module, either by getting your system administrator to
install it for you, or by simply moving the whole of the Metadata directory
(the directory that contains this file) into the directory that you wish to
develop and run your scripts from.

3. Using the Module
-------------------

As this is an alpha release, documentation is a little scant at the
moment.  To make use of this module, include this line:

  use Metadata::Marc;

near the top of your script.  You can then make use of two functions to
read and write MARC records, called ReadMarcRecord and WriteMarcRecord
respectively.  They both act on a nested Perl data structure (a reference
to a hash) called MarcRecord.  The structure of MarcRecord
is:

$MarcRecord = {
  marc_type => "BLCMP",				# Type of Marc Record.
  status => substr($Leader,5,1),		# Status from Leader.
  type => substr($Leader,6,1),			# Type from Leader.
  class => substr($Leader,7,1),			# Class from Leader.
  indicator_count => substr($Leader,10,1),	# Indicator count from Leader.
  subfield_mark_count => substr($Leader,11,1),	# Subfield mark count
						# from Leader.
  encoding_level => substr($Leader,17,1),	# Encoding level from Leader.
  analytical_record_ind => substr($Leader,18,1),# Analytical record
						# indicator from Leader.
  source_of_record => substr($Leader,19,1),	# Source of record from Leader.
  on_union_flag => substr($Leader,20,1),	# On Union flag from Leader.
  scp_length => substr($Leader,21,1),		# SCP Length from Leader.
  general_record_des => substr($Leader,23,1),	# General Record
						# Description from Leader.
  data => { %FIELD },				# Actual MARC data fields.
  level => { %DIR_LEVEL },			# Levels of each MARC
						# data field.
};

The marc_type field of the data structure tells you whether a valid MARC
record was read in and what sort of MARC record it was.  In the initial
alpha release the module only knows about the BLCMP MARC format, so the only
valid values for this field are currently "Invalid" (not able to read in
a Leader), "BadDirectory" (the directory section of the MARC record is
corrupt) or "BLCMP" (a valid BLCMP MARC record has been read).  Apart
from the data and level fields, the rest of the fields in the MarcRecord
data structure are taken from the MARC record Leader.  The Leader is the
initial portion of the MARC record that tells you something about how
to interpret the information you find in the other parts of the record
and where the record came from.
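
As a rough illustration of checking marc_type before touching the rest of
the record, here is a minimal sketch.  The helper name marc_record_problem
is made up for this example and is not part of the module; it relies only
on the three marc_type values described above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): returns undef when the
# record looks usable, otherwise a short reason derived from marc_type.
sub marc_record_problem {
    my ($rec) = @_;
    my $type = $rec->{marc_type};
    return "no marc_type set"             unless defined $type;
    return "could not read a Leader"      if $type eq "Invalid";
    return "directory section is corrupt" if $type eq "BadDirectory";
    return                                if $type eq "BLCMP";
    return "unrecognised marc_type '$type'";
}
```

A script might then skip unusable records like this:

   my $problem = marc_record_problem($MarcRecord);
   warn "Skipping record: $problem\n" if defined $problem;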

The data and level fields of the MarcRecord data structure are the
interesting bits!  The data field holds a hash of arrays that
holds the bibliographic data.  The hash key is the MARC tag
number and the array index allows multiple fields with the
same tag number to be handled.  The array indexes start at zero and
increase monotonically.  Note that the module does not _currently_ break
down a field into its constituent subfields; you'll need to split it up
yourself on the \x1F subfield delimiters.
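
Until the module splits subfields itself, something like the following
sketch could be used.  The name split_subfields is hypothetical, and the
sketch assumes that each subfield is a \x1F delimiter followed by a
one-character subfield code and then the subfield text.  Anything before
the first delimiter (for example indicators) is returned separately.
Whether stored fields still carry indicators or a trailing \x1E field
terminator is not stated in this file, so the code tolerates both.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module).  Returns the text before
# the first \x1F delimiter, followed by one [code, value] pair per subfield.
# Assumes a one-character subfield code after each delimiter.
sub split_subfields {
    my ($field) = @_;
    $field =~ s/\x1E\z//;                     # drop a field terminator, if any
    my ($lead, @parts) = split /\x1F/, $field, -1;
    $lead = '' unless defined $lead;          # empty field: nothing before \x1F
    my @subfields = map  { [ substr($_, 0, 1), substr($_, 1) ] }
                    grep { length } @parts;   # ignore empty pieces
    return ($lead, @subfields);
}
```

For example, to print the subfields of every 245 field in a record:

   foreach my $field (@{ $MarcRecord->{data}{"245"} || [] }) {
       my ($lead, @subfields) = split_subfields($field);
       print "$_->[0]: $_->[1]\n" foreach @subfields;
   }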

The level field is used to deal with analytical works in BLCMP MARC and
appears in the MARC record's directory block for each field.  The level
is defined as zero for non-analytical works or if the field relates to
the work as a whole.  Fields describing the first analysed record will
have a level of 1, the next one a level of 2 and so on.
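
The sketch below picks out the fields that belong to one level.  It
assumes that the level hash mirrors the data hash, i.e. that
$MarcRecord->{level}{$tag}[$i] is the level of
$MarcRecord->{data}{$tag}[$i]; this file does not spell that layout out,
so treat it as an assumption and check it against real records.  The name
fields_at_level is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module).  Returns [tag, value] pairs
# for every field whose level equals $want.
# ASSUMPTION: level->{tag}[i] is the level of data->{tag}[i].
sub fields_at_level {
    my ($rec, $want) = @_;
    my @found;
    for my $tag (sort keys %{ $rec->{data} }) {
        my $values = $rec->{data}{$tag};
        my $levels = $rec->{level}{$tag} || [];
        for my $i (0 .. $#{$values}) {
            # Treat a missing level as 0 (the work as a whole).
            my $level = defined $levels->[$i] ? $levels->[$i] : 0;
            push @found, [ $tag, $values->[$i] ] if $level == $want;
        }
    }
    return @found;
}
```

Calling fields_at_level($MarcRecord, 0) would then give the fields that
describe the work as a whole, and fields_at_level($MarcRecord, 1) those
for the first analysed record.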

ReadMarcRecord takes a single argument, which is the filehandle (note
filehandle, not filename) of the input stream from which to read the MARC
record.  It returns the MarcRecord data structure.  WriteMarcRecord
takes a filehandle and a MarcRecord data structure as its arguments and
writes out the properly formatted MARC record on the specified output
stream.
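
A read/write loop built on these two functions might look roughly like the
sketch below.  The reader and writer are passed in as code references so
that the loop makes no assumption about how the module exports its
functions.  The name copy_valid_records is hypothetical, and two things
are assumed rather than documented here: that end of input can be
detected with eof on the input filehandle, and that it is sensible to
stop at the first record whose marc_type is not "BLCMP".

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module).  Copies records from $in to
# $out until end of input or the first record that is not valid BLCMP MARC.
# $read is called as $read->($in) and must return a MarcRecord;
# $write is called as $write->($out, $record).
sub copy_valid_records {
    my ($in, $out, $read, $write) = @_;
    my $copied = 0;
    while (!eof($in)) {
        my $rec = $read->($in);
        # ASSUMPTION: stop on "Invalid" or "BadDirectory" instead of
        # trying to resynchronise with the input stream.
        last unless defined $rec->{marc_type} && $rec->{marc_type} eq "BLCMP";
        $write->($out, $rec);
        $copied++;
    }
    return $copied;
}
```

With the module loaded, and assuming the two functions can be called
unqualified after the use line, this could be driven as:

   my $n = copy_valid_records(\*STDIN, \*STDOUT,
                              \&ReadMarcRecord, \&WriteMarcRecord);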

5. Future Plans
---------------

As has been said a couple of times, this is currently an alpha release
of the software.  Planned future enhancements to this module include:

  * Ability to read and write more than just BLCMP MARC and
differentiate automatically between different MARC formats (initially
UKMARC and USMARC but maybe more if people provide suggestions or,
better yet, code).

  * Split up fields into separate subfields before passing them to the
user's code.

  * Improve the documentation.

  * Include a demonstration script that converts Dublin Core Element Set
embedded in HTML 2.0 documents into a MARC record.

6. Bugs, Patches and Comments
-----------------------------

Please send any bug reports, patches or comments to
J.P.Knight@lut.ac.uk.  Please make sure that you include a description
of the problem, the version of Perl that you are using and the version
of this module that you are using.

7. Copyright
------------

     Copyright (c) 1997 J.P.Knight@lut.ac.uk. All rights reserved.
     This program is free software; you can redistribute it and/or
     modify it under the same terms as Perl itself.  This module
     was developed partly under funding from the UK Electronic
     Libraries programme (see <URL:http://www.ukoln.ac.uk/elib/>) 
     and the EC DESIRE project.

     This software is provided "as is" and there are no warranties,
     implied or otherwise, about this software's fitness for 
     purpose.  You use it entirely at your own risk.

8. Changes
----------

Version 0.01 released on Sun Jan 12 20:58:49 GMT 1997
	- First alpha release.

