Exchange formats: PICA III

Many libraries, particularly in the Netherlands and Germany, use the PICA (Project-Integrated Catalogue Automation) GGC (Shared Automated Catalogue System) to catalogue their collections.

File layout

A PICA record is built up out of so-called KMC codes. A PICA tag, or KMC code, consists of four digits that mark the beginning of a field. Each field has its own KMC code. Example:

SET: S2 [167] TTL: 161 PN: 080752306 PAG: 01N

0200:1730:11-04-91 0210:1006:28-10-93 12:02:23 0230:9999:99-99-99

0500 Abx
1100 1989 $ 1989-...
1500 /1eng
1700 /1uk
1800 f
3121 !095898263!@ Dutch Centre for Public Libraries and Literature, The Hague. Audiovisual Media Department.
3141 <10>@Info-AVM
4000 @Information on audiovisual media / [$3121]
4025 Issue 164 (1989) - 167 (1989) ; vol.15 (1990) + ...
4030 The Hague : NBLC
4062 30 cm
4208 Appears 10x a year, issues 5 and 10 are cumulative
7001 25-09-92 : y1vg
4220 /b1991-
7100 0709#019 Inf @ f
7800 148393950

Where:

0500	a KMC code of which the 1st letter determines the type of document (e.g. A=avm, B=book, E=microform, M=printed music, S=software), the 2nd letter determines whether the document is monographic or serial, and the 3rd position signifies the status of the catalogue entry
1100	KMC for the year of publication
1500	language in which the document is published (/1 = language in which it is written, /3 = language from which it was translated)
1700	country of publication (/1 = country of publication, /2 = country of 2nd imprint)
1800	frequency of publication (a = daily, c = weekly, f = monthly)
30##	authors
31##	corporate authors
4000	title, followed by a space, a '/' and the author’s statement
402#	edition / issue
403#	imprint (place of publication : publisher)
4062	format
42##	annotation
70##	copy details
71##	shelf mark
7800	unique serial number allocated by PICA
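
The '#' positions in the list above stand for any single digit. As a minimal sketch, assuming that reading of the notation, a lookup from a concrete KMC to its description could look like this in Python (the names KMC_DESCRIPTIONS and describe_kmc are hypothetical and not part of Adlib or PICA):

```python
import re
from typing import Optional

# Descriptions taken from the field list above; '#' is read as "any digit".
# Only a few entries are shown. The dictionary name is hypothetical.
KMC_DESCRIPTIONS = {
    "0500": "type of document, monographic/serial, entry status",
    "30##": "authors",
    "31##": "corporate authors",
    "402#": "edition / issue",
    "403#": "imprint",
    "42##": "annotation",
    "70##": "copy details",
    "71##": "shelf mark",
    "7800": "unique serial number allocated by PICA",
}


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a pattern such as '30##' into a regex for four-digit KMCs."""
    return re.compile("".join(r"\d" if ch == "#" else re.escape(ch) for ch in pattern))


def describe_kmc(kmc: str) -> Optional[str]:
    """Return the description of the first pattern that matches, else None."""
    for pattern, description in KMC_DESCRIPTIONS.items():
        if pattern_to_regex(pattern).fullmatch(kmc):
            return description
    return None
```

With this sketch, describe_kmc("3121") yields "corporate authors", which matches the 3121 field in the example record.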

In the record, an at sign (@) marks the word on which sorting is to take place.
A field can contain a reference to another field. This is indicated by a dollar sign immediately followed by the tag of that field; in the example, [$3121] in field 4000 refers to field 3121.
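
To make the layout concrete, the following Python sketch reads field lines like those in the example, finds the sort word after the at sign and collects field references. It assumes that each field occupies one line and that the KMC is separated from the content by whitespace, as in the example; the function names are hypothetical.

```python
import re

# Three lines copied from the example record above.
SAMPLE = """\
3121 !095898263!@ Dutch Centre for Public Libraries and Literature, The Hague. Audiovisual Media Department.
4000 @Information on audiovisual media / [$3121]
4030 The Hague : NBLC
"""

FIELD_LINE = re.compile(r"^(\d{4})\s+(.*)$")  # four-digit KMC, then the content
FIELD_REF = re.compile(r"\$(\d{4})")          # '$' immediately followed by a tag


def parse_fields(text):
    """Return (kmc, content) pairs for every line that starts with a KMC."""
    fields = []
    for line in text.splitlines():
        match = FIELD_LINE.match(line)
        if match:
            fields.append((match.group(1), match.group(2)))
    return fields


def sort_word(content):
    """Return the word following the at sign, or None if there is no marker."""
    _, marker, rest = content.partition("@")
    if not marker:
        return None
    words = rest.split()
    return words[0] if words else None


def referenced_tags(content):
    """Return the tags of all fields that this field refers to."""
    return FIELD_REF.findall(content)
```

For the 4000 line of the sample, sort_word returns "Information" and referenced_tags returns ["3121"].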

The KMCs above represent only a small part of the total range. Most organizations and libraries use only the selection of KMCs that is relevant to them. A PICA record consists of a part with general catalogue entry details that are the same for all organizations (e.g. title, author, imprint), and a part with details that pertain to a particular organization (e.g. shelf mark, copy details, local classifications and/or subject terms). PICA and the organization usually agree on which KMCs should and should not be used, and on which data is stored where and in what form.

Conversion

Because different organizations use different KMCs, a routine for importing PICA records into Adlib is always custom work. Conversion takes place in two phases:

Phase one: downloading PICA GGC to Extended ASCII

The first phase of importing PICA data into Adlib consists of selecting the records in GGC and downloading them to an Extended ASCII file made up of records structured like the example above. The record header plays an important role in importing records into Adlib, because:

1.	it contains the Pica Production Number (PPN). The PPN is used in the Adlib import procedure as a relational operator/update field.
The PPN is saved in the Adlib database named "catalo" in tag/index "pi". This tag is defined in the Adlib import job as a so-called update tag. During the import operation, Adlib checks whether the PPN already exists somewhere in the Adlib database. If so, all mapped data in the record is overwritten with the new values. If not, a new record is added;

2.	it serves as a separator between PICA records.
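
The two roles of the record header can be illustrated with a short Python sketch. It is not Adlib's implementation. It assumes that a header line starts with "SET:" and that the value after "PN:" is the PPN, as the example record suggests, and it models the "pi" index as a plain dictionary. All function names are hypothetical.

```python
import re

# Assumption: a header line starts with 'SET:' and carries the PPN after 'PN:'.
HEADER = re.compile(r"^SET:.*\bPN:\s*([0-9Xx]+)")


def split_records(download_text):
    """Yield (ppn, lines) per record, using the header line as separator."""
    ppn, lines = None, []
    for line in download_text.splitlines():
        match = HEADER.match(line)
        if match:
            if ppn is not None:
                yield ppn, lines
            ppn, lines = match.group(1), []
        elif ppn is not None and line.strip():
            lines.append(line)  # every non-blank line until the next header
    if ppn is not None:
        yield ppn, lines


def upsert(database, ppn, mapped):
    """Overwrite the mapped tags of an existing record, or add a new record.

    'database' is a dict keyed by PPN, standing in for the "pi" index.
    """
    record = database.get(ppn)
    if record is None:
        database[ppn] = {"pi": ppn, **mapped}
        return "added"
    record.update(mapped)  # only mapped tags are overwritten
    return "updated"
```

Importing the same PPN twice therefore leaves one record in the dictionary, with the mapped tags holding the values from the second import.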

To convert the diacritical characters in the PICA character set (PICA document: DE 007/0195) to Extended ASCII, there must be a conversion file called user2.txt in the IBW3 subdirectory on the PICA workstation.

An IBW workstation already has a USER2 download option. This, however, cannot be used for the correct translation of diacritical characters: during downloading, a text file with a conversion table must be used, which is based on another file named picaibm.txt. Once processed, the text file (e.g. picaibm.txt, user2.txt) must be compiled into a PICA program file named user2.cmp. This can be done using the PICA command-line program charconv.exe as follows:

charconv table.txt user2.cmp

table.txt	the text file with the conversion table that is based on picaibm.txt
user2.cmp	the PICA-IBW conversion program that translates diacritical PICA characters to Extended ASCII while records are being downloaded from PICA

The text file with the conversion table is divided into two columns. The first column contains the decimal values in the PICA character set, and the second column contains the Extended ASCII value of the diacritical character. Two numbers in the first column indicate that the GGC uses two characters: e.g. 225 101 := 138 (meaning: `e, which is to become è). If the text file contains a conversion key with incorrect syntax, the error message "macro not defined" is displayed after compilation with charconv.exe. Conversion keys followed by NOTDEFINED are not defined in Extended ASCII.
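
The effect of such a table can be sketched in Python. This does not reproduce charconv.exe or the compiled user2.cmp. It assumes that every key has the form shown above (one or two decimal values, ":=", then one decimal value or NOTDEFINED) and that undefined or unlisted bytes pass through unchanged. The function names are hypothetical.

```python
def parse_table(text):
    """Read lines such as '225 101 := 138' into {source bytes: target value}."""
    table = {}
    for line in text.splitlines():
        if ":=" not in line:
            continue  # skip blank lines and anything that is not a key
        left, right = (part.strip() for part in line.split(":=", 1))
        if right == "NOTDEFINED":
            continue  # no Extended ASCII equivalent for this key
        table[bytes(int(number) for number in left.split())] = int(right)
    return table


def convert(data, table):
    """Translate PICA bytes, trying two-byte keys before one-byte keys."""
    out = bytearray()
    i = 0
    while i < len(data):
        pair, single = data[i:i + 2], data[i:i + 1]
        if len(pair) == 2 and pair in table:
            out.append(table[pair])
            i += 2
        elif single in table:
            out.append(table[single])
            i += 1
        else:
            out.append(data[i])  # assumption: unlisted bytes are copied as-is
            i += 1
    return bytes(out)
```

With the key 225 101 := 138, the byte pair 225 101 is replaced by the single byte 138, which is è in code page 437, one common reading of "Extended ASCII".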

If all diacritical marks are defined in accordance with the above-mentioned method, no further processing is required before the data can be imported into Adlib. If that is not possible, the program ACCENT.exe can be used to convert the most common diacritics. The syntax of ACCENT.exe is:

accent file1.dnl file2.adl

file1.dnl	the text file with the PICA download
file2.adl	the converted file to be imported into Adlib

Phase two: uploading PICA records into Adlib

To upload records in Adlib, you need the following:

•	An Adlib import job (e.g. pica.imp), which establishes:
- General database options;
- Type of import file: PICA Download;
- Name of PICA import adapl.

•	A tag mapping of KMCs to corresponding Adlib tags, which also names the relational operator/update field (pi). Such a list may include e.g. the following fields:

310#	au
4000	ti
PPN	pi

A KMC may be followed by a space and either NO-EXPANSION or RAW, like so:

34XX NO-EXPANSION	%H
4004 RAW	%1
PPN	pi

RAW means that the entire contents of a field are imported. This is relevant if the field contains Pica numbers (enclosed in exclamation marks): if you import a field RAW, the Pica number is not stripped from the field contents, as it would be by default. Normally, such Pica numbers have no relevance in Adlib records, so the standard Pica III import strips them from the data. If you would still like to keep the old reference to the Pica number with the actual field data, import the field RAW.
NO-EXPANSION means that of a lemma containing a Pica number, only the part to the right of the number is imported, while a lemma without a Pica number is imported in full. So if the lemma has the syntax xxxxx![0-9X]!yyyyy, only yyyyy is imported; if it contains no number (xxxxxyyyyy), the original string is imported. Importing the same field without NO-EXPANSION always imports the entire string with any Pica number removed.

•	An import procedure (an ADAPL program) that runs during the import of the PICA records and processes the contents of the various fields (after the mapping has been applied and any Pica numbers have been stripped, if applicable) so that they are suitable for entry in Adlib.
This import adapl is custom-made for each application, because local modifications have to be taken into account. Because of the large number of fields in a PICA download and the time-consuming job of data conversion, such adapls can be quite complex. This requires the system administrator to have a thorough knowledge of programming in ADAPL. The situation is even more complex if the Adlib Loans module is used: in addition to catalogue records, copy records then also have to be created for each separate copy in the PICA download file.
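
The difference between the default behaviour, RAW and NO-EXPANSION in the tag mapping described above can be summarized in a short Python sketch. It is an illustration of the documented rules and not the Adlib import code. It assumes that a Pica number is one or more digits or X characters between two exclamation marks, and that mapping lines are whitespace-separated. The function names are hypothetical.

```python
import re

# Assumption: a Pica number is a run of digits or 'X' between exclamation marks.
PICA_NUMBER = re.compile(r"![0-9X]+!")


def parse_mapping_line(line):
    """Split a mapping line such as '4004 RAW %1' into (kmc, option, adlib_tag)."""
    parts = line.split()
    if len(parts) == 3:
        return parts[0], parts[1], parts[2]
    return parts[0], None, parts[1]


def import_value(content, option=None):
    """Apply the mapping option to the raw field content."""
    if option == "RAW":
        return content  # keep everything, including the Pica number
    match = PICA_NUMBER.search(content)
    if option == "NO-EXPANSION":
        # With a Pica number: keep only what follows it. Without: keep all.
        return content[match.end():] if match else content
    # Default: the entire string with any Pica number removed.
    return PICA_NUMBER.sub("", content)
```

For the content xxxxx!123!yyyyy, the default import gives xxxxxyyyyy, RAW gives the unchanged string and NO-EXPANSION gives yyyyy.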