Exchanging data: introduction
It is unlikely that you or your co-workers will always enter all the data in Collections records manually. Especially when you are starting out with Collections, you can save a great deal of time by reusing existing databases that were created earlier in other software and/or by another organization. This particularly applies to catalogue information. In such cases you will want to bring those existing databases into your Collections system. This is called importing data.
Easier said than done, however. There are numerous ways to organize and code a database, and virtually every database program has its own method of doing so. Such a method is known as a database format. In addition, each database has its own layout for fields. The number of possible field layouts is enormous, particularly because every user of a database program can decide how he or she wants the layout.
Because of this enormous variety in database formats, you must often use a so-called exchange file to be able to exchange data between database formats. This file is the result of exporting data from some source database (program) to a certain exchange file format, which the target database (program) is able to import. Exchange file formats are standardized, and are usually simple formats which most database programs can export to or import from. The advantage of using exchange files is that there are only a limited number of them, that the format is completely known, and that they are usually text files that can be opened and/or edited in any text editor.
Via different Axiell tools one or more of the following exchange file formats can be imported (and converted to AdlibXML at the same time) into the Collections database format:
• (Adlib) Tagged (Axiell Designer/import.exe, Collections, ImportTool)
• ASCII delimited (CSV) (Axiell Designer/import.exe, Collections, ImportTool)
• AdlibXML (Axiell Designer/import.exe (unstructured XML only), ImportTool (grouped or unstructured XML))
You can also import similar formats, if just the field or record separators used in the exchange file are different from the standard.
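If you would rather normalize the separators before importing than configure them in the import job, the conversion can be sketched in a few lines of Python. This is only an illustration using the standard csv module: the function name convert_delimiter and the semicolon and comma separators are assumptions made for the example, not part of any Axiell tool.

```python
import csv
import io


def convert_delimiter(text, source_delimiter=";", target_delimiter=","):
    """Rewrite delimited text so that it uses a different field separator.

    Hypothetical helper for illustration; quoting is handled by the csv
    module, so separators inside quoted fields are preserved as data.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=source_delimiter)
    output = io.StringIO(newline="")
    writer = csv.writer(output, delimiter=target_delimiter, lineterminator="\n")
    writer.writerows(reader)
    return output.getvalue()
```

Because the csv module parses the fields instead of doing a plain search and replace, a separator character that occurs inside a quoted field value is kept as data.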
The Axiell Designer import functionality and import.exe, which share the same code, also support the following old formats: DBASE III/IV (*.dbf), ASCII fixed length, PICA III, MARC (general ISO 2709), MARC (Ocelot), MARC (CDS-Isis) and Image directory. Please contact our helpdesk for more information if you need to import one of these formats. However, this import code will not be developed any further and does not support new database table structures such as the indexed link and full text index types, the triple index type for reverse relation metadata, and the metadata database tables themselves. If the desired format and the complexity of the import or export job allow for it, we therefore recommend using ImportTool, ExportTool or Collections, as these tools do support the new table structures.
With the Export functionality in Collections, you can export to CSV, AdlibXML (grouped), AdlibXML (unstructured), Calm XML, DScribe Natural, Tagged and Attached media. From version 1.20, custom XSLT export formats are supported too, optionally combined with an adapl (ADAPL program) that preprocesses data before the XSLT stylesheet takes over.
Using ExportTool.exe you can export to Tagged, CSV or AdlibXML (grouped).
An export job defined in Designer, though, always exports to the Tagged exchange file format.
And then there's also the issue of the character set in which database and exchange file are encoded:
• For import, Designer automatically recognizes UTF-8 (however, not the UTF-8-BOM variety that Microsoft software may create!) and Unicode (big endian as well as little endian) exchange files and imports the data from them, regardless of the type of the target database. For ANSI and DOS databases the following applies: if there are non-importable Unicode characters in the exchange file, they will be replaced by "?" (a question mark).
• Designer exports data from DOS, ANSI or UTF-8 encoded Unicode databases to exchange files in UTF-8.
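Since the UTF-8-BOM variety is not recognized on import, one option is to remove the byte order mark from the exchange file beforehand. The following Python sketch shows one way to do that; the function names and the idea of writing the result to a separate target file are assumptions made for this example.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(source, target):
    """Copy an exchange file to a new file, dropping a leading UTF-8 BOM."""
    Path(target).write_bytes(strip_utf8_bom(Path(source).read_bytes()))
```

The file is handled as raw bytes, so nothing other than the three BOM bytes at the very start is changed.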
Although the software does quite a lot of the conversion work for you, you will have to take care of a number of things yourself. For instance, you will have to determine the target Collections fields in which each field from the source database in the exchange file must be imported. And if different types of data are packed in a single field in the exchange file, you may want to "unpack" these sub fields with an ADAPL program (which must be written first). These and other options must be set in what's called an import job (not for Collections though). This is a settings file with the extension .imp. To actually start importing, you must run the import job. Similarly, you must set a few options in an export job (an .exp file), and run it, to actually start exporting data (not for Collections though).
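To illustrate what "unpacking" a packed field means, here is a small Python sketch. It is not ADAPL and not part of the import job. It only shows the idea on a hypothetical packed value of the form "place: publisher, year"; the field names, the pattern and the function name are all assumptions for the example.

```python
import re

# Hypothetical layout of a packed field: "Amsterdam: Elsevier, 1998".
PACKED_IMPRINT = re.compile(
    r"^\s*(?P<place>[^:]+?)\s*:\s*(?P<publisher>[^,]+?)\s*,\s*(?P<year>\d{4})\s*$"
)


def unpack_imprint(value):
    """Split a packed imprint value into separate sub fields.

    Values that do not fit the assumed layout are returned untouched in
    the 'unparsed' key, so that no data is silently lost.
    """
    match = PACKED_IMPRINT.match(value)
    if match is None:
        return {"place": "", "publisher": "", "year": "", "unparsed": value}
    return {**match.groupdict(), "unparsed": ""}
```

Keeping the values that do not match in a separate key makes it easy to review the exceptions by hand after a trial run.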
Creating, editing, saving, deleting or running import and export jobs can be done in the Import/Export job manager and the Import/Export job editor.
When you are about to import multiple exchange files into as many database tables, you'll have to think about the order in which to import them, as this is not a trivial matter. A typical import of multiple exchange files (originally exported from Collections database tables) into as many empty target Collections database tables with the same record number ranges would involve the following:
1. First import the Thesaurus and Persons and institutions exchange files whilst only mapping the term field and keeping the record numbers identical. Then import the same Thesaurus and Persons and institutions exchange files again, this time with a full tag mapping, with the term field as the update field, without processing links and without creating new records. This prevents Designer from trying to create internal links during the first import run, which would logically fail: the internal links must point to records which may still have to be imported, yet you don't want Designer to create new internally linked records with different record numbers from the ones you are importing. So the first import run creates all basic records and the second run writes the internal links into the basic records.
2. Only then import the catalogues, without processing links, whilst keeping the record numbers identical. As long as the processing of links is off, no new linked records will be created, so the order of importing, for instance, the photo and catalo databases (which are linked) doesn't matter.
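The reasoning behind the two import runs in step 1 can be sketched with a toy model in Python. This is not how Designer is implemented: the dictionaries stand in for a database table, and the field names record_number, term and broader_record are hypothetical. The sketch only shows why creating the basic records first lets internal links refer to records that appear later in the exchange file.

```python
def first_pass(exchange_records):
    """Run 1: create basic records with only the term, keeping record numbers."""
    return {rec["record_number"]: {"term": rec["term"]} for rec in exchange_records}


def second_pass(database, exchange_records):
    """Run 2: update existing records, matched on term, with all other fields.

    Link values are written as they are (links are not processed) and
    records that are not found are skipped (no new records are created).
    """
    number_by_term = {record["term"]: number for number, record in database.items()}
    for rec in exchange_records:
        number = number_by_term.get(rec["term"])
        if number is None:
            continue  # never create new records in the second run
        for field, value in rec.items():
            if field not in ("record_number", "term"):
                database[number][field] = value
    return database


# Record 1 links to record 2, which only appears later in the exchange file.
exchange = [
    {"record_number": 1, "term": "oil painting", "broader_record": 2},
    {"record_number": 2, "term": "painting", "broader_record": None},
]
table = second_pass(first_pass(exchange), exchange)
```

After both runs the table still contains exactly the two original record numbers, and the link from record 1 points to a record that exists.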
Exporting and importing also offer you some related functionality, such as:
• Instead of importing an exchange file, you can set up an import job so that it only reindexes all the indexes in the Collections target database table. See the Help topic on the Import database file option on the Advanced tab of the import job properties.
• You can convert DOS, ANSI, UNICODE or OCLC encoded records from an exchange file to the proper (wider) character set of the target database (Collections typically uses UTF-8) during import. There are also possibilities to convert UTF-8 encoded export files to another encoding. See the Mapping Help topic for more information.
See also
Accessing the job managers and editors
Managing import and export jobs
Running import and export jobs