Indexes
Indexes are tables which hold either actual data (copied from a database table for this purpose) or reference numbers to the original data in a linked database, both from one or more specific fields, plus the record number of the record in which the indexed value occurs. An indexed value appears as many times in the index as it occurs in the relevant field(s) in records.
The purpose of an index is to be able to search an often used field more quickly than would be possible when searching through the database itself. Now only the field-specific table (actually an index page at a time) needs to be loaded into memory and searched, whilst searching the database can only be done record by record, which takes much longer. In de database setup you must create an index explicitly for each field you want to be able to search through quickly in a Collections application.
Further, indexes on link reference fields are a requirement for all linked fields.
There are two special indexes:
• | In Designer, the first is named priref, which stands for "primary reference". In the SQL database these indexed are named after the .inf (e.g. dbo.collect) and Collections uses this tables to list all the record numbers (= the primary references of each record) from your database together with the full contents of the relevant records. A priref index is created automatically when a new database table is defined, and is therefore always present. This index cannot be deleted or modified. |
• | Another special index is the wordlist* (dbo.wordlist in the SQL database). It is created by the database setup in Designer as soon as the first free text index (also called a word index) is defined. This wordlist index contains all words that "occur" in all free text indexes of all database tables. Collections uses this index to save memory when indexing words; thanks to this list, when compiling free text indexes, Collections can suffice with referring to a serial number in the wordlist instead of including whole words. So only the wordlist index contains the actual words, the free text indexes refer only to the wordlist. This process takes place behind the scenes, so you don't need to take it into account when setting up free text indexes. * From Collections 1.14, your databases may be indexed in a new way called Full Text indexing, in which case the dbo.wordlist table won't be present anymore. |
Under the hood, Collections uses these indexes as follows:
• | When you search a Record number access point, just the relevant database table is used to find a record number and the record data. |
• | When you search a Text (term) indexed field, e.g. through an access point, Collections first looks up the search key in the relevant field index to retrieve the record number(s) of the records in which the value occurs. Then those record numbers will be looked up in the relevant database table to retrieve the data. |
• | When you search an index on a link reference field, e.g. through an access point or when validating input for a linked field, there are three indexes to take into account. In the linked database there is a Text index on the relevant field, which holds the actual keys (terms or names) and their record number in that database. Collections uses the database definitions to find out about this index and then uses it to search for the search key and retrieve the accompanying record number. Now, for the current database there is an index which lists the record numbers of the mentioned linked terms or names (since a field linked by reference stores only the linked record number of the term it displays), accompanied by the (local) record numbers of the records in the current database in which the relevant field(s) link to said term or name. Collections uses this index and the retrieved linked record number to find the relevant local record numbers. Then finally, the data of those local record numbers are available through the database table. |
In current free text indexes (in non-Full Text indexed databases), all words, however short, are indexed. Only separator characters are stripped out. These are: [];,!@()|{}<>? and spaces, new lines and tabs. All combinations with concatenators are indexed. For example: l‘arbre is indexed as l’arbre and as arbre. In Full Text indexed databases, even more punctuation characters are stripped from the indexed words.
This makes it easier to find records.
(Re)building an index
After you create an index definition, the index table in the SQL database must still be built up, filling it with the appropriate data from the database. In the Application browser, when you create a new index by dragging an existing field to an index list, the index file will automatically be built. If you create a new index from scratch in an index list, you will have to build it manually after you've set it up, by reindexing it. With reindexing you build or rebuild an index using the data in the database. This will be necessary if you have modified the index definition, added a new index to an existing database, or if the existing index has become unusable for some reason (e.g. a power cut).
See also