Directory Files


 

A directory file is represented by an operating system directory, and the records within it by operating system files. The record key is the name of the operating system file holding the data for the record, except where this would be an invalid name, in which case QM performs automatic name mapping as described below.

 

Directory files do not give high performance because the process of searching a directory for a file is, with many operating systems, essentially a linear scan. Locating a record to be read would, therefore, require on average that half of the entries in the directory are examined. Writing a new record would require the entire directory to be processed to verify that the file does not already exist.

 

Directory files are mainly used for data that is to be processed from outside of QM or for very large records (hundreds of kilobytes) where the operating system file structures may give better performance than the hashed file system. Typical uses include storage of QMBasic programs, COMO (command output) files, and saved select lists.

 

When a record is written to a directory file, any field mark characters are converted to the operating system dependent representation of a newline. Thus, each field becomes a line of text which allows the data to be processed by external software that does not understand the concept of field marks. Conversely, when data is read from a directory file, the newlines are translated to field marks. Where the data contains value marks or subvalue marks, these are not translated as it is assumed that whatever software will process this data must understand multivalued data.
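The translation described above can be sketched from the point of view of external software. This is a minimal illustration, not QM's own code. It assumes the conventional single-byte mark values (field mark 254, value mark 253, subvalue mark 252) and a Unix-style newline; the function names `to_disk` and `from_disk` are hypothetical.

```python
# Sketch only: assumes single-byte marks and a "\n" newline.
# On Windows the on-disk newline would be "\r\n" instead.
FIELD_MARK = b"\xfe"     # char 254
VALUE_MARK = b"\xfd"     # char 253
SUBVALUE_MARK = b"\xfc"  # char 252
NEWLINE = b"\n"          # assumption: Unix-style newline


def to_disk(record: bytes) -> bytes:
    """Field marks become newlines; value and subvalue marks are untouched."""
    return record.replace(FIELD_MARK, NEWLINE)


def from_disk(data: bytes) -> bytes:
    """Newlines become field marks when the record is read back."""
    return data.replace(NEWLINE, FIELD_MARK)


# A three-field record whose second field holds two values.
record = b"SMITH" + FIELD_MARK + b"red" + VALUE_MARK + b"blue" + FIELD_MARK + b"42"
on_disk = to_disk(record)
```

In this sketch `on_disk` holds three lines of text, and the value mark inside the second line is still present, so an external program that reads the file must cope with it.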

 

One common use of directory files is to store scanned documents, digital photographs, etc. In this case, the data is not text divided into fields using the field mark character but is simple binary data that may contain any sequence of bytes. The data will nearly always contain bytes that appear to be field marks and other bytes that are the ASCII linefeed character. On writing the data to disk, the field marks will be converted to newlines. On reading the record back again, all of the newlines, including those that were linefeed bytes in the original data, are converted to field marks, so the record no longer matches the data that was written. This is clearly unacceptable. Application developers using directory files to store binary data must suppress the translation of field marks by use of the QMBasic MARK.MAPPING statement.
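The corruption of binary data can be reproduced with a few bytes. The sketch below is an illustration under the same assumptions as before (field mark is byte 254, newline is a single linefeed byte); `round_trip` is a hypothetical helper, not a QM function.

```python
FIELD_MARK = b"\xfe"  # assumption: single-byte field mark (char 254)
NEWLINE = b"\n"       # assumption: Unix-style newline (byte 0x0A)


def round_trip(data: bytes) -> bytes:
    """Write with mark mapping enabled, then read the result back."""
    on_disk = data.replace(FIELD_MARK, NEWLINE)   # write: field marks -> newlines
    return on_disk.replace(NEWLINE, FIELD_MARK)   # read: newlines -> field marks


# Binary data containing both a linefeed byte and a byte that looks like a field mark.
image = bytes([0x89, 0x0A, 0xFE, 0x00, 0x0A])
restored = round_trip(image)
```

Here the two original linefeed bytes come back as 0xFE, so `restored` differs from `image` even though only one byte in the original looked like a field mark.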

 

Where a record id contains characters that are not valid in operating system file names, QM automatically replaces them with an alternative representation. This is totally invisible from inside QM but other software that accesses directory file records must allow for these translations. Rather than have a different set of translations for each platform, QM adopts a single set based on the most restrictive platform (Windows) so that data may be moved between environments without modification of record names. The translations performed are:

Character   Stored as
*           %A
"           %Q
\           %B
/           %S
,           %C
+           %V
=           %E
:           %X
>           %G
;           %Y
<           %L
?           %Z
%           %P
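External software that needs to find or list records can apply the same table. The following is a minimal sketch that covers only the translations listed above; the function names `id_to_file_name` and `file_name_to_id` are hypothetical, and any other mapping rule QM may apply is outside its scope.

```python
# Translation table copied from the list above.
ID_TO_NAME = {
    "*": "%A", '"': "%Q", "\\": "%B", "/": "%S", ",": "%C",
    "+": "%V", "=": "%E", ":": "%X", ">": "%G", ";": "%Y",
    "<": "%L", "?": "%Z", "%": "%P",
}
NAME_TO_ID = {code: char for char, code in ID_TO_NAME.items()}


def id_to_file_name(record_id: str) -> str:
    """Map a record id to the operating system file name, one character at a time."""
    return "".join(ID_TO_NAME.get(ch, ch) for ch in record_id)


def file_name_to_id(name: str) -> str:
    """Reverse the mapping by decoding each two-character %x sequence."""
    out = []
    i = 0
    while i < len(name):
        pair = name[i:i + 2]
        if pair in NAME_TO_ID:
            out.append(NAME_TO_ID[pair])
            i += 2
        else:
            out.append(name[i])
            i += 1
    return "".join(out)
```

For example, the record id `INV/2024*A` maps to the file name `INV%S2024%AA`. Because a literal `%` is itself stored as `%P`, the decoding is unambiguous.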

 

 

 

Depending on the operating system in use, record ids in directory files may be case insensitive.

 

Note also that the Windows file system does not allow file names that clash with Windows device names such as CON, PRN, NUL or COM1.

 

When writing a record to a directory file, QM normally opens the operating system file that will represent this record and writes to it, overwriting any existing data. There is a possibility of data loss when updating an existing record if the system fails during this write (e.g. a power outage) or if there is insufficient disk space. To prevent this, the SAFEDIR configuration parameter can be set to adopt a "safe update" technique where the data is written to a temporary file, the original is deleted and the temporary item is renamed to replace the original. This removes nearly all possibility of losing the record but degrades performance of the write.
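The "safe update" sequence can be sketched as follows. This illustrates the three steps described above (write a temporary file, delete the original, rename the temporary file) and is not QM's actual implementation; the function name `safe_write` and the choice to create the temporary file in the same directory are assumptions.

```python
import os
import tempfile


def safe_write(path: str, data: bytes) -> None:
    """Replace the file at `path` without overwriting it in place."""
    directory = os.path.dirname(path) or "."
    # Step 1: write the new data to a temporary file in the same directory,
    # so that the later rename does not cross file systems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the data to disk before going on
    except BaseException:
        os.remove(tmp_path)  # the original is still intact at this point
        raise
    # Step 2: delete the original, if there is one.
    if os.path.exists(path):
        os.remove(path)
    # Step 3: rename the temporary file to take the original's place.
    os.rename(tmp_path, path)
```

If the write in step 1 fails, for instance because the disk is full, the original file has not yet been touched. The extra file creation, sync and rename are also why this approach is slower than writing in place.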

 

Records in directory files may be read, written or deleted by applications in exactly the same way as records in hashed files. The QMBasic programming language provides some additional operations for directory file access. A record may be opened using the OPENSEQ statement and then processed on a line by line basis (READSEQ, WRITESEQ, etc) or as a simple binary item (READBLK, WRITEBLK, etc). In addition, programming statements are provided to simplify processing of comma separated variable format data (READCSV, WRITECSV).