The CFile Library

Introduction

Put simply, this library lets your code read or write a file whether it is uncompressed or compressed with bzip2 or gzip. It detects the compression type from the file's extension and wraps the appropriate library routines in a common interface. If the file name is "-", then stdin or stdout is opened as appropriate. As a further service, the cfgetline() routine reads lines of any length from your input file, automatically resizing the buffer to suit. Other convenience routines, such as cfsize(), are also provided.
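The detection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names file_kind, classify_name() and stdio_stream_for_mode() are hypothetical and are not taken from CFile's source, and the '.gz' and '.bz2' suffixes are the ones listed in the Notes section.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for illustration; not CFile's own identifiers. */
typedef enum { KIND_PLAIN, KIND_GZIP, KIND_BZIP2, KIND_STDIO } file_kind;

/* Return 1 if name ends with suffix, 0 otherwise. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Classify a file name: "-" means a standard stream, otherwise
 * the extension decides which set of routines would be used. */
file_kind classify_name(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "-") == 0) return KIND_STDIO;
    if (has_suffix(name, ".gz")) return KIND_GZIP;
    if (has_suffix(name, ".bz2")) return KIND_BZIP2;
    return KIND_PLAIN;
}

/* For "-", pick the standard stream that matches the open mode. */
FILE *stdio_stream_for_mode(const char *mode)
{
    assert(mode != NULL);
    return (mode[0] == 'r') ? stdin : stdout;
}
```

A caller would switch on the result of classify_name() to decide whether to hand the file to stdio, zlib or libbz2.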

Requirements

The following libraries are required for CFile:

Optional extras

If the libmagic library is available when the cfile library is compiled, libmagic will be used to determine the type of files being read from their contents. Files being written will still have their type determined by their file extension.
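To show what content-based detection means in practice, here is a minimal sketch that checks the leading bytes of a buffer by hand. It is a simplified stand-in for what libmagic does, not a use of libmagic itself, and sniff_kind and sniff_header() are hypothetical names. The signatures checked are the well-known ones: gzip data starts with the bytes 0x1f 0x8b, and bzip2 data starts with "BZh" followed by a block-size digit from '1' to '9'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not part of CFile or libmagic. */
typedef enum { SNIFF_UNKNOWN, SNIFF_GZIP, SNIFF_BZIP2 } sniff_kind;

/* Guess the compression type from the first bytes of a file. */
sniff_kind sniff_header(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    /* gzip: two-byte magic number 0x1f 0x8b */
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
        return SNIFF_GZIP;
    /* bzip2: "BZh" followed by the block size, '1' to '9' */
    if (len >= 4 && memcmp(buf, "BZh", 3) == 0 &&
        buf[3] >= '1' && buf[3] <= '9')
        return SNIFF_BZIP2;
    return SNIFF_UNKNOWN;
}
```

Checking the contents rather than the name means a compressed file with a missing or misleading extension can still be read correctly.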

To save the uncompressed size of a bzip2 file once it has been calculated, your file system must have extended user attributes enabled. This is done by setting the user_xattr option in the mount table; you may need to remount the file system with mount -o remount /mountpoint for the option to take effect. If the option is not set, or the extended user attribute cannot be written for some other reason, no harm is done: the size will simply be calculated from scratch each time.
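The caching behaviour described above might look something like the sketch below. It assumes Linux, where getxattr() and setxattr() are declared in <sys/xattr.h>. The attribute name and the function names are hypothetical, and the expensive step is represented by simply counting the bytes of the file; in the real library that step would involve decompressing the bzip2 data.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>   /* Linux-specific */

/* Hypothetical attribute name; CFile's real one may differ. */
#define SIZE_ATTR "user.example.uncompressed_size"

/* Stand-in for the expensive calculation: read the whole file. */
static long long count_bytes(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return -1;
    long long total = 0;
    char buf[4096];
    size_t got;
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long long)got;
    fclose(fp);
    return total;
}

/* Return the size, using the cached attribute when one exists. */
long long cached_size(const char *path)
{
    assert(path != NULL);
    char text[32];
    ssize_t len = getxattr(path, SIZE_ATTR, text, sizeof text - 1);
    if (len > 0) {
        text[len] = '\0';            /* attribute values are not terminated */
        return strtoll(text, NULL, 10);
    }
    long long size = count_bytes(path);
    if (size >= 0) {
        int n = snprintf(text, sizeof text, "%lld", size);
        /* Failure is deliberately ignored: the size is just not cached. */
        (void)setxattr(path, SIZE_ATTR, text, (size_t)n, 0);
    }
    return size;
}
```

Note that this sketch never invalidates the stored value, so a real implementation would also need to handle a file that changes after its size has been cached.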

Aims

To allow you to read or write a file whether it is compressed or not.

To provide extra, useful functions like cfgetline().

To provide a consistent parameter-passing interface, so that you do not have to know exactly what is passed where and in what form.
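As an illustration of the kind of convenience cfgetline() is meant to provide, the following sketch reads a line of any length from a plain stdio stream, doubling the buffer whenever the line does not fit. It is not CFile's implementation and does not use its signature: read_long_line() is a hypothetical name and it works on a FILE pointer only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from fp into *buf, growing it as needed.
 * Start with *buf == NULL and *cap == 0, or pass in a buffer from an
 * earlier call. Returns *buf, or NULL at end of file or if memory
 * cannot be allocated. */
char *read_long_line(FILE *fp, char **buf, size_t *cap)
{
    assert(fp != NULL && buf != NULL && cap != NULL);
    size_t len = 0;
    if (*buf == NULL || *cap < 2) {
        char *fresh = realloc(*buf, 16);
        if (fresh == NULL) return NULL;
        *buf = fresh;
        *cap = 16;
    }
    for (;;) {
        /* fgets fills the free space after what has been read so far. */
        if (fgets(*buf + len, (int)(*cap - len), fp) == NULL)
            return len > 0 ? *buf : NULL;
        len += strlen(*buf + len);
        if (len > 0 && (*buf)[len - 1] == '\n')
            return *buf;             /* a complete line was read */
        /* No newline yet: double the buffer and keep reading. */
        char *bigger = realloc(*buf, *cap * 2);
        if (bigger == NULL) return NULL;
        *buf = bigger;
        *cap *= 2;
    }
}
```

The caller keeps ownership of the buffer, reuses it across calls and frees it once when finished, so a long file does not cause one allocation per line.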

Notes

The file extension for gzip files is '.gz'.

The file extension for bzip2 files is '.bz2'.

If an uncompressed file is being read, the stdio routines will always be used, despite zlib supporting opening and reading both gzip-compressed files and uncompressed files.

CFile files do not support random access, simultaneous read and write access, or appending.

Todo:
Add better error and EOF checking, particularly for bzip2.
Todo:
Allow only read or write modes, with no appending.
Todo:
Allow extra parameters in the mode string to specify compression options.
Todo:
Use the buffer to write to: avoids allocating a new temporary buffer upon each cfprintf() and cvfprintf().
Todo:
Tridge noted that the standard implementation of stdio has pointers in the file handle that refer to the functions that are called when performing operations on that file handle. It may therefore be possible to provide a wrapper that allows callers to simply replace #include <stdio.h> with #include <cfile.h>, after which all file operations would happen transparently. The modified fopen would determine the file type and update the jump block with the relevant functions (either direct calls to the functions in e.g. zlib, or wrappers that implement the correct semantics). The whole thing would then be a 'drop-in' replacement for stdio, rather than requiring modification of existing code.
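One existing mechanism along these lines is glibc's fopencookie(), which returns an ordinary FILE pointer whose read, write, seek and close operations are supplied by the caller. The sketch below is only a guess at how such a wrapper could start and assumes a glibc system. The backend here serves bytes from memory as a stand-in for a decompressor, and struct mem_source, mem_read() and open_mem_source() are hypothetical names.

```c
#define _GNU_SOURCE      /* fopencookie is a glibc extension */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical backend: serves bytes from memory, standing in for
 * a decompressor such as zlib or libbz2. */
struct mem_source {
    const char *data;
    size_t len;
    size_t pos;
};

/* Read callback: copy up to size bytes into buf. */
static ssize_t mem_read(void *cookie, char *buf, size_t size)
{
    struct mem_source *src = cookie;
    size_t left = src->len - src->pos;
    if (size > left) size = left;
    memcpy(buf, src->data + src->pos, size);
    src->pos += size;
    return (ssize_t)size;            /* 0 signals end of file */
}

/* Wrap the backend in a FILE pointer usable with fgets, fscanf, etc. */
FILE *open_mem_source(struct mem_source *src)
{
    assert(src != NULL);
    cookie_io_functions_t io = {
        .read = mem_read, .write = NULL, .seek = NULL, .close = NULL
    };
    return fopencookie(src, "r", io);
}
```

Existing code that calls fgets() or fscanf() on the returned pointer needs no changes, which is the property the Todo item is after. BSD systems offer funopen() for a similar purpose.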

Generated on Fri Jan 23 11:58:34 2009 for CFile by  doxygen 1.4.7