cfile.h File Reference

The CFile library headers and public definitions.

#include <sys/types.h>

Typedefs

typedef struct cfile_struct CFile

Functions

CFile * cfopen (const char *name, const char *mode)
 Open a file for reading or writing.
CFile * cfdopen (int filedesc, const char *mode)
 Open a file from a file descriptor.
off_t cfsize (CFile *fp)
 Returns the _uncompressed_ file size.
int cfeof (CFile *fp)
 Returns true if we've reached the end of the file being read.
char * cfgets (CFile *fp, char *str, int len)
 Get a string from the file, up to a maximum length or newline.
char * cfgetline (CFile *fp, char *line, int *maxline)
 Read a full line from the file, regardless of length.
int cfprintf (CFile *fp, const char *fmt,...)
int cvfprintf (CFile *fp, const char *fmt, va_list ap)
 Print a formatted string to the file, from another function.
int cfread (CFile *fp, void *ptr, size_t size, size_t num)
 Read a block of data from the file.
int cfwrite (CFile *fp, const void *ptr, size_t size, size_t num)
 Write a block of data to the file.
int cfflush (CFile *fp)
 Flush the file's output buffer.
int cfclose (CFile *fp)
 Close the given file handle.


Detailed Description

The CFile library headers and public definitions.


Typedef Documentation

typedef struct cfile_struct CFile


Function Documentation

int cfclose ( CFile *  fp  ) 

Close the given file handle.

This function frees the memory allocated for the file handle and closes the associated file.

Parameters:
fp The file handle to close.
Returns:
The success of the file close operation.

CFile* cfdopen ( int  filedesc,
const char *  mode 
)

Open a file from a file descriptor.

Opens the file specified by the given file descriptor, with the same mode options as a regular file. This was originally necessary to allow access to stdin and stdout, but now that cfopen handles "-" it should be mostly unnecessary.

Parameters:
filedesc An integer file descriptor number.
mode The mode to open the file in ("r" for read, "w" for write).
Returns:
A successfully created file handle, or NULL on failure.
Todo:
Make this detect a compressed input stream, and allow setting of the compression type via the mode parameter for an output stream.

int cfeof ( CFile *  fp  ) 

Returns true if we've reached the end of the file being read.

This mostly passes through the state of the lower-level library's EOF checking. However, bzlib doesn't seem to return BZ_STREAM_END correctly when the stream has actually reached its end, so we have to check another way: whether the last buffer read was zero bytes long.

Parameters:
fp The file handle to check.
Returns:
True (1) if the file has reached EOF, False (0) if not.
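
The zero-length-read check described above can be sketched with plain stdio. This is only an illustration of the idea, not the library's code: sketch_reader, sketch_read and sketch_eof are hypothetical names, fread stands in for the bzlib read call, and the caller is assumed to always ask for at least one byte.

```c
#include <stdio.h>

/* Hypothetical reader state: remembers whether the most recent read
 * returned zero bytes. */
struct sketch_reader {
    FILE *fp;
    int   last_was_empty;
};

/* Reads up to len bytes (len is assumed to be greater than zero) and
 * records whether nothing at all came back. */
size_t sketch_read(struct sketch_reader *r, void *buf, size_t len)
{
    size_t got = fread(buf, 1, len, r->fp);
    r->last_was_empty = (got == 0);
    return got;
}

/* Returns 1 once a read has come back empty, 0 otherwise. */
int sketch_eof(const struct sketch_reader *r)
{
    return r->last_was_empty ? 1 : 0;
}
```

Note that a short but non-empty read does not count as end of file under this scheme; only the following read, which returns nothing, sets the flag.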

int cfflush ( CFile *  fp  ) 

Flush the file's output buffer.

This function flushes any data passed to write or printf but not yet written to disk. If the file is being read, it has no effect.

Parameters:
fp The file handle to flush.
Returns:
The success of the file flush operation.
Note:
For gzip files, under certain compression methods, flushing may reduce compression performance. We use Z_SYNC_FLUSH to write to the nearest byte boundary without unduly impacting compression.

char* cfgetline ( CFile *  fp,
char *  line,
int *  maxline 
)

Read a full line from the file, regardless of length.

With fgets you can't always guarantee you've read an entire line: you have to know the length of the longest line in advance in order to read each line from the file in one call. cfgetline solves this problem by progressively extending the string you pass until the entire line has been read. To do this it uses talloc_realloc and a variable which holds the length of the line allocated so far. If you haven't initialised the line beforehand, cfgetline will do so, allocating it against the file pointer's context. If you have, then talloc_realloc allocates the new space against the context that you originally allocated your buffer against.

In normal usage, this buffer will expand but never contract. It expands to half again its current size, so if you have a very long line lurking in your input somewhere, it will set the buffer size for all the lines after it. If you're concerned about this wasting a lot of memory, negate the length (keeping its absolute value). This signals to cfgetline that it should shrink the line buffer after this line has been read. For example, if your line buffer is currently 1024 and you want it to shrink, set the length to -1024 before calling cfgetline. In reality, this is almost never going to be a problem.

Parameters:
fp The file handle to read from.
line A character array to read the line into, and optionally extend.
maxline A pointer to an integer which will contain the length of the string currently allocated.
Returns:
A pointer to the line thus read. If talloc_realloc has had to move the buffer, this will be different from the line pointer passed in. The correct usage of cfgetline is therefore something like 'line = cfgetline(fp, line, &len);'.
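
The grow-by-half strategy can be sketched in standard C. This is not the library's implementation: it swaps talloc_realloc for plain realloc, reads from a stdio FILE rather than a CFile, and leaves out the negative-length shrink signal. The name sketch_getline and the starting size of 80 are assumptions made for the illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cfgetline. Reads one whole line, growing the
 * buffer to half again its size each time it fills up. At end of input it
 * frees the buffer, sets *maxline to 0 and returns NULL, because plain
 * realloc has no owning context to clean up for us. */
char *sketch_getline(FILE *fp, char *line, int *maxline)
{
    int used = 0;   /* characters stored so far, not counting the '\0' */

    if (line == NULL) {
        *maxline = 80;                      /* assumed starting size */
        line = malloc((size_t)*maxline);
        if (line == NULL)
            return NULL;
    }
    line[0] = '\0';

    for (;;) {
        /* Read into the unused tail of the buffer. */
        if (fgets(line + used, *maxline - used, fp) == NULL) {
            if (used > 0)
                return line;                /* final line had no newline */
            free(line);
            *maxline = 0;
            return NULL;
        }
        used += (int)strlen(line + used);
        if (used > 0 && line[used - 1] == '\n')
            return line;                    /* the whole line is in */

        if (used == *maxline - 1) {         /* buffer full: grow by half */
            int newsize = *maxline + *maxline / 2;
            char *bigger;

            if (newsize <= *maxline)
                newsize = *maxline + 1;     /* guard for tiny buffers */
            bigger = realloc(line, (size_t)newsize);
            if (bigger == NULL) {
                free(line);
                *maxline = 0;
                return NULL;
            }
            line = bigger;
            *maxline = newsize;
        }
    }
}
```

As with cfgetline, the caller must keep the returned pointer: 'line = sketch_getline(fp, line, &len);'. Starting from 80, a 200-character line takes the buffer through 120 and 180 to 270 bytes, and later short lines reuse that 270-byte buffer.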

char* cfgets ( CFile *  fp,
char *  str,
int  len 
)

Get a string from the file, up to a maximum length or newline.

For gzipped and uncompressed files, this simply uses the respective library's fgets implementation. Since bzlib doesn't provide such a function, we have to copy the implementation from stdio.c and use it here, referring to our own bz_fgetc function.

Parameters:
fp The file handle to read from.
str An array of characters to read the file contents into.
len The maximum length, plus one, of the string to read. In other words, if this is 10, then cfgets will read a maximum of nine characters from the file. The character after the last character read is always set to \0 to terminate the string. The newline character is kept on the line if there was room to read it.
See also:
bz_fgetc
Returns:
A pointer to the string thus read.
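
An fgets-style loop built on a single-character read function, as described for the bzip2 case, might look like the sketch below. It is not copied from the library: the callback type sketch_getc_fn stands in for bz_fgetc, whose real signature is not shown here, and str_source and str_next exist only so that the example can run without bzlib.

```c
#include <stddef.h>
#include <stdio.h>   /* for EOF */

/* Hypothetical character source standing in for bz_fgetc: returns the
 * next byte, or EOF when the input is exhausted. */
typedef int (*sketch_getc_fn)(void *ctx);

/* Stores at most len - 1 characters in str, stopping after a newline,
 * and always terminates the string. Returns NULL if the input ended
 * before any character was read. */
char *sketch_gets(char *str, int len, sketch_getc_fn next, void *ctx)
{
    int i = 0;
    int c;

    if (len <= 0)
        return NULL;
    while (i < len - 1 && (c = next(ctx)) != EOF) {
        str[i++] = (char)c;
        if (c == '\n')
            break;              /* the newline is kept on the line */
    }
    if (i == 0 && len > 1)
        return NULL;            /* end of input, nothing read */
    str[i] = '\0';
    return str;
}

/* A trivial in-memory source used to exercise sketch_gets. */
struct str_source {
    const char *data;
    size_t      pos;
};

int str_next(void *ctx)
{
    struct str_source *s = ctx;
    unsigned char c = (unsigned char)s->data[s->pos];

    if (c == '\0')
        return EOF;
    s->pos++;
    return c;
}
```

With a len of 10 and the input "abcdefghijklmnop\n", the first call stores the nine characters "abcdefghi" and the second stores "jklmnop\n", which matches the length rule given for the len parameter.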

CFile* cfopen ( const char *  name,
const char *  mode 
)

Open a file for reading or writing.

Opens the given file in the given mode and returns a CFile handle to it. The mode must start with 'r' to read or 'w' to write; other modes are not expected to work.

Parameters:
name The name of the file to open. If this is "-", then stdin is read from or stdout is written to, as appropriate (both are used uncompressed).
mode "r" to specify reading, "w" for writing.
Returns:
A successfully created file handle, or NULL on failure.
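
The name and mode rules above can be sketched with stdio alone. This shows only the documented conventions (a leading 'r' or 'w', and "-" meaning stdin or stdout); sketch_open is a hypothetical name, it returns a FILE rather than a CFile, and it does nothing about compression.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration of the cfopen conventions using stdio.
 * Returns NULL for any mode that does not start with 'r' or 'w'. */
FILE *sketch_open(const char *name, const char *mode)
{
    if (name == NULL || mode == NULL)
        return NULL;
    if (mode[0] != 'r' && mode[0] != 'w')
        return NULL;                    /* other modes are rejected here */
    if (strcmp(name, "-") == 0)
        return mode[0] == 'r' ? stdin : stdout;
    return fopen(name, mode);
}
```

Rejecting other modes outright is a choice made for the sketch; the description above only says that they are not expected to work.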

int cfprintf ( CFile *  fp,
const char *  fmt,
  ... 
)

int cfread ( CFile *  fp,
void *  ptr,
size_t  size,
size_t  num 
)

Read a block of data from the file.

Reads a given number of structures of a specified size from the file into the memory pointer given. The destination memory must be allocated first. Some read functions take only one size; we use two here because that's what fread requires (and it's better for the programmer anyway IMHO).

Parameters:
fp The file handle to read from.
ptr The memory to write into.
size The size of each structure in bytes.
num The number of structures to read.
Returns:
The success of the file read operation.
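
The size and num pair follows the fread convention, which the sketch below shows with stdio. The record type sketch_point and the function sketch_read_points are hypothetical, and fread is used directly because this page does not say what values cfread returns.

```c
#include <stdio.h>

/* Hypothetical fixed-size record, used only for this illustration. */
struct sketch_point {
    int x;
    int y;
};

/* Reads up to num records into out, which the caller must already have
 * allocated. Returns the number of complete records read, as fread does. */
size_t sketch_read_points(FILE *fp, struct sketch_point *out, size_t num)
{
    return fread(out, sizeof *out, num, fp);
}
```

Passing the record size and the record count separately lets the caller think in whole records rather than in bytes.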

off_t cfsize ( CFile *  fp  ) 

Returns the _uncompressed_ file size.

The common way of reporting your progress through reading a file is as a proportion of the uncompressed size, but a simple stat of the compressed file will give you a much lower figure. So here we extract the size of the uncompressed content of the file. Naturally this is easy with uncompressed files. It's also fairly easy with gzip files: the size is stored as a 32-bit little-endian unsigned integer (the uncompressed length modulo 2^32) in the last four bytes of the file. Unfortunately, bzip2 files do not carry this information, so we have to read the entire file through bzcat and wc -c. This is easier than reading it directly, although it relies on the availability of those two binaries and may therefore make this routine non-portable. I'm not sure if this introduces any security holes in this library. Correspondence with Julian Seward has confirmed that there's no other way of determining the exact uncompressed file size, as it's not stored in the bzip2 file itself.
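
For the gzip case, reading the size field might look like the following sketch. It is not the library's code: sketch_gzip_isize is a hypothetical name, and it takes a stdio stream opened in binary mode rather than a CFile. The field it reads is the ISIZE trailer defined by the gzip format (RFC 1952).

```c
#include <stdint.h>
#include <stdio.h>

/* Reads the last four bytes of the stream and decodes them as a
 * little-endian 32-bit unsigned integer. Returns 0 on success, or -1 if
 * the stream is shorter than four bytes or cannot be read. */
int sketch_gzip_isize(FILE *fp, uint32_t *size_out)
{
    unsigned char b[4];

    if (fseek(fp, -4L, SEEK_END) != 0)
        return -1;
    if (fread(b, 1, sizeof b, fp) != sizeof b)
        return -1;
    *size_out = (uint32_t)b[0]
              | ((uint32_t)b[1] << 8)
              | ((uint32_t)b[2] << 16)
              | ((uint32_t)b[3] << 24);
    return 0;
}
```

Because the field holds the length modulo 2^32, it wraps around for content of 4 GiB or more, and for a file made of several concatenated gzip members it describes only the last one.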

HOWEVER: we can save the next call to cfsize on this file a considerable amount of work if we save the size in a filesystem extended attribute. Because rewriting an existing file truncates it rather than deleting the inode, the attribute may get out of sync with the actual file. So we also write the current time as a timestamp on that data. If the file's mtime is greater than that timestamp, then the data is out of date and must be recalculated. Make sure your file system has the user_xattr option set if you want to use this feature!

Parameters:
fp The file handle to check.
Returns:
The number of bytes in the uncompressed file.
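
The timestamp rule for the cached size can be sketched as a single comparison. The struct sketch_size_cache and the function sketch_cache_is_fresh are hypothetical; how the library actually lays out the extended attribute is not described on this page, so the sketch starts from values that have already been read.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical in-memory form of the cached attribute: the uncompressed
 * size and the time at which it was recorded. */
struct sketch_size_cache {
    off_t  size;
    time_t stamp;
};

/* Returns 1 if the cached size may be used, or 0 if the file has been
 * modified since the stamp was written and the size must be recomputed. */
int sketch_cache_is_fresh(const struct sketch_size_cache *cache,
                          const struct stat *st)
{
    return st->st_mtime > cache->stamp ? 0 : 1;
}
```

A file modified in the same second that the stamp was written is treated as fresh here, which follows the "greater than" wording above.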

int cfwrite ( CFile *  fp,
const void *  ptr,
size_t  size,
size_t  num 
)

Write a block of data to the file.

Writes a given number of structures of a specified size into the file from the memory pointer given.

Parameters:
fp The file handle to write into.
ptr The memory to read from.
size The size of each structure in bytes.
num The number of structures to write.
Returns:
The success of the file write operation.

int cvfprintf ( CFile *  fp,
const char *  fmt,
va_list  ap 
)

Print a formatted string to the file, from another function.

The CFile counterpart of the standard vfprintf, for those who have to receive a '...' argument in their own function and send it on to a CFile.

Parameters:
fp The file handle to write to.
fmt The format string to print.
ap The compiled va_list of parameters to print.
Returns:
The success of the file write operation.
Todo:
Should we be reusing a buffer rather than allocating one each time?
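
The pattern this function supports, a variadic function that hands its arguments on as a va_list, can be sketched with the standard library. The names sketch_vformat and sketch_format are hypothetical. Formatting into a freshly allocated buffer is an assumption suggested by the Todo above, not a confirmed detail of the implementation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats into a newly allocated buffer sized to fit exactly. The caller
 * frees the result. Returns NULL on failure. */
char *sketch_vformat(const char *fmt, va_list ap)
{
    va_list probe;
    int need;
    char *buf;

    va_copy(probe, ap);                 /* measuring consumes a va_list */
    need = vsnprintf(NULL, 0, fmt, probe);
    va_end(probe);
    if (need < 0)
        return NULL;

    buf = malloc((size_t)need + 1);
    if (buf == NULL)
        return NULL;
    vsnprintf(buf, (size_t)need + 1, fmt, ap);
    return buf;
}

/* A caller-side wrapper: receives '...' and passes it on as a va_list,
 * in the way one would call cvfprintf from one's own function. */
char *sketch_format(const char *fmt, ...)
{
    va_list ap;
    char *out;

    va_start(ap, fmt);
    out = sketch_vformat(fmt, ap);
    va_end(ap);
    return out;
}
```

The va_copy matters because vsnprintf is called twice, once to measure and once to format, and a va_list cannot be reused after a function has consumed it.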


Generated on Fri Jan 23 11:58:35 2009 for CFile by  doxygen 1.4.7