Abstract:

Providing a general-purpose search service for genomic databases requires remote retrieval of large genomic database files. GenBank, for example, contained almost 23 gigabytes of data in the latest release.
 
For such applications, it is desirable to minimise the cost of transmitting data from the server hosting the database files to the search server. We describe a new tool, which we call latte, for compressed retrieval of flat-file sequence records. We show that latte typically reduces both the time and the communication cost of retrieving GenBank files by more than 60%.