Description:
I've written a small class to handle a gzip pipe that won't read the whole source file at once, but will deliver small chunks of data on demand.
Source:
from gzip import GzipFile
from StringIO import StringIO


class GZipPipe(StringIO):
    """This class implements a compression pipe suitable for asynchronous
    processing.
    Only one buffer of data is read/compressed at a time.
    The process doesn't read the whole file at once: this improves performance
    and prevents high memory consumption for big files."""

    CHUNCK_SIZE = 1024

    def __init__(self, source=None, name="data"):
        """Constructor
        @param source Source data to compress (as a stream/File/Buffer - anything with a read() method)
        @param name Name of the data within the gzip file"""
        self.source = source
        self.source_eof = False
        # The buffer must exist before the GzipFile is created: its
        # constructor already writes the gzip header through write().
        self.buffer = ""
        StringIO.__init__(self)
        self.zipfile = GzipFile(name, 'wb', 9, self)

    def write(self, data):
        """The write method shouldn't be called from outside.
        A GzipFile was created with this object as its output buffer, and it
        fills the buffer whenever we write to it (calling the read method of this object does that for you).
        """
        self.buffer += data

    def read(self, size=-1):
        """Calling read() on a zip pipe will pull data from the source stream.
        @param size Maximum size to read - Read whole compressed file if not specified.
        @return Compressed data"""
        while (size < 0 or len(self.buffer) < size) and not self.source_eof:
            if self.source is None:
                break
            chunk = self.source.read(GZipPipe.CHUNCK_SIZE)
            if chunk:
                self.zipfile.write(chunk)
            else:
                # An empty read means end of input; a short read does not.
                self.source_eof = True
                # close() flushes the compressor and writes the gzip trailer.
                self.zipfile.close()
        if size < 0:
            result = self.buffer
            self.buffer = ""
        else:
            # Also covers size == 0: return nothing and keep the buffer.
            result = self.buffer[:size]
            self.buffer = self.buffer[size:]
        return result
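The recipe targets Python 2 (the `StringIO` module and `str` buffers). A usage sketch under Python 3, where the compressed stream is `bytes`, might look like the following. `GzipPipe3` and its `pending` attribute are hypothetical names for an illustrative port built on `io.BytesIO`; they are not part of the original recipe.

```python
import gzip
import io


class GzipPipe3(io.BytesIO):
    """Hypothetical Python 3 port of the recipe's pipe (bytes in, bytes out)."""

    CHUNK_SIZE = 1024

    def __init__(self, source, name="data"):
        super().__init__()
        self.source = source
        self.source_eof = False
        # Must exist before GzipFile is created: the constructor already
        # writes the gzip header through write().
        self.pending = b""
        self.zipfile = gzip.GzipFile(name, "wb", 9, self)

    def write(self, data):
        # Called by GzipFile only: collect the compressed bytes.
        self.pending += data
        return len(data)

    def read(self, size=-1):
        # Feed the compressor until enough output is available or input ends.
        while (size < 0 or len(self.pending) < size) and not self.source_eof:
            chunk = self.source.read(self.CHUNK_SIZE)
            if chunk:
                self.zipfile.write(chunk)
            else:
                self.source_eof = True
                self.zipfile.close()  # flushes and appends the gzip trailer
        if size < 0:
            size = len(self.pending)
        result, self.pending = self.pending[:size], self.pending[size:]
        return result


payload = b"some highly repetitive payload\n" * 2000
pipe = GzipPipe3(io.BytesIO(payload))

compressed = b""
while True:
    piece = pipe.read(512)  # never more than 512 bytes per call
    if not piece:
        break
    compressed += piece

# Round-trip check: the collected pieces form one valid gzip stream.
restored = gzip.decompress(compressed)
```

The 512-byte read size is only an example; a socket server would typically ask for whatever amount it is ready to send.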
Discussion:
This is useful when writing a single-threaded server based on sockets.
It improves performance a lot when dealing with very large files, since the file is never loaded into memory in one piece.
This class uses an internal buffer.
When the "read" method is called, it feeds the internal gzip object, which writes the compressed data back to the buffer. Once the buffer is large enough (at least the requested size), or the source is exhausted, the read method returns the requested amount of data.
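The feed-and-drain mechanism can be isolated in a few lines. This is a minimal sketch, assuming Python 3; the `Sink` class and the `read_compressed` helper are hypothetical names used only to show that the gzip object writes into whatever object it was given, and that reading is just slicing what has accumulated there.

```python
import gzip
import io


class Sink:
    """Minimal write target; here GzipFile only calls write() on it."""

    def __init__(self):
        self.data = b""

    def write(self, data):
        self.data += data
        return len(data)

    def flush(self):
        # Present in case the gzip object is flushed explicitly.
        pass


def read_compressed(source, gz, sink, size, chunk_size=1024):
    """Feed `gz` from `source` until `sink` holds `size` bytes, then slice."""
    while len(sink.data) < size and not gz.closed:
        chunk = source.read(chunk_size)
        if chunk:
            gz.write(chunk)
        else:
            gz.close()  # flushes the compressor and appends the trailer
    result, sink.data = sink.data[:size], sink.data[size:]
    return result


original = b"abcdefgh" * 10000
source = io.BytesIO(original)
sink = Sink()
gz = gzip.GzipFile("data", "wb", 9, sink)  # the gzip header lands in sink here

pieces = []
while True:
    piece = read_compressed(source, gz, sink, 256)
    if not piece:
        break
    pieces.append(piece)

restored = gzip.decompress(b"".join(pieces))
```

How much compressed data appears in the sink after each write depends on the compressor's internal buffering, which is why the loop keeps feeding until enough bytes are available.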
I would like to do the same job for tar files, but the tarfile module doesn't seem to provide a way to add just a chunk of data; only a whole file at once.
I'd be glad if someone could help.