
I often return database result sets as a list of dictionary objects. When the pickled list is sent over the wire, the size of the pickle greatly affects the transmission speed.

I wrote this small class to emulate a list of dictionary objects without the memory and pickle-size overhead incurred by storing every row in the list as a separate dictionary.

Python, 53 lines
#!/usr/bin/python2.4
import types

class Table(object):
    """A compact structure that emulates a list of dicts."""
    def __init__(self, *args):
        self.columns = args
        self.rows = []
        
    def _createRow(self, k,v):
        return dict(zip(k, v))
    
    def append(self, row):
        if isinstance(row, dict):  # reorder dict values into column order
            row = [row[x] for x in self.columns]
        row = tuple(row)
        if len(row) != len(self.columns):
            raise TypeError('Row must contain %d elements.' % len(self.columns))
        self.rows.append(row)
    
    def __iter__(self):
        for row in self.rows:
            yield self._createRow(self.columns, row)
    
    def __getitem__(self, i):
        return self._createRow(self.columns, self.rows[i])
    
    def __setitem__(self, i, row):
        if isinstance(row, dict):  # reorder dict values into column order
            row = [row[x] for x in self.columns]
        row = tuple(row)
        if len(row) != len(self.columns):
            raise TypeError('Row must contain %d elements.' % len(self.columns))
        self.rows[i] = row
    
    def __repr__(self):
        return "<%s object at 0x%x %s, %d rows.>" % (  # %x: id shown in hex
            self.__class__.__name__, id(self), self.columns, len(self.rows))




if __name__ == "__main__":
    import pickle
    t = Table("a","b","c")
    for i in xrange(10000):
        t.append((1,2,3))
    print "Table size when pickled:",len(pickle.dumps(t))
    
    t = []
    for i in xrange(10000):
        t.append({"a":1,"b":2,"c":3})
    print "List size when pickled: ",len(pickle.dumps(t))
    

    
  

The pickled table is roughly 50% smaller than the pickled list of dictionaries, which provides a worthwhile speedup when sending the pickle over a slow or busy connection.
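
The exact saving depends on the pickle protocol and on the data, so it is worth measuring on your own result sets. The following is only a rough sketch, written for Python 3 rather than the Python 2.4 of the recipe. The helper name pickled_sizes is hypothetical, and it pickles the raw column tuple and row tuples instead of a Table instance, which approximates the same layout. Each row holds distinct values so that pickle's memoization of a repeated tuple does not flatter the compact layout.

```python
import pickle


def pickled_sizes(n_rows, protocol):
    """Return (compact_size, dict_size) in bytes for the same n_rows of data."""
    columns = ("a", "b", "c")
    # Compact layout: column names stored once, each row as a plain tuple.
    rows = [(i, i + 1, i + 2) for i in range(n_rows)]
    # Conventional layout: every row carries its own dictionary.
    dict_rows = [dict(zip(columns, row)) for row in rows]
    compact = len(pickle.dumps((columns, rows), protocol))
    as_dicts = len(pickle.dumps(dict_rows, protocol))
    return compact, as_dicts


# Compare the oldest text protocol with the newest binary one.
for protocol in (0, pickle.HIGHEST_PROTOCOL):
    compact, as_dicts = pickled_sizes(10000, protocol)
    print("protocol %d: compact=%d bytes, dicts=%d bytes (ratio %.2f)"
          % (protocol, compact, as_dicts, compact / as_dicts))
```

The ratio printed for each protocol shows how much of the saving remains once the binary protocols have already compressed the repeated dictionary keys.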

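For appending, indexing, item assignment, and iteration it works just like a regular list of dictionaries, except that the dictionary returned by the __getitem__ or __iter__ calls is generated dynamically, so changes made to a returned dictionary are not written back to the table. It does not support slices, but these could easily be added if needed.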
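
One possible way to add slice support is sketched below, again for Python 3 and not as part of the original recipe. It is a cut-down Table with only append, __len__, and __getitem__. Returning a new Table for a slice (rather than a list of dictionaries) is an assumption made here to keep the compact storage, and __len__ is an addition the recipe does not have.

```python
class Table(object):
    """Cut-down Table: column names stored once, rows stored as tuples."""

    def __init__(self, *columns):
        self.columns = columns
        self.rows = []

    def append(self, row):
        if isinstance(row, dict):
            row = [row[name] for name in self.columns]
        row = tuple(row)
        if len(row) != len(self.columns):
            raise TypeError('Row must contain %d elements.' % len(self.columns))
        self.rows.append(row)

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Assumption: a slice yields a new Table sharing the same columns,
            # so the result stays compact instead of becoming a list of dicts.
            sliced = Table(*self.columns)
            sliced.rows = self.rows[index]
            return sliced
        # A single index builds a fresh dict on every call.
        return dict(zip(self.columns, self.rows[index]))


table = Table("a", "b", "c")
for i in range(5):
    table.append((i, i * 10, i * 100))

middle = table[1:3]   # a new Table holding rows 1 and 2
first = table[0]
first["a"] = 99       # changes only the temporary dict, not the table
```

Because each lookup builds a new dictionary, table[0]["a"] is still 0 after the assignment above. Use __setitem__ from the recipe to change a stored row.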

2004/11/09: Changed .__setitem__ and .append so that lists, tuples, or dicts can be inserted.

1 comment

Ian Bicking 19 years, 5 months ago

dbrow

dbrow does something like this, though it probably wouldn't work in a pickle. But it is memory efficient and speed efficient. Available at:

http://opensource.theopalgroup.com/

Created by S W on Sat, 6 Nov 2004 (PSF)