ActiveState Powered by ActiveState

Recipe 325205: Cache decorator in python 2.4


The latest version of Python introduced a new language feature, function and method decorators (PEP 318, http://www.python.org/peps/pep-0318.html). This recipe show a common callable transformation that can benefit from the new syntax, often referred to as Memoization pattern.

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
import types

def cachedmethod(function):
    return types.MethodType(Memoize(function), None)


class Memoize:
    def __init__(self,function):
        self._cache = {}
        self._callable = function
            
    def __call__(self, *args, **kwds):
        cache = self._cache
        key = self._getKey(*args,**kwds)
        try: return cache[key]
        except KeyError:
            cachedValue = cache[key] = self._callable(*args,**kwds)
            return cachedValue
    
    def _getKey(self,*args,**kwds):
        return kwds and (args, ImmutableDict(kwds)) or args    


class ImmutableDict(dict):
    '''A hashable dict.'''

    def __init__(self,*args,**kwds):
        dict.__init__(self,*args,**kwds)
    def __setitem__(self,key,value):
        raise NotImplementedError, "dict is immutable"
    def __delitem__(self,key):
        raise NotImplementedError, "dict is immutable"
    def clear(self):
        raise NotImplementedError, "dict is immutable"
    def setdefault(self,k,default=None):
        raise NotImplementedError, "dict is immutable"
    def popitem(self):
        raise NotImplementedError, "dict is immutable"
    def update(self,other):
        raise NotImplementedError, "dict is immutable"
    def __hash__(self):
        return hash(tuple(self.iteritems()))


if __name__ == '__main__':
    from math import sqrt,log,sin,cos
        
    class Example:
        def __init__(self,x,y):
            # self._x and self._y should not be changed after initialization
            self._x = x
            self._y = y
        
        @cachedmethod
        def computeSomething(self, alpha, beta):
            w = log(alpha) * sqrt(self._x * alpha + self._y * beta)
            z = log(beta) * sqrt(self._x * beta + self._y * alpha)
            return sin(z) / cos(w)

Discussion

Memoization is a useful pattern for improving the amortized performance of a computationally expensive function or method. The new syntax allows making explicit that a callable is memoized.

A restriction of the recipe above is that the arguments of the function (and also the method receiver object for methods) have to be hashable.

The decorator can be generalized by allowing different caching policies (e.g. a FIFO cache or a cache implementing an LRU policy) apart from the implied "cache-forever" policy of a dict.

Comments

  1. 1. At 3:36 a.m. on 29 oct 2004, Alex Naanou said:

    to make this more generic I'd update the __call__ method to:

    def __call__(self, *args, **kw):
        cache = self._cache
        try:
            return cache[(args, kw)]
        except KeyError:
            cachedValue = cache[(args, kw)] = self._callable(*args, **kw)
            return cachedValue
    
  2. 2. At 6:03 a.m. on 31 oct 2004, Sridhar R said:

    Memory boom! Memoization is nice technique. But making it implicit will lead to programmers forgetfullness in controlling memory usage the cache takes.

    "Explicit is better than Implicit"

  3. 3. At 11:08 a.m. on 1 nov 2004, George Sakkis (the author) said:

    Handling *keywords. Unfortunately it's not that straightforward; a dict is not hashable, so you can't use it as a key in the cache. One way to make this work is to define an ImmutableDict subclass that is (__hash__)able. Then wrap *kwds into such an ImmutableDict before caching them. However I wanted to keep this recipe simple, hence the less general solution.

    Anyway, I updated the recipe for handling keyword arguments too.

  4. 4. At 11:31 a.m. on 1 nov 2004, George Sakkis (the author) said:

    That's what is neat about decorators; they make it explicit at the point of method definition. It is also straightforward to replace the dict used for cache with any other appropriate class to address issues such as memory consumption, efficiency or anything else that is important for the application:

    def cachedmethod(cache = dict()):
        return lambda function: types.MethodType(Memoize(function,cache), None)
    
    class Memoize:
        def __init__(self, function, cache = dict()):
            self._callable = function
            self._cache = cache
    
         # rest methods
         # (...)
    
    # uses a dict
    @cachedmethod()
    def example1(x,y,z):
        pass
    
    # uses an LRU policy with a cache of size 1000
    @cachedmethod(cache = LRU(size=1000))
    def example2(x,y,z):
        pass
    
  5. 5. At 10:12 p.m. on 10 nov 2004, Connelly Barnes said:

    Simple is better than complex.

    import pickle, copy
    def memoize(f, cache={}):
      d = copy.copy(cache)
      def g(*args, **kwargs):
        key = pickle.dumps((args, kwargs))
        if not key in d:
          d[key] = f(*args, **kwargs)
        return d[key]
      return g
    

    Works with builtin dicts, lists, allows arbitrary cache objects. Yes, this will use up your memory and crash your system if the cache gets too big.

  6. 6. At 9:21 a.m. on 27 apr 2006, Chris Spencer said:

    Broken. This doesn't work with Python 2.4.

    Traceback (most recent call last):
      File "memo-test.py", line 48, in ?
        class Example:
      File "memo-test.py", line 54, in Example
        @cachedmethod
      File "memo-test.py", line 4, in cachedmethod
        return types.MethodType(Memoize(function), None)
    TypeError: unbound methods must have non-NULL im_class
    
  7. 7. At 11:29 a.m. on 27 apr 2006, Chris Spencer said:

    A working example. This will corretly decorate a method:

    def memoize(function):
        return Memoize(function)
    
    class Memoize(object):
        def __init__(self, fn):
            self.cache={}
            self.fn=fn
    
        def __get__(self, instance, cls=None):
            self.instance = instance
            return self
    
        def __call__(self,*args):
            if self.cache.has_key(args):
                return self.cache[args]
            else:
                object = self.cache[args] = self.fn(self.instance, *args)
                return object
    
  8. 8. At 5:52 a.m. on 29 jun 2006, Yair Chuchem said:

    set. You can use set instead of ImmutableDict

  9. 9. At 1:59 a.m. on 20 jun 2007, Radek Szklarczyk said:

    Beautiful is better than ugly. The function below is a nice compromise between simplicity and power. Unlike the version above you can use this decorator to safely decorate more than one function (thanks to the fact that functions are hashable).

    def memoize(f, cache={}):
        def g(*args, **kwargs):
            key = ( f, tuple(args), frozenset(kwargs.items()) )
            if key not in cache:
                cache[key] = f(*args, **kwargs)
            return cache[key]
        return g
    

    Note that args list is converted to a tuple (which is hashable, unlike the list) and the dictionary *kwargs is converted to a frozenset (which is, as opposed to a set, hashable). If you don't have frozenset (pre-2.4), you can use tuple(kwargs.items())

Sign in to comment