Welcome, guest | Sign In | My Account | Store | Cart

A frozendict is a dictionary that cannot be modified after being created - but it is hashable and may serve as a member of a set or a key in a dictionary.

Python, 33 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
* Quick'n'dirty version:

class hashabledict(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))

* Fancy version:

class frozendict(dict):
    def _blocked_attribute(obj):
        raise AttributeError, "A frozendict cannot be modified."
    _blocked_attribute = property(_blocked_attribute)

    __delitem__ = __setitem__ = clear = _blocked_attribute
    pop = popitem = setdefault = update = _blocked_attribute

    def __new__(cls, *args):
        new = dict.__new__(cls)
        dict.__init__(new, *args)
        return new

    def __init__(self, *args):
        pass

    def __hash__(self):
        try:
            return self._cached_hash
        except AttributeError:
            h = self._cached_hash = hash(tuple(sorted(self.items())))
            return h

    def __repr__(self):
        return "frozendict(%s)" % dict.__repr__(self)

I came up with the quick'n'dirty version to eliminate duplicates in a list of dicts by converting it to a set. It doesn't actually verify that the dict is not modified and should be used with care.

The fancy version blocks access to methods that might change the dictionary and also caches the computed hash for better performance. The name frozendict is by analogy to the builtin frozenset type.

This implementation assumes that all dictionary keys and values are hashable and that the keys are fully comparable. It won't work with complex keys. Sorting is required because dictionary items() are returned in arbitrary order. An alternative implementation may combine item hashes using an order-independent function like xor or addition.

8 comments

Oren Tirosh (author) 18 years, 10 months ago  # | flag

Ooops. I forgot about keyword args:

    def __new__(cls, *args, **kw):
        new = dict.__new__(cls)
        dict.__init__(new, *args, **kw)
        return new

    def __init__(self, *args, **kw):
        pass

>>> set([frozendict(a=1,b=2), frozendict(a=5), frozendict(b=2,a=1)])
set([frozendict({'a': 1, 'b': 2}), frozendict({'a': 5})])
Amos Newcombe 18 years, 10 months ago  # | flag

Dictionaries in arbitrary order. Is not the order consistent, as long as the dictionary is not modified in the meantime? This is what I get from the (version 2.3) Library Reference, section 2.3.7: Mappings, and it is consistent with my experience. If so, you really don't need to sort the items.

Oren Tirosh (author) 18 years, 9 months ago  # | flag

Dictionary order consistency. If you iterate over the same dictionary object two times in a row without changing it in the middle the order will stay the same. Python makes no guarantees that two different dictionaries testing as equal will have the same item order.

bearophile - 18 years, 5 months ago  # | flag

The __init__ method can be removed, and maybe the __hash__ method can be faster (and use less memory) using a .sort() instead of sorted(), and using if "_cached_hash" in self.__dict__: instead of the the try-except.

Oren Tirosh (author) 18 years, 5 months ago  # | flag

Nope. __init__ is there for a reason.

dict.__init__ is essentially equivalent to dict.update and could be used to modify the frozen dictionary. It's ignored rather than raising an exception because it will be called once on initialization.

Raymond Hettinger 18 years, 4 months ago  # | flag

Doesn't work when dict's values are mutable.

>>> hash(frozendict({'k': [1,2,3]}))

Traceback (most recent call last):
  File "", line 1, in
    hash(frozendict({'k': [1,2,3]}))
  File "C:/pydev\frozendict.py", line 21, in __hash__
    h = self._cached_hash = hash(tuple(sorted(self.items())))
TypeError: list objects are unhashable
Ero Carrera 16 years, 1 month ago  # | flag

Extended version of frozent dict. This extended version of the code above will also "freeze" dictionaries and lists stored as values of the dictionary. If dictionaries/lists are found among the values, they will be handled recursively.

class frozendict(dict):
    def _blocked_attribute(obj):
        raise AttributeError, "A frozendict cannot be modified."
    _blocked_attribute = property(_blocked_attribute)

    __delitem__ = __setitem__ = clear = _blocked_attribute
    pop = popitem = setdefault = update = _blocked_attribute

    def __new__(cls, *args, **kw):
        new = dict.__new__(cls)

        args_ = []
        for arg in args:
            if isinstance(arg, dict):
                arg = copy.copy(arg)
                for k, v in arg.items():
                    if isinstance(v, dict):
                        arg[k] = frozendict(v)
                    elif isinstance(v, list):
                        v_ = list()
                        for elm in v:
                            if isinstance(elm, dict):
                                v_.append( frozendict(elm) )
                            else:
                                v_.append( elm )
                        arg[k] = tuple(v_)
                args_.append( arg )
            else:
                args_.append( arg )

        dict.__init__(new, *args_, **kw)
        return new

    def __init__(self, *args, **kw):
        pass

    def __hash__(self):
        try:
            return self._cached_hash
        except AttributeError:
            h = self._cached_hash = hash(tuple(sorted(self.items())))
            return h

    def __repr__(self):
        return "frozendict(%s)" % dict.__repr__(self)
Søren Løvborg 14 years, 1 month ago  # | flag

The hash function relies on sorting, but many data types (e.g. sets) do not sort consistently. A more reliable alternative is:

h = self._cached_hash = hash(frozenset(self.items()))

(In addition, for large data sets this ought to run in O(n) time rather than the O(n log n) of the original.)

Created by Oren Tirosh on Mon, 16 May 2005 (PSF)
Python recipes (4591)
Oren Tirosh's recipes (16)

Required Modules

  • (none specified)

Other Information and Tasks