True immutable symbolic enumeration with qualified value access.

Python, 52 lines
def Enum(*names):
   ##assert names, "Empty enums are not supported" # <- Don't like empty enums? Uncomment!

   class EnumClass(object):
      __slots__ = names
      def __iter__(self):        return iter(constants)
      def __len__(self):         return len(constants)
      def __getitem__(self, i):  return constants[i]
      def __repr__(self):        return 'Enum' + str(names)
      def __str__(self):         return 'enum ' + str(constants)

   class EnumValue(object):
      __slots__ = ('__value',)
      def __init__(self, value): self.__value = value
      Value = property(lambda self: self.__value)
      EnumType = property(lambda self: EnumType)
      def __hash__(self):        return hash(self.__value)
      def __cmp__(self, other):
         # C fans might want to remove the following assertion
         # to make all enums comparable by ordinal value {;))
         assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
         return cmp(self.__value, other.__value)
      def __invert__(self):      return constants[maximum - self.__value]
      def __nonzero__(self):     return bool(self.__value)
      def __repr__(self):        return str(names[self.__value])

   maximum = len(names) - 1
   constants = [None] * len(names)
   for i, each in enumerate(names):
      val = EnumValue(i)
      setattr(EnumClass, each, val)
      constants[i] = val
   constants = tuple(constants)
   EnumType = EnumClass()
   return EnumType


if __name__ == '__main__':
   print '\n*** Enum Demo ***'
   print '--- Days of week ---'
   Days = Enum('Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su')
   print Days
   print Days.Mo
   print Days.Fr
   print Days.Mo < Days.Fr
   print list(Days)
   for each in Days:
      print 'Day:', each
   print '--- Yes/No ---'
   Confirmation = Enum('No', 'Yes')
   answer = Confirmation.No
   print 'Your answer is not', ~answer

Most proposals for an enum in Python attempt to solve the issue with a single class. However, the fact is that an enum has a dual nature: it declares a new anonymous type and all possible instances (values) of that type at the same time. In other words, there is a distinction between an enum type and its associated values.

In recognition of this fact, this recipe uses two python classes and python's nested scopes to accomplish a clean and concise implementation.

Note that:

- Enums are immutable; attributes cannot be added, deleted or changed.
- Enums are iterable.
- Enum value access is symbolic and qualified, e.g. Days.Monday (like in C#).
- Enum values are true constants.
- Enum values are comparable.
- Enum values are invertible (useful for 2-valued enums, like Enum('no', 'yes')).
- Enum values are usable as truth values (in the C tradition, but this is debatable).
- Enum values are reasonably introspective (by publishing their enum type and numeric value).
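The recipe as posted targets Python 2 (print statements, __cmp__, __nonzero__, the cmp builtin). For readers on Python 3, here is a minimal sketch of how the same two-class design might be ported. It is not the author's code. It assumes functools.total_ordering to replace __cmp__, and it makes one deliberate change that is an assumption of this sketch: instead of asserting, values from different enums compare unequal with == and raise TypeError when ordered.

```python
from functools import total_ordering

def Enum(*names):
    """Python 3 sketch of the recipe (a port, not the original code)."""

    class EnumClass(object):
        __slots__ = names
        def __iter__(self):        return iter(constants)
        def __len__(self):         return len(constants)
        def __getitem__(self, i):  return constants[i]
        def __repr__(self):        return 'Enum' + str(names)
        def __str__(self):         return 'enum ' + str(constants)

    @total_ordering
    class EnumValue(object):
        __slots__ = ('__value',)
        def __init__(self, value): self.__value = value
        Value = property(lambda self: self.__value)
        EnumType = property(lambda self: EnumType)
        def __hash__(self):        return hash(self.__value)
        def __eq__(self, other):
            # Assumption of this sketch: foreign values are simply unequal.
            # Each Enum() call creates its own EnumValue class, so isinstance
            # is true only for values of this very enum.
            if not isinstance(other, EnumValue): return NotImplemented
            return self.__value == other.__value
        def __lt__(self, other):
            # Ordering against a foreign value ends in TypeError.
            if not isinstance(other, EnumValue): return NotImplemented
            return self.__value < other.__value
        def __invert__(self):      return constants[maximum - self.__value]
        def __bool__(self):        return bool(self.__value)   # was __nonzero__
        def __repr__(self):        return str(names[self.__value])

    maximum = len(names) - 1
    constants = tuple(EnumValue(i) for i in range(len(names)))
    for name, val in zip(names, constants):
        setattr(EnumClass, name, val)
    EnumType = EnumClass()
    return EnumType


Days = Enum('Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su')
print(Days)                 # enum (Mo, Tu, We, Th, Fr, Sa, Su)
print(Days.Mo < Days.Fr)    # True
Confirmation = Enum('No', 'Yes')
print('Your answer is not', ~Confirmation.No)   # Your answer is not Yes
```

Because the enum object has no instance dictionary, assigning to Days.Mo or adding a new attribute raises AttributeError, which is what gives the immutability listed above.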

Cheers and happy enumerating!

[See also]

Recipe "Enums for Python" by Will Ware http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/67107

Recipe "Enumerated values by name or number" by Samuel Reynolds http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/305271

27 comments

Eric Noyau 18 years, 10 months ago

Small nit. Nice solution, the best so far. My only issue is the interaction between enumerations when you mistakenly compare them:

C1 = Enum('No', 'Yes')
C2 = Enum('Yes', 'No')

# Both asserts pass without an issue...
assert C1.No != C2.No    ## Kind of okay I guess
assert C1.No == C2.Yes ## Yikes!

Changing the __cmp__ method on EnumValue to do a == instead of cmp changes the results in a more comfortable way (No == No and No != Yes) but is still uncomfortable

I believe two constants from two different enums should not be comparable at all, there is no sense in mixing apples and oranges. A solution is replacing the cmp with the following:

def __cmp__(self, other):
    return cmp(self.EnumType, other.EnumType) or cmp(self.__value, other.__value)

In that case you end up comparing only similar things.

Zoran Isailovski (author) 18 years, 10 months ago

You're perfectly right ... ... different enums should not be comparable at all.

The question is: Is it better to assert that, or always treat values from different enums as different, as you propose.

A flaw of the latter is that cmp cannot really tell whether one EnumType is less or greater than another, so that, for example, sorting a list of values from different enums would result in arbitrary ordering, which might not be a welcome result.

So it's probably better to assert than to guess, I guess. Adding the assertion to the recipe ...

Zoran Isailovski (author) 18 years, 10 months ago

... though this breaks C tradition. ... as C does allow you to compare apples and oranges, but I'm the last to complain about it.

Matthew Bennett 18 years, 10 months ago

Here's a small potential improvement: If you add:

if len(names) == 1:
    names = tuple(names[0].split(' '))

right after the commented-out assertion, then you can just use a space-separated string for the enum values:

Days = Enum('Mo Tu We Th Fr Sa Su')

The user could of course do this manually, but it would be nice for the Enum to automatically do it.

Zoran Isailovski (author) 18 years, 10 months ago

I generally don't favor "omnipotent" code - meaning code that claims to deal with "everything". This kind of approach always turned against me. Instead, I favor clear separation of concerns and responsibilities, according to the maxim "do one thing, and do it well".

Anyway, I doubt whether a "simplification" of (the already very simple)

Days = Enum(*'Mo Tu We Th Fr Sa Su'.split())

is worth the price of introducing a special case. What if someone liked

Days = Enum('Mo,Tu.We,Th,Fr,Sa,Su')?

I think this is best left up to the user.

Matthew Bennett 18 years, 9 months ago

You're probably right. Nothing's stopping the user from making a wrapper function to do that kind of thing.
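A wrapper of the kind mentioned here might look like the sketch below. The name enum_from_string and its sep parameter are hypothetical, and to keep the example self-contained it uses the simplest namedtuple-based Enum posted further down in this thread as a stand-in for the recipe (Python 3 syntax).

```python
from collections import namedtuple

def Enum(*names):
    # Stand-in for the recipe: the simplest namedtuple variant from this thread.
    EnumType = namedtuple('enum', names)
    return EnumType(*range(len(names)))

def enum_from_string(spec, sep=None):
    """Hypothetical helper: build an enum from one delimited string.

    sep=None splits on any run of whitespace, like str.split().
    """
    return Enum(*spec.split(sep))

Days = enum_from_string('Mo Tu We Th Fr Sa Su')
Status = enum_from_string('open,pending,closed', sep=',')
print(Days.Su, Status.closed)   # 6 2
```

Keeping the parsing in a separate function leaves Enum itself free of special cases, which is the separation of concerns argued for above.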

Christopher Smith 18 years, 9 months ago

Enums from lists. I'm using this recipe, and would like to have enumerations built of existing lists of strings.

However, when I have anything other than explicit function arguments in the call to Enum, python 2.4.1 chokes on

 __slots__    =   names

This seems a little counter to what http://docs.python.org/ref/slots.html says, so, if you have any insights, I'd like to hear it.

Also,

constants = tuple(constants)

three lines from the end, is a mysterious line. How does tuple-ifying the constants list affect the return value?

Great code!

Zoran Isailovski (author) 18 years, 9 months ago

Thanks, Christopher. I'm glad you liked this recipe.

About __slots__ = names choking: Unfortunately, from your description I couldn't tell what went wrong, but maybe this would help on using string lists with Enum:

# create a list of strings ...
# either explicitly ...
dayNames = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su']
# or by calculation ...
dayNames = 'Mo Tu We Th Fr Sa Su'.split()
# and use it to create the enum ...
Days = Enum(*dayNames) # note the asterisk in front of 'dayNames'
# Take care that all enum names are valid python identifiers,
# otherwise __slots__ = names will fail
Days = Enum('X', '~Y') # '~Y' is not a valid python identifier

Note that assignments to __slots__ are checked by the python interpreter - a fact that is not documented at http://docs.python.org/ref/slots.html. Perhaps this explains the problem.
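Since invalid names only fail once they reach __slots__, it can help to check them up front and report exactly which ones are the problem. The following is a small Python 3 sketch, not part of the recipe; check_enum_names is a hypothetical helper, and it relies on str.isidentifier() and keyword.iskeyword() from the standard library.

```python
import keyword

def check_enum_names(names):
    """Hypothetical helper: reject names that cannot serve as attribute names."""
    bad = [n for n in names if not n.isidentifier() or keyword.iskeyword(n)]
    if bad:
        raise ValueError('not valid Python identifiers: %r' % (bad,))
    return names

check_enum_names(('X', 'Y'))          # passes and returns the names unchanged
try:
    check_enum_names(('X', '~Y'))     # '~Y' is not a valid identifier
except ValueError as err:
    print(err)
```

Calling such a check at the top of Enum() would turn the interpreter's complaint about __slots__ into an error that names the offending strings.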

And now for the mysterious line

constants = tuple(constants)

I know it does not seem to add any functionality, so it might seem a bit strange at first sight. On second sight, it does add value by revealing a certain intention: by making 'constants' immutable, it expresses that the content of 'constants' will not change from that point on, i.e. it really is what it says, a constant. (However, it is said that intention-revealing code, while easy to read, is very hard to write, and that line of code is yet another proof. Have a look at what Martin Fowler says about the subject at http://www.martinfowler.com/articles/designDead.html )

Christopher Smith 18 years, 9 months ago

Note the dull thud.

Days = Enum(*dayNames) # note the asterisk in front of 'dayNames'

^^^^^^^^^^^^^

...as the cluestick impacts the source of ignorance. :)

Hamish Moffatt 16 years, 9 months ago

Can't be pickled. I can't pickle an instance of the Enum type; pickle complains that __getstate__ must be defined if __slots__ is defined.

Guillaume Knispel 16 years ago

license for this code ?

Hi,

This is really handy! Thanks a lot for sharing this :)

Is there a license for this marvellous code so we can use and redistribute it in real projects?

Cheers!

Timothee Besset 15 years, 8 months ago

I have the same problem. I understand the fix is to add proper __getstate__ and __setstate__ methods to the class; this is, however, a bit beyond my current Python knowledge.

Timothee Besset 15 years, 6 months ago

I spent a few hours looking at the pickling problem and it is indeed non trivial. You can start by adding __getstate__ __setstate__ functions, but that won't help because the classes are nested inside the Enum function.

Being that they use __slots__ (a class variable) my understanding is that you can't make them non nested without them stepping on each other. i.e. every time you call the Enum() function you are creating two new EnumClass and EnumValue classes (that are distinguished from the others by their scoping)

Raymond Hettinger 15 years, 1 month ago

Nice recipe, but I think the whole exercise is misguided. Enums are a great tool for statically compiled languages and for languages without namespaces.

Why provide more ways to do it when Python already has several tools that work fine? We routinely use module namespaces such as re.IGNORECASE. Likewise, it's already trivially easy to build constants with a simple class:

class Status: open, pending, closed = range(3)

As soon as you hide this simple declaration behind a wrapper like Enum(), you lose transparency. It stops being immediately obvious that an Enum() can be pickled, or compared, or used as a dictionary key, or will pass isinstance(e, numbers.Number), or pass operator.isNumberType(e). If you subclass an enumeration, the __slots__ don't automatically carry forward and you end-up with instance dictionaries and whatnot. Too many unexpected behaviors.

Coming from Java or C where Enums are a way of life, it is hard to leave those habits behind and program Pythonically. But not everything in the world of static compilation makes sense in the Python world. These Enum() objects are slower than constants assigned to a variable name or module namespace. They are also awkward in that those objects are imbued with behaviors that are a little different from straight int or string constants. As such, they place an additional burden on the reader, a burden that doesn't exist with a simpler, more basic Pythonic style.

Zoran Isailovski (author) 14 years, 8 months ago

Raymond, "misguided"???

I'm not coming from Java and C, but I do think enumerations have several advantages over dull workarounds around them. Java's workaround used to be

public class Status {
  public static final int open = 0;
  public static final int pending = 1;
  public static final int closed = 2;
}

This is simply the wrong abstraction to express the actual intention (to declare a set of abstract values that can be ordered and accessed by their symbolic name). The numeric values are not of interest; they are superfluous, absolutely unnecessary information. As with any equation, you always have to change the right side to match changes on the left side of the equals sign. You lose flexibility due to over-specification. Java's creators realized this and added enums to the language.

What you are basically suggesting here is the same dull workaround as Java had before (though much less verbose thanks to Python's cool syntax). But it is still over-specified. You still have to change the right side of the equation to match changes on the left.

Enumerations just provide better abstractions than such workarounds. Are you saying that abstractions matter only in "statically compiled languages and for languages without namespaces"?

No, Raymond, I don't think at all that providing better abstractions puts an additional burden to anyone (be them statically compiled or not).

I admit, performance is on your side, but good abstractions, expressiveness and conceptual clarity matter more.

As for "imbued with behaviors": They should be a "little different than straight int or string constants". Enums are something else than integers and strings. (Unless you are coming from Java and C, and your view on enums is biased. ;-) )

LBNL, I think in a place like this, where people are supposed to showcase with their recipes what is possible with python, a comment like "why provide more ways" is, frankly, somewhat misplaced. That's the whole purpose of this site: To provide more ways. There is no point in suppressing variety around here.

Luke Dickens 14 years, 7 months ago

Great solution, precisely what I needed.

One additional comment about comparisons. I want to be able to check whether a variable has been assigned an enum, as I do for other objects. However, when I try:

>>> days = Enum('Mo', 'Tu', 'We', 'Th', 'Fr', 'Sa', 'Su')
>>> day = None
>>> if day:
...   print 'day is assigned'
... else:
...   print 'day is none'
day is none
>>> day = days.Tu
>>> if day:
...   print 'day is assigned'
... else:
...   print 'day is none'
day is assigned
>>> day = days.Mo
>>> if day:
...   print 'day is assigned'
... else:
...   print 'day is none'
day is none

Ooops. If, by default, we really do not care what the underlying values associated with our enums are, should there be this difference between 'days.Mo' and 'days.Tu'?

Perhaps an alternative would be to explicitly compare my enums to the None object, but

>>> day = days.Mo
>>> if day == None:
...   print 'day is none'
... else:
...   print 'day is assigned'
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "enum.py", line 27, in __cmp__
    assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
AttributeError: 'NoneType' object has no attribute 'EnumType'

Can I therefore suggest one of the following two additions to the code in the EnumValue class.

Either add the following method:

  def __nonzero__(self):     return True

This means 'bool(days.Mo)' resolves to 'True'.

Alternatively, alter the '__cmp__' method to the following:

  def __cmp__(self, other):
     if other == None:                          # compare explicitly with the None object
        return cmp(self.Value,None)
     assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
     return cmp(self.__value, other.__value)

What are your thoughts? I do not like the asymmetry or overhead of the '__cmp__' function, but one or the other (or both) allows me to avoid having a 'no_enum' element in every enum list.

Luke Dickens 14 years, 7 months ago

Just a quick follow-up to the last comment. I realised that there was less overhead with the following __cmp__ function

  def __cmp__(self, other):
     try:
        assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
        return cmp(self.__value, other.__value)
     except (AssertionError, AttributeError), ae:   # None has no EnumType attribute
        if other == None:    return 1
        raise ae

And to see the overhead for each I wrote the following (rough-and-ready) test function

>>> from time import *
>>> def muleqtest(a,b,n):
...   t=time()
...   for i in xrange(n):
...     a == b
...   return time()-t
...

For your original '__cmp__' function:

>>> muleqtest(days.Mo,days.Mo,10000000)
15.306710958480835
>>> muleqtest(days.Mo,days.Tu,10000000)
15.436569929122925

With the if statement:

>>> muleqtest(day,None,10000000)
12.248291969299316
>>> muleqtest(days.Mo,days.Mo,10000000)
21.300351858139038
>>> muleqtest(days.Mo,days.Tu,10000000)
21.500328063964844

And with the try-except block:

>>> muleqtest(day,None,10000000)
12.287065029144287
>>> muleqtest(days.Mo,days.Mo,10000000)
15.250630140304565
>>> muleqtest(days.Mo,days.Tu,10000000)
15.461199998855591

While this doesn't represent the most rigorous testing in the world, it does suggest that the overhead of the try-except block compares favourably with your original __cmp__ function. The if statement appears to add approximately 30% or so when comparing two enums.
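For anyone repeating this measurement on Python 3, the standard library's timeit module does the looping and timing for you. Below is a minimal sketch of an equivalent helper. It keeps the name muleqtest from above, assumes Python 3.5 or later for the globals argument, and makes no claim about what the numbers will be on any given machine.

```python
import timeit

def muleqtest(a, b, n):
    """Time n evaluations of `a == b` and return the total in seconds."""
    return timeit.timeit('a == b', globals={'a': a, 'b': b}, number=n)

# Works with any two objects; pass enum values to reproduce the test above.
print(muleqtest(1, 2, 100000))
```

Running each case several times and taking the minimum (timeit.repeat does this) gives a steadier comparison than a single run.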

Zoran Isailovski (author) 14 years, 7 months ago

@Luke Dickens,

__nonzero__ is a great hint. Thank you! :)

Adding it to the recipe...

Zoran Isailovski (author) 14 years, 7 months ago

Although, when I think about it, the very first enum value is usually equivalent to zero (i.e. FALSE), so I'm not sure any more whether that change is OK.

Thinking about it... rumble... fizzle... creak... Got it!

OK, "if x" does not test for assignment, but truth value. So it's OK that bool(Days.Mo) is False, and bool(Days.Tu) is True.

If you need to test for assignment, then the "is" operator is the way to go, like here:

>>> print 'assigned' if Days.Mo is not None else 'not assigned'
assigned

Allowing for value comparison with "None" is like comparing an enum with "null" in Java - something that the compiler won't let you. If you need it more like C, just remove or comment out the assertion.

So I'm keeping the recipe as it is for now. But I'm interested in other opinions here!

Zoran Isailovski (author) 14 years, 7 months ago

Though this is not really related to the recipe, IMO you can get performance even better by using "is" instead of "==", refraining from binding the (unused) exception value to a variable, and re-raising the exception (there is a difference between raising and re-raising):

def __cmp__(self, other):
    try:
        assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
        return cmp(self.__value, other.__value)
    except (AssertionError, AttributeError):
        # AttributeError covers None, which has no EnumType attribute
        if other is None: return 1
        raise

But clarity and succinctness are more important than performance, so I'd often prefer the IF variant.

Luke Dickens 13 years, 5 months ago

@Zoran Isailovski,

Apologies for the (long) delay in replying (I forgot to subscribe to the thread).

Your solution is more elegant, and you are right that clarity comes first; closely followed by succinctness I guess.

Also, I didn't realise the difference between the is and the == operators, so I have learnt something else.

Thanks for the recipe and the comments.

jason braswell 12 years, 11 months ago

Is there any way to change this so that type() returns a new type rather than EnumClass when called on your created enumeration?

Jeffrey Chen 12 years, 3 months ago

I love this enum! However, we use pylint extensively in our code development, and this seems to trigger E1101 (accessing nonexistent member) since pylint cannot know about what's in __slots__. Is there anything you can suggest to circumvent this problem? I would hate to disable E1101 for all my scripts, it actually catches a good amount of actual errors.

Thanks!

Chris Johnson 11 years, 11 months ago

Love this. One issue -- it allows the same enum value to be specified more than once, for example:

>>> x = Enum('a','b','c','a')
>>> print x
enum (a, b, c, a)
>>> x.b < x.c
True
>>> x.a < x.c
False
>>> x.c < x.a
True

By reviewing the less than / greater than behavior, it appears that the last instance of a repeated value is the one that sticks. I would argue that repeating values in the creation of an enum would almost always be an error, and so should be prevented.

Perhaps something like this (see http://www.peterbe.com/plog/uniqifiers-benchmark):

def uniquify(seq):
    # order preserving
    noDupes = []
    for i in seq:
        if i not in noDupes:
            noDupes.append(i)
    return noDupes

... then at the top of Enum():

assert list(names) == uniquify(names)   # names is a tuple, uniquify returns a list

What do you think?
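An alternative to comparing against a de-duplicated copy is to count the names and report the repeated ones directly. Here is a minimal Python 3 sketch under that assumption; assert_unique is a hypothetical name, and the counting uses collections.Counter from the standard library.

```python
from collections import Counter

def assert_unique(names):
    """Hypothetical helper: raise ValueError listing any repeated enum names."""
    dupes = sorted(name for name, count in Counter(names).items() if count > 1)
    if dupes:
        raise ValueError('duplicate enum names: ' + ', '.join(dupes))

assert_unique(('a', 'b', 'c'))           # fine, returns None
try:
    assert_unique(('a', 'b', 'c', 'a'))
except ValueError as err:
    print(err)                           # duplicate enum names: a
```

Raising ValueError rather than using assert keeps the check active when Python runs with -O, which strips assertions.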

Zoran Isailovski (author) 11 years, 11 months ago

It's been quite a long time ago, but I believe back in the days when I wrote this the slots assignment __slots__ = names would have raised an exception for duplicate names. Perhaps that changed with newer versions of python. I'm not sure though. As I said, it's been quite a long time. (Gee, 7 years!)

But yes, duplicate enum constants were never meant to be. It will not "almost always be an error" - it is an error. Always. It's an immanent property of enums that the enum constants are unique and have an unambiguous ordering corresponding to their declaration sequence. That property would be compromised if duplicate values were allowed.

Fortunately, meanwhile Python has evolved, and we have better means to implement the enum than in 2005. The following implementation is both simpler and safer:

from collections import namedtuple

def Enum(*names):

    EnumType = namedtuple('enum', names)

    class EnumValue(object):
        __slots__ = ('__value',)
        def __init__(self, value): self.__value = value
        Value = property(lambda self: self.__value)
        EnumType = property(lambda self: EnumType)
        def __hash__(self):        return hash(self.__value)
        def __cmp__(self, other):
            # C fans might want to remove the following assertion
            # to make all enums comparable by ordinal value {;))
            assert self.EnumType is other.EnumType, "Only values from the same enum are comparable"
            return cmp(self.__value, other.__value)
        def __invert__(self):      return constants[maximum - self.__value]
        def __nonzero__(self):     return bool(self.__value)
        def __repr__(self):        return str(names[self.__value])

    maximum = len(names) - 1
    constants = tuple(map(EnumValue, range(len(names))))
    return EnumType(*constants)

The only externally perceivable difference between this and the old implementation (besides the prevention of duplicate enum names) is in the representation of enums. print Days would output enum(Mo=Mo, Tu=Tu, We=We, Th=Th, Fr=Fr, Sa=Sa, Su=Su), not enum (Mo, Tu, We, Th, Fr, Sa, Su).

BTW, we can derive an even simpler implementation if we don't mind seeing the integer values instead of the symbolic names. That's too C-ish for my taste, but here it goes anyway:

def Enum(*names):
    EnumType = namedtuple('enum', names)
    return EnumType(*range(len(names)))

It was nice juggling with ideas in Python again. :-)

Happy enumerating.
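To make the behavior of the simplest namedtuple variant concrete, here is a short usage sketch in Python 3 syntax. The Enum function is the two-line version quoted above; only the demo lines around it are new.

```python
from collections import namedtuple

def Enum(*names):
    EnumType = namedtuple('enum', names)
    return EnumType(*range(len(names)))

Days = Enum('Mo', 'Tu', 'We')
print(Days)                  # enum(Mo=0, Tu=1, We=2)
print(Days.Mo < Days.We)     # True
print(list(Days))            # [0, 1, 2]

# namedtuple itself rejects repeated field names with a ValueError
try:
    Enum('a', 'b', 'c', 'a')
except ValueError as err:
    print('rejected:', err)
```

The values here are plain integers, so values from different enums compare equal when their positions match, which is the "C-ish" trade-off mentioned above.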

Zoran Isailovski (author) 11 years, 11 months ago

@Jeffrey Chen:

That's the nature of a dynamic language: it enables you to construct data structures and even programs on the fly. Obviously this recipe is based on that dynamic nature.

Lint, on the other hand, is aimed at static safety. I'm not sure about Lint's current constraints, but if it complains about this recipe's enums, it would also complain about named tuples.

IMO the best you can do is to put well-tested reusable code in your library and exclude it from lint checks. (You don't lint the python standard library either.)

Richard 11 years, 9 months ago

Is there a way to have the enums return arbitrary integer values instead of just an incrementing sequence?
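One possible answer, offered only as a sketch: the simple namedtuple variant from this thread can be adapted to take explicit values as keyword arguments. ValuedEnum is a hypothetical name, the code is Python 3, and the members are plain integers rather than the recipe's EnumValue objects, so the cross-enum comparison guard is lost.

```python
from collections import namedtuple

def ValuedEnum(**values):
    """Hypothetical variant: enum members carry caller-chosen integer values."""
    # Order the fields by value so that iteration is ascending.
    names = sorted(values, key=values.get)
    EnumType = namedtuple('enum', names)
    return EnumType(*(values[name] for name in names))

Http = ValuedEnum(ok=200, not_found=404, moved=301)
print(Http)             # enum(ok=200, moved=301, not_found=404)
print(Http.not_found)   # 404
```

Access stays symbolic and qualified (Http.ok), and the tuple is still immutable, but two names given the same value would be indistinguishable.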

Created by Zoran Isailovski on Fri, 6 May 2005 (PSF)