ASPN ActiveState Programmer Network
ActiveState
/ Home / Perl / PHP / Python / Tcl / XSLT /
/ Safari / My ASPN /
Cookbooks | Documentation | Mailing Lists | Modules | News Feeds | Products | User Groups


Recent Messages
List Archives
About the List
List Leaders
Subscription Options

View Subscriptions
Help

View by Topic
ActiveState
.NET Framework
Open Source
Perl
PHP
Python
Tcl
Web Services
XML & XSLT

View by Category
Database
General
SOAP
System Administration
Tools
User Interfaces
Web Programming
XML Programming


MyASPN >> Mail Archive >> python-dev
python-dev
Re: [Python-Dev] When do sets shrink?
by Josiah Carlson other posts by this author
Dec 29 2005 9:51AM messages near this date
Re: [Python-Dev] When do sets shrink? | Re: [Python-Dev] When do sets shrink?
Noam Raphael <noamraph@[...].com>  wrote:
>  On 12/29/05, Fredrik Lundh <fredrik@[...].com> wrote:
>  > Noam Raphael wrote:
>  >
>  > > I'm not saying that practically it must be used - I'm just saying that
>  > > it can't be called a heuristic, and that it doesn't involve any "fancy
>  > > overkill size hinting or history tracking". It actually means
>  > > something like this:
>  > > 1. If you want to insert and the table is full, resize the table to
>  > > twice the current size.
>  > > 2. If you delete and the number of elements turns out to be less than
>  > > a quarter of the size of the table, resize the table to half of the
>  > > current size.
>  >
>  > sure sounds like a heuristic algorithm to me... (as in "not guaranteed to
>  > be optimal under all circumstances, even if it's probably quite good in all
>  > practical cases")
>  
>  I'm not saying it's optimal, but it is really amortized O(1) per
>  insert/delete. I looked up in "Introduction to Algorithms" for this,
>  and it has a complicated explanation. A simple explanation is that
>  after every resize the table is exactly half-full. Let's say it has n
>  elements and the table size is 2*n. To get to the next resize, you
>  have to do at least n/2 removals of elements, or n insertion of
>  elements. After that, you do a resize operation. In either case, you
>  do an O(n) resize operation after at least O(n) insertions/removals
>  which are O(1) operations. This means that the toal cost remains O(n)
>  per n simple operations, which you can say is O(1) per simple
>  operation.
>  
>  I hope that if you read this slowly it makes sense...

This is understood by (I would expect) most people here (hash-tables
are (theoretically and practically) average as you state, but
(theoretically) worst-case as Martin states).

For resizing, a quick-and-dirty rule of thumb is that if you are
overallocating by a factor of f(n), the amount of work you will be
performing per insertion is ~cn/f(n) (1 <= c <= 2).  As per recent
discussions on lists, Python chooses f(n) to be n/8. (at least in the
list case) This says that every insertion is taking around ~8 memory
copies if/when the list is resized up, but practical experience has
shown that it also tends to minimize memory usage as the list grows.

It's all about tradeoffs.  Increasing general memory usage for the sake
of a lower constant or not is a tradeoff.  As is resizing or not
resizing as a list gets smaller.

Would changing the overallocation strategy change Python's performance? 
Likely, but possibly not noticable.  Would it change Python memory usage?
Yes. A vast majority of list use would cause larger allocations than
currently performed by the list allocator.  Want to test it?  Create a
micro benchmark which tests repeated append performance with the two
list allocation strategies and remember to note the memory usage of
Python after each test.

 - Josiah

_______________________________________________
Python-Dev mailing list
Python-Dev@[...].org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-dev-ml%40activestate.c
om
Thread:
Noam Raphael
Noam Raphael
Fernando Perez
Raymond Hettinger
Raymond Hettinger
Noam Raphael
Tim Peters
Josiah Carlson
Raymond Hettinger
"Martin v. Löwis"
Fredrik Lundh
Noam Raphael
Josiah Carlson
Donovan Baarda
Fredrik Lundh
Noam Raphael
"Martin v. Löwis"
Adal Chiriliuc
1
Adal Chiriliuc
Raymond Hettinger
Donovan Baarda
Noam Raphael
Guido van Rossum
Noam Raphael
Josiah Carlson
"Martin v. Löwis"
Noam Raphael

Privacy Policy | Email Opt-out | Feedback | Syndication
© ActiveState Software Inc. All rights reserved