ASPN ActiveState Programmer Network
ActiveState
/ Home / Perl / PHP / Python / Tcl / XSLT /
/ Safari / My ASPN /
Cookbooks | Documentation | Mailing Lists | Modules | News Feeds | Products | User Groups


Recent Messages
List Archives
About the List
List Leaders
Subscription Options

View Subscriptions
Help

View by Topic
ActiveState
.NET Framework
Open Source
Perl
PHP
Python
Tcl
Web Services
XML & XSLT

View by Category
Database
General
SOAP
System Administration
Tools
User Interfaces
Web Programming
XML Programming


MyASPN >> Mail Archive >> python-list
python-list
Re: Most efficient way to "pre-grow" a list?
by Gil_johnson other posts by this author
Nov 8 2009 4:52AM messages near this date
Re: Most efficient way to "pre-grow" a list? | Re: Most efficient way to "pre-grow" a list?
On Nov 6, 8:46 pm, gil_johnson <gil_john...@[...].net>  wrote:

>  >>> arr[0] = initializer
>  >>> for i in range N:
>  >>>      arr.extend(arr)
> 
>  This doubles the array every time through the loop, and you can add
>  the powers of 2 to get the desired result.
>  Gil

To all who asked about my previous post,

I'm sorry I posted such a cryptic note before, I should have waited
until I
had the code available.

I am still new to Python, and I couldn't find a way to create an array
of
size N, with each member initialized to a given value. If there is
one,
please let me know.

I think Paul Rubin and Paul Rudin are right, if they really are 2
different
people, an array is a better solution if you're dealing with integers.
They
both mention numpy, which I know nothing about, or the array module.

kj, what *are* you going to do with this list/array?
As others have pointed out, there are differences between lists,
arrays, and
dictionaries.

The problem I was solving was this: I wanted an array of 32-bit
integers to
be used as a bit array, and I wanted it initialized with all bits set,
that
is, each member of the array had to be set to 4294967295. Of course,
you
could set your initializer to 0, or any other 32-bit number.

Originally I found that the doubling method I wrote about before was a
LOT
faster than appending the elements one at a time, and tonight I tried
the
"list = initializer * N" method. Running the code below, the doubling
method is still fastest, at least on my system.

Of course, as long as you avoid the 'one at a time' method, we're
talking
about fractions of a second, even for arrays that I think are huge,
like
the 536,870,912 byte beastie below.

[code]

# Written in Python 3.x

import array
import time

#* * * * * * * * * * * * * * * * * * * * * *
# Doubling method, run time = 0.413938045502

t0 = time.time()

newArray = array.array('I')                   # 32-bit unsigned
integers

newArray.append(4294967295)

for i in range(27):                        # 2**27 integers, 2**29
bytes
    newArray.extend(newArray)

print(time.time() - t0)

print(newArray[134217727])          # confirm array is fully
initialized

#* * * * * * * * * * * * * * * * * * * * * *
# One at a time, run time = 28.5479729176

t0 = time.time()

newArray2 = array.array('I')

for i in range(134217728):                        # the same size as
above
    newArray2.append(4294967295)

print(time.time() - t0)

print(newArray2[134217727])                      # confirm array

#* * * * * * * * * * * * * * * * * * * * * *
# List with "*", run time = 1.06160402298

t0 = time.time()

newList = [4294967295] * 134217728

print(time.time() - t0)

print(newList[134217727])                      # confirm list

[/code]

If, instead of 134,217,728 integers, I want something different, like
100,000,000, the method I use is:

[code]

#* * * * * * * * * * * * * * * * * * * * * *
# Not a power of 2, run time = 0.752086162567

t0 = time.time()

newArray = array.array('I')

tempArray = array.array('I')

tempArray.append(4294967295)

size = 100000000

while size:                                          # chew through
'size' until it's gone
    if (size & 1):                                    # if this bit of
'size' is 1
        newArray.extend(tempArray)    # add a copy of the temp array
    size > >= 1                                     # chew off one bit
    tempArray.extend(tempArray)       # double the size of the temp
array

print(time.time() - t0)

print(newArray[99999999])

#* * * * * * * * * * * * * * * * * * * * * *
# # Not a power of 2, list with "*", run time = 1.19271993637

t0 = time.time()

newList = [4294967295] * 100000000

print(time.time() - t0)

print(newList[99999999])

[/code]

I think it is interesting that the shorter list takes longer than the
one
that is a power of 2 in length. I think this may say that the "list =
initializer * N" method uses something similar to the doubling method.

Also, tempArray (above) gets reallocated frequently, and demonstrates
that reallocation is not a big problem.

Finally, I just looked into calling C functions, and found
PyMem_Malloc,
PyMem_Realloc, PyMem_Free, etc. in the Memory Management section of
the
Python/C API Reference Manual. This gives you uninitialized memory,
and
should be really fast, but it's 6:45 AM here, and I don't have the
energy
to try it.

Gil
-- 
http://mail.python.org/mailman/listinfo/python-list
Thread:
Kj
Andre Engels
Steven D'Aprano
Francis Carr
Gil_johnson
Gabriel Genellina
Gb345
Gil_johnson
Gil_johnson
Paul Rudin
Carl Banks
Emile van Sebille
exarkun
Sturlamolden
Sturlamolden
Sturlamolden
Kj
Ivan Illarionov
Steven D'Aprano
Terry Reedy
R
Gil_johnson
Kj
Bruno Desthuilliers
Luis Alberto Zarrabeitia Gomez
Andre Engels
Luis Alberto Zarrabeitia Gomez
Emily Rodgers
Raymond Hettinger
Brian Curtin
Paul Rubin
Jon Clements

Privacy Policy | Email Opt-out | Feedback | Syndication
© ActiveState Software Inc. All rights reserved