ASPN ActiveState Programmer Network
ActiveState
/ Home / Perl / PHP / Python / Tcl / XSLT /
/ Safari / My ASPN /
Cookbooks | Documentation | Mailing Lists | Modules | News Feeds | Products | User Groups


Recent Messages
List Archives
About the List
List Leaders
Subscription Options

View Subscriptions
Help

View by Topic
ActiveState
.NET Framework
Open Source
Perl
PHP
Python
Tcl
Web Services
XML & XSLT

View by Category
Database
General
SOAP
System Administration
Tools
User Interfaces
Web Programming
XML Programming


MyASPN >> Mail Archive >> python-list
python-list
Re: Surprising (for me) benchmark results...
by John J. Lee other posts by this author
May 2 2001 1:14PM messages near this date
SOAP.py 0.9 | Re: Surprising (for me) benchmark results...
On Tue, 1 May 2001, Daniel Berlin wrote:
>  On Wed, 2 May 2001, Rainer Deyke wrote:
>  > "Courageous" <jkraska1@[...].com> wrote in message
>  > news:rrouet0f4un3h3eo8pq6nlk5idgkmhhhca@[...]..
>  > > On Wed, 02 May 2001 01:11:50 GMT, "Rainer Deyke" <root@[...].com>
>  > wrote:
>  > > >"Daniel Berlin" <dan@[...].com> wrote in message
>  > > >news:mailman.988762570.8591.python-list@[...]..
>  > > >> Actually, the sort is probably not the bottleneck.
>  > > >> As the file gets larger, the bottleneck becomes disk time, not sort
>  > time.
>  > > >Disk access is O(N); sorting is typically O(N log N).  Therefore as the
[...]
>  > > Disregarding the algorithm and this specific situation in question, you
>  > are
>  > > both correct; this is because while disk access may scale O(N), this
>  > > scaling is multiplied by a huge constant factor that can in many cases
>  > > dwarf the sorting algorithm, depending on the number of entries and
>  > > the like.

The best answer, of course.  Mind you, I don't know whether the curves
*ever* cross for feasible N.  Does anyone here know, though direct
experience or calculation?  The other thing is that for small changes in
N, disk access doesn't necessarily go as N (caches, that sort of thing?),
and maybe sorting doesn't go exactly as N ln N.

>  > I was responding to this: "As the file gets larger, the bottleneck becomes
>  > disk time, not sort time."  This is just plain incorrect (assuming larger
>  > file = more entries, not longer entries).
>  No, it's not.
>  You still have to get your damn file off the disk.
>  He was timing a sorting program in it's entirety, not the sort.
>  Therefore, the disk i/o time matters. And with it taking such a short time
>  for each program run, it could possibly matter a *whole lot*.
>  > If disk access dominates  on large
>  > files, it dominates even more on small files.
>  Err, that doesn't make it not dominate on small files.

And if it (disk access) dominates on small files, then it dominates less
on large files, until eventually the sort dominates.  Mm?

To repeat what you said again:

>  > > >> As the file gets larger, the bottleneck becomes disk time, not sort
>  > time.

(you can't get any more smug and annoying than a good 'Mm?', can you? :)


John

-- 
http://mail.python.org/mailman/listinfo/python-list
Thread:
John J. Lee
Daniel Berln
John J. Lee
Brian Quinlan
John J. Lee
Daniel Berlin
John J. Lee

Privacy Policy | Email Opt-out | Feedback | Syndication
© ActiveState Software Inc. All rights reserved