xml-sig
[XML-SIG] Re: Re: Re: Re: cElementTree 0.8 (january 11, 2005)
by Fredrik Lundh
Jan 15 2005 6:09AM
Daniel Veillard wrote:

>   You have a python function calling a native function. That function returns
>  a string. That C string is translated to a Python string by the wrapper
>  using PyString_FromString(). That operation seems to be extremely expensive.

PyString_FromString basically boils down to:

    determine the length of the string
    call fast allocator
    copy string to area allocated by fast allocator

for UTF-8 data, the steps are:

    determine maximum possible length of the string
    call fast allocator
    copy string to area allocated by fast allocator, character
        by character.  handle UTF-8 code sequences.
    adjust size of allocated area, if necessary
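
the two step lists above can be modelled in plain Python. this is only a sketch of the steps, not the interpreter's actual C code: the function names are made up for illustration, a bytearray and a list stand in for the fast allocator, and the input is assumed to be a NUL-terminated buffer of well-formed UTF-8.

```python
def from_c_string(buf: bytes) -> bytes:
    """Model of the plain 8-bit path: measure, allocate, block copy."""
    # determine the length of the string (scan for the NUL terminator)
    length = buf.index(b"\x00")
    # stand-in for the fast allocator: reserve exactly `length` bytes
    dest = bytearray(length)
    # copy the string to the allocated area in a single block
    dest[:] = buf[:length]
    return bytes(dest)


def from_utf8_c_string(buf: bytes) -> str:
    """Model of the UTF-8 path: over-allocate, decode one character at a time, shrink."""
    # maximum possible length: a string can never have more characters than bytes
    max_chars = buf.index(b"\x00")
    # stand-in for the fast allocator: one slot per possible character
    dest = [""] * max_chars
    used = 0
    pos = 0
    while pos < max_chars:
        lead = buf[pos]
        # the lead byte tells us how many bytes this code sequence occupies
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        elif lead >> 3 == 0b11110:
            width = 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
        dest[used] = buf[pos:pos + width].decode("utf-8")
        used += 1
        pos += width
    # adjust the size of the allocated area, if necessary
    del dest[used:]
    return "".join(dest)
```

the second function does strictly more work per character than the first, which is why the UTF-8 case costs more, but both are a single pass over the data.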

cElementTree has to do all this for all strings in the document, of course, and
the time it takes is included in my parsing benchmark.  and I guess libxml2 is
doing something very similar, but using your own allocator and object layout.

but parsing is one thing, using the data from Python code is another.  to return
data to Python, all cElementTree has to do (in the normal case) is to return the
string object it created during the parse.  that's a pointer copy, not a buffer
copy.

libxml2, in contrast, has to copy the strings once again, using Python's allocator
and Python's string object layout.  and if you don't cache stuff, you end up doing
this every time someone accesses a node...
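
to make the difference concrete, here is a small Python model of the three access patterns. it is a hypothetical sketch, not code from cElementTree or from the libxml2 bindings: the class names and the `copies` counter are invented, and a bytes object stands in for the natively allocated string.

```python
class ParsedOnceNode:
    """String object is created during the parse and kept on the node."""

    def __init__(self, raw: bytes):
        self.copies = 1
        self.text = raw.decode("utf-8")  # the only buffer copy


class WrappedNode:
    """Native buffer is kept; a Python string is built on every access."""

    def __init__(self, raw: bytes):
        self._raw = raw  # stand-in for the native string
        self.copies = 0

    @property
    def text(self) -> str:
        self.copies += 1
        return self._raw.decode("utf-8")  # a new buffer copy each time


class CachedWrappedNode(WrappedNode):
    """Same wrapper, but the converted string is cached after first use."""

    def __init__(self, raw: bytes):
        super().__init__(raw)
        self._cached = None

    @property
    def text(self) -> str:
        if self._cached is None:
            self.copies += 1
            self._cached = self._raw.decode("utf-8")
        return self._cached  # later accesses just hand back a reference
```

reading `text` repeatedly leaves `copies` at 1 for the first and third class, while the uncached wrapper pays for one more copy on each access.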

</F>  


