RE: Getting XML Nodes based on attributes value
by Mark - BLS CTR Thomas
Sep 20 2005 6:29AM
>         I tried using the XML::XPath::XMLParser and XML::XPath modules to
>  parse XML files. They work well; the only problem is that they take
>  so much time and so many resources. The XML files are approximately
>  20MB each. My CPU load jumps to 16 and the server almost stops
>  accepting keyboard input.

DOM-style parsing (which XML::XPath does) reads the entire file into memory.
You would be better off using XML::Twig, which has optimizations for larger
files. It will stream to what you are interested in, then build little
mini-trees for convenient processing.

A simple example:

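  use XML::Twig;

  my $twig = XML::Twig->new( twig_handlers =>
         {
          # Called once for each fully parsed <Object Class="Pens"> element.
          # purge frees what has been parsed so far, keeping memory use low.
          'Object[@Class="Pens"]' => sub { $_->print; $_[0]->purge }
         }
  );
  $twig->parsefile('doc.xml');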

I think you'll find this to be much faster for large files.
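
If you want the matching objects back as a data structure rather than printed, something along these lines might work. This is only a sketch: it assumes the <Things>/<Object>/<p name="..."> layout from your example, and objects_by_class is a made-up helper name, not part of XML::Twig. It checks the Class attribute inside the handler so the class can be passed in as a variable, and it purges after every Object so memory stays flat.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of XML::Twig): collect every <Object>
# whose Class attribute equals $class.
# $source is an XML string or an open filehandle, as accepted by parse().
# Returns an array ref of hash refs: { Location => ..., <p name> => <p text> }.
sub objects_by_class {
    my ($source, $class) = @_;
    my @found;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Fires once per <Object>, as soon as the element is complete
            Object => sub {
                my ($t, $elt) = @_;
                if (($elt->att('Class') // '') eq $class) {
                    my %obj = (Location => $elt->att('Location'));
                    # Map each <p name="..."> child to its text content
                    $obj{ $_->att('name') } = $_->text for $elt->children('p');
                    push @found, \%obj;
                }
                $t->purge;    # free everything parsed so far
            },
        },
    );
    $twig->parse($source);
    return \@found;
}
```

For a 20MB file you would pass an open filehandle instead of a string, e.g. open my $fh, '<', 'doc.xml' or die $!; then my $pens = objects_by_class($fh, 'Pens');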

-- 
Mark Thomas 
Internet Systems Architect
_______________________________________
BAE SYSTEMS Information Technology 
2525 Network Place
Herndon, VA  20171  USA 




>  -----Original Message-----
>  From: perl-xml-bounces@[...].com 
>  [mailto:perl-xml-bounces@[...].com] On Behalf 
>  Of Joseph C. Bautista
>  Sent: Tuesday, September 20, 2005 7:39 AM
>  To: Joseph C. Bautista
>  Cc: perl-xml
>  Subject: Re: Getting XML Nodes based on attributes value
>  
>  Hi All,
>  
>         I tried using the XML::XPath::XMLParser and XML::XPath modules to
>  parse XML files. They work well; the only problem is that they take
>  so much time and so many resources. The XML files are approximately
>  20MB each. My CPU load jumps to 16 and the server almost stops
>  accepting keyboard input.
>  
>         My server is a P4 1.7GHz with 512MB RAM running Red Hat
>  Linux 9.
>  
>         Is there a more optimized and faster way to do this?
>  
>  
>         Thank you; all suggestions/ideas are highly appreciated.
>  
>  Joseph
>  
>  >>> Hi All,
>  >>>
>  >>>    I was wondering if there's a module that I can use to get only
>  >>> the nodes in an XML file based on an attribute's value.
>  >>>
>  >>>   Example XML is:
>  >>>
>  >>>    <Things>
>  >>>       <Object Class="Pens" Location="Room">
>  >>>             <p name="length">4</p>
>  >>>             <p name="thickness">.5</p>
>  >>>             <p name="color">yellow</p>
>  >>>       </Object>
>  >>>       <Object Class="Papers" Location="Room">
>  >>>             <p name="length">9</p>
>  >>>             <p name="thickness">2</p>
>  >>>             <p name="color">yellow</p>
>  >>>       </Object>
>  >>>       <Object Class="Pens" Location="Sala">
>  >>>             <p name="length">4</p>
>  >>>             <p name="thickness">.5</p>
>  >>>             <p name="color">Black</p>
>  >>>       </Object>
>  >>>       <Object Class="Papers" Location="Sala">
>  >>>             <p name="length">9</p>
>  >>>             <p name="thickness">3</p>
>  >>>             <p name="color">white</p>
>  >>>       </Object>
>  >>>   </Things>
>  >>>
>  >>>    If I want to retrieve only the objects with class "Pens", then I
>  >>> would just execute something like
>  >>>
>  >>>          $xml->node("Object", "Class" => "Pens")
>  >>>
>  >>> and the module(s) would give me a hashref with two values: all
>  >>> objects, including their parameters, with class "Pens" (one in "Sala"
>  >>> and one in "Room").
>  >>>
>  >>>
>  >>>    Thanks...
>  >>>
>  >>> Br,
>  >>> Joseph
>  >>>
>  >>>
>  >>>
>  >>
>  >> _______________________________________________
>  >> Perl-XML mailing list
>  >> Perl-XML@[...].com
>  >> To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
>  >>