xsl-list
[xsl] performance advice sought
by Mike Castle
Dec 11 2003 11:36PM
XSL newbie.

It's quite possible that this particular task is better suited to something
besides XSL, but since I'm trying to plug this into ant as part of our
build & test system, and ant comes with a handy dandy <xslt/>  task, I
figured it was a good place to start.

The solution I have come up with works.  It's logically correct.  However,
it's damned slow.

Anyway, the task:

Our test system generates four files for each test run: test log
proper, test engine log, server log, and the server wrapper log (see
http://wrapper.sf.net).  Now, the first 3 files all use the same route
to write out information, as XML-like <LogRecord/> elements, while the last
is a plain text file.

What I need to do is analyze an aggregation of those files for each run.

* If a test engine log or server log has a <LogRecord/>  with a severity
level of FATAL or MFATAL, or if the string "Exception" shows up in the
text of any <LogRecord/>  or the server wrapper file, the whole run is
a bust, and we count all of the associated tests as failures.

*  For all of the other runs, we look at the <LogRecord/>s in the test
log proper, and if any of those lines has a severity level of FATAL or
MFATAL or the string "Exception" in the text, then a particular test is
determined to have failed. (There may be multiple FATAL|MFATAL|Exceptions
for a particular test.)

* Sum up all of the tests run, tests failed due to system errors, and
individual test failures.  For extra credit, I've been adding up the number
of tests known to pass just to make sure all of the numbers add up.
I'm paranoid like that.

* For each test failing with a system error, print out what test suite
failed.

* For each individual test that failed, print out what specific test
failed.

So really it's less of a transformation and more of a summary report.
Which is why I'm not certain that XSLT is the right tool for this.

What I do is create a big XML wrapper using lots of entities to wrap all
of the pseudo XML fragments (and the plain text file comes in as a big
CDATA section as a <LogRecord/>  element).

When it's all said and done, the XML looks like this:

<?xml version="1.0"?> 
<Log> 
 <Test name="testsuite1"> 
  <TestLog> 
   <LogRecord severity="STATUS"> Beginning test1 package Test Suite 1 scripts...</LogRecord>
   ...
   <LogRecord severity="STATUS"> good stuff is happening here</LogRecord>
  </TestLog> 
  <TEngine> 
   <LogRecord severity="STATUS"> Test Engine started...</LogRecord>
   ...
   <LogRecord severity="STATUS"> Test Engine stopped.</LogRecord>
  </TEngine> 
  <Server> 
   <LogRecord severity="STATUS"> Server started...</LogRecord>
   ...
   <LogRecord severity="STATUS"> Server shut down.</LogRecord>
  </Server> 
  <Wrapper> 
   <LogRecord> <![CDATA[several lines of text here]]></LogRecord>
  </Wrapper> 
 </Test> 
 ...
</Log> 

Now, since each of these logs is really essentially a flat file, I have
to mechanically determine logical break points in the logs.  The break
points are determined by a <LogRecord/>  where the text starts with the
string "Beginning" and has the string "scripts..." in it (Yeah, I know,
it's ugly).

So the style sheet I'm currently using is:

<?xml version="1.0" encoding="UTF-8"?> 
<!DOCTYPE xsl:stylesheet [
<!ENTITY recordpredicate "@severity='MFATAL' or @severity='FATAL' or contains(text(),'Exception')"> 
<!ENTITY recordfailure "LogRecord[&recordpredicate;]"> 
<!ENTITY systemcheck "child::*[self::TEngine or self::Server or self::Wrapper]"> 
<!ENTITY systemfailure "&systemcheck;/&recordfailure;"> 
<!ENTITY idtestrecord "[contains(substring-after(text(),'Beginning'),'scripts...')]"> 
<!ENTITY testscriptsid "TestLog/LogRecord&idtestrecord;"> 
<!ENTITY testname "normalize-space(substring-before(substring-after(.,'Beginning'),'package'))"> 
]> 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
  <xsl:output method="html" encoding="UTF-8" indent="yes"/> 
  <xsl:template match="/"> 
    <xsl:text> Test results:</xsl:text>
        <xsl:variable name="all-tests" select="/Log/Test/&testscriptsid;"/> 
        <xsl:variable name="system-failed-tests" select="/Log/Test/&systemfailure;/../../&testscriptsid;"/> 
        <xsl:variable name="not-system-failed-tests" select="$all-tests[count(.|$system-failed-tests)!=count($system-failed-tests)]"/> 
        <xsl:variable name="individual-failed-tests" select="$not-system-failed-tests/../&recordfailure;/preceding-sibling::*&idtestrecord;[position()=1]"/> 
        <xsl:variable name="not-individual-failed-tests" select="$not-system-failed-tests[count(.|$individual-failed-tests)!=count($individual-failed-tests)]"/> 

        There were <xsl:number value="count($all-tests)"/>  tests ran.
        There were <xsl:number value="count($system-failed-tests)"/>  tests that failed at a system level failure.
        There were <xsl:number value="count($individual-failed-tests)"/>  individual tests that failed.
        There were <xsl:number value="count($not-individual-failed-tests)"/>  tests known to pass.
        <xsl:text> 
System level failures:
</xsl:text> 
        <xsl:for-each select="$system-failed-tests/../../@name"> 
          <xsl:value-of select="."/> 
          <xsl:text> 
</xsl:text> 
        </xsl:for-each> 
        <xsl:text> 
 Which caused the following tests to fail:
</xsl:text> 
        <xsl:for-each select="$system-failed-tests"> 
          <xsl:value-of select="../../@name"/>  : <xsl:value-of select="&testname;"/>
          <xsl:text> 
</xsl:text> 
        </xsl:for-each> 
        <xsl:text> 
Individual failures:
</xsl:text> 
        <xsl:for-each select="$individual-failed-tests"> 
          <xsl:value-of select="../../@name"/>  : <xsl:value-of select="&testname;"/>
          <xsl:text> 
</xsl:text> 
        </xsl:for-each> 
        <xsl:text> 
Passed tests:
</xsl:text> 
        <xsl:for-each select="$not-individual-failed-tests"> 
          <xsl:value-of select="../../@name"/>  : <xsl:value-of select="&testname;"/>
          <xsl:text> 
</xsl:text> 
        </xsl:for-each> 
  </xsl:template> 
</xsl:stylesheet> 

Ok, actually for each test with a system failure, I'm listing the
individual tests as well.  The whole "Beginning ... package ... scripts..."
thing is quite fragile, I know, and we wanted to make sure we had
everything typed in correctly in the test scripts.

On the smaller test runs, this style sheet performs quite well.  Takes
about 30 seconds to generate the report.  Everyone is happy.

However, on a larger set of runs, it takes over 1 hour to generate the
report.  Everyone is unhappy.

Well, tracking down the culprit seems to be this particular query:

<xsl:variable name="individual-failed-tests"
  select="$not-system-failed-tests/../&recordfailure;/preceding-sibling::*&idtestrecord;[position()=1]"/> 

Which, in hindsight, makes sense.  It's pretty much an O(MxN) operation, as for
every FATAL|MFATAL|Exception it finds, it then scans backwards looking for
the magic strings.  And in the one particular file that's giving me issues,
there are 251789 LogRecords, 83 failing records and 102 tests.  So out of
250k of records, I'm really only interested in 200!

So, what can I do to speed up that process?

One approach I considered was a two-step process:  the first step would pull
out the records I'm interested in (essentially acting like a structured
grep) and the second would generate the summary against that.
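
A minimal sketch of what that first "structured grep" pass could look like,
assuming the wrapper layout shown above and assuming each <LogRecord/> holds
only text (so the string value "." behaves like text() here).  It is untested
against the large file, and is only meant to show the shape of the idea: copy
the container elements, and keep a <LogRecord/> only if it is a test marker
or a failure record.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Copy the container elements and their attributes, then descend
       into child elements only (whitespace text nodes are dropped). -->
  <xsl:template match="Log | Test | TestLog | TEngine | Server | Wrapper">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep a LogRecord only if it is a failure record (FATAL, MFATAL or
       "Exception") or a test marker ("Beginning ... scripts...").
       Everything else is silently discarded. -->
  <xsl:template match="LogRecord">
    <xsl:if test="@severity='FATAL' or @severity='MFATAL'
                  or contains(., 'Exception')
                  or contains(substring-after(., 'Beginning'), 'scripts...')">
      <xsl:copy-of select="."/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Since the output keeps the same Log/Test/TestLog structure, the existing
report style sheet should be able to run against it unchanged, and the
preceding-sibling step would then only have the surviving marker and failure
records to walk over instead of every record in the log.  Whether that saves
enough time would need measuring.  In ant the two passes could be chained
with two <xslt/> tasks; the target and file names below are made up for
illustration:

```xml
<target name="report">
  <!-- pass 1: filter down to markers and failures (hypothetical file names) -->
  <xslt in="all-logs.xml" out="filtered-logs.xml" style="filter.xsl"/>
  <!-- pass 2: the existing summary style sheet, run on the filtered file -->
  <xslt in="filtered-logs.xml" out="report.txt" style="report.xsl"/>
</target>
```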

I wish I knew how to make preceding-sibling work against a smaller subset
rather than the whole tree.

Any advice on how to make this a bit more efficient?

Thanks!
mrc
-- 
     Mike Castle      dalgoda@[...].com      www.netcom.com/~dalgoda/
    We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list