xsl-list
RE: [xsl] Stylesheet Optimization -- How to Make It Faster
by Michael Kay
Nov 28 2006 1:14AM
(a) It would be a nice courtesy if you could lay out the code so that we can read it.

(b) What XSLT processor are you using?

(c) The most obvious inefficiency is here:
    expand="{$abbreviations[.=$abbr]/following-sibling::expanded}"
    This would benefit from use of keys.
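
    As a rough, untested sketch of what a keyed lookup might look like: the
    key name "abbrev-by-value", the $publishers variable and the hard-coded
    $found strings are made up for illustration and are not part of the
    original stylesheet. Because the xsl:for-each iterates over strings,
    there is no context node inside it, so the three-argument form of key()
    is used to say which document to search.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Index every abbrev element by its string value
       (the key name is hypothetical). -->
  <xsl:key name="abbrev-by-value" match="abbrev" use="."/>

  <!-- Keep the reference document itself so it can be passed to key(). -->
  <xsl:variable name="publishers" as="document-node()"
      select="document('publishers_data.xml')"/>

  <!-- Demo entry point: $found stands in for the matched abbreviations
       ($search-str in the original stylesheet). -->
  <xsl:template match="/">
    <xsl:variable name="found" as="xs:string*"
        select="('P. Mil. Congr. XVIII', 'Inschriften von Priene')"/>
    <result>
      <xsl:for-each select="$found">
        <xsl:variable name="abbr" as="xs:string" select="."/>
        <!-- Indexed lookup instead of scanning $abbreviations[.=$abbr]. -->
        <abbr type="title"
            expand="{key('abbrev-by-value', $abbr, $publishers)/following-sibling::expanded}">
          <xsl:value-of select="$abbr"/>
        </abbr>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

    A processor typically builds the key index once per document, so each
    lookup avoids a scan of the whole abbreviation list. How much this helps
    overall depends on the processor and on how much time goes into the
    matches() tests rather than this lookup.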

Michael Kay
http://www.saxonica.com/
 

>  -----Original Message-----
>  From: Jeff Sese [mailto:jsese@[...].com] 
>  Sent: 28 November 2006 01:41
>  To: Xsl-List
>  Subject: [xsl] Stylesheet Optimization -- How to Make It Faster
>  
>  I have a stylesheet that adds mark-up to text nodes that 
>  match an abbreviation in a reference XML file. It's working 
>  nicely, but the processing time is very slow... I'm guessing 
>  because it's processing text nodes. An 800 KB file takes me 
>  about 25 mins to process, and I have around 800 files to 
>  process (varying file sizes, some are relatively small and 
>  some are fairly large). Is there any way to optimize my 
>  stylesheet so that it can process the files faster?
>  
>  Here is my stylesheet:
>  
>  <?xml version="1.0" encoding="UTF-8"?>
>  <xsl:stylesheet version="2.0" 
>  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
>  xmlns:xs="http://www.w3.org/2001/XMLSchema" 
>  xmlns:ati="http://www.asiatype.com/xslt-functions" 
>  exclude-result-prefixes="xs ati">
>  <xsl:output method="xml" version="1.0" encoding="UTF-8"/> 
>  <xsl:variable name="abbreviations" as="element()+" 
>  select="document('publishers_data.xml')/root/publisher/abbrev"/>
>  <xsl:template match="/">
>  <xsl:apply-templates/>
>  </xsl:template>
>  <xsl:template match="text()[ancestor::ab and 
>  not(ancestor::note[@id and @n and @lang])]"> <xsl:variable 
>  name="str" as="xs:string" select="."/> <xsl:choose> <xsl:when 
>  test="exists($abbreviations[matches($str,concat('(^|\W)(',ati:escape(.),')($|\W)'))])"> 
>  <xsl:variable name="search-str" as="xs:string+" 
>  select="$abbreviations[matches($str,concat('(^|\W)(',ati:escape(.),')($|\W)'))]"/> 
>  <xsl:variable name="replace" as="element()*"> <xsl:for-each 
>  select="$search-str"> <xsl:variable name="abbr" 
>  as="xs:string" select="."/> <abbr type="title" 
>  expand="{$abbreviations[.=$abbr]/following-sibling::expanded}"><xsl:value-of
>  select="$abbr"/></abbr>
>  </xsl:for-each>
>  </xsl:variable>
>  <xsl:sequence select="ati:replace-with-nodes($str, 
>  $search-str, $replace)"/> </xsl:when> <xsl:otherwise> 
>  <xsl:value-of select="$str"/> </xsl:otherwise> </xsl:choose> 
>  </xsl:template> <xsl:template 
>  match="@*|element()|comment()|processing-instruction()" 
>  mode="#all">
>  <xsl:copy>
>  <xsl:apply-templates select="@*|node()"/> </xsl:copy> 
>  </xsl:template> <xsl:function name="ati:replace-with-nodes" 
>  as="node()+"> <xsl:param name="input" as="xs:string"/> 
>  <xsl:param name="words-to-replace" as="xs:string*"/> 
>  <xsl:param name="replacement" as="node()*"/> <xsl:variable 
>  name="regex" select="string-join(for $w in $words-to-replace 
>  return concat('(', ati:escape($w), ')'),'|')"/> 
>  <xsl:analyze-string select="$input" regex="{$regex}"> 
>  <xsl:matching-substring> <xsl:variable name="i" 
>  as="xs:integer" select="(1 to 
>  count($words-to-replace))[regex-group(.)]"/>
>  <xsl:sequence select="$replacement[$i]"/> 
>  </xsl:matching-substring> <xsl:non-matching-substring> 
>  <xsl:value-of select="."/> </xsl:non-matching-substring> 
>  </xsl:analyze-string> </xsl:function> <xsl:function 
>  name="ati:escape"> <xsl:param name="s" as="xs:string"/> 
>  <xsl:sequence 
>  select="replace($s,'[\\\|\.\-\^\?\*\+\(\)\{\}\[\]\$]','\\$0')"/>
>  </xsl:function>
>  </xsl:stylesheet>
>  
>  Here's a short version of publishers_data.xml:
>  
>  <root>
>    <publisher>
>      <abbrev>Inschriften von Priene</abbrev>
>      <expanded>Inschriften von Priene</expanded>
>    </publisher>
>    <publisher>
>      <abbrev>P. Mil. Congr. XVIII</abbrev>
>      <expanded>Papiri documentari dell'Università Cattolica di Milano</expanded>
>    </publisher>
>    <publisher>
>      <abbrev>P. Jud. Des. Misc.</abbrev>
>      <expanded>Discoveries in the Judean Desert XXXVIII</expanded>
>    </publisher>
>    <!-- more publishers here -->
>  </root>
>  
>  Here's a snippet of the source XML:
>  
>  <!-- preceding::node() of ab -->
>  <ab lang="grk" n="1">
>  <foreign lang="grk">Î? γέγονε καÏ?á½° Ï?οὺÏ? Î?αρείοÏ?</foreign> 
>  <note place="margin">a c</note> <lb n="5"/> <foreign 
>  lang="grk">Ï?ρόνοÏ?Ï? Ï?οῦ μεÏ?á½° Î?αμβύÏ?ην βαÏ?ιλεύÏ?ανÏ?οÏ?
, á½?Ï?ε καὶ 
>  Î?ιονύÏ?ιοÏ? ἦν ὁ Î?ιλήÏ?ιοÏ?</foreign> <lb/>(III), <foreign 
>  lang="grk">ἐÏ?á½¶ Ï?á¿?Ï? ξ¯ε¯ á½?λÏ?μÏ?ιάδοÏ?</foreign> (520/16)<foreign 
>  lang="grk">Î? á¼±Ï?Ï?οριογράÏ?οÏ?. ῾Î?ρόδοÏ?οÏ? δὲ ὁ ῾Î?λι-</for
eign>  
>  <note place="margin">v</note> <lb/> <foreign 
>  lang="grk">καρναÏ?εὺÏ? á½ Ï?έληÏ?αι Ï?ούÏ?οÏ?, νεώÏ?εροÏ? ὤν. Î
ºÎ±á½¶ ἦν 
>  á¼?κοÏ?Ï?Ï?á½´Ï? ΠρÏ?Ï?αγόροÏ?</foreign> <note id="n7" n="7" lang="ger"> 
>  <foreign lang="grk">ὤνÎ? γέγονε γὰρ μεÏ?á¾½ αὐÏ?όν</foreign> 
>  A</note> <lb/> <foreign lang="grk">ὁ ῾Î?καÏ?αá¿?οÏ?. Ï?ρῶÏ?οÏ? δὲ 
>  á¼±Ï?Ï?ορίαν Ï?εζῶÏ? ἐξήνεγκε, Ï?Ï?γγραÏ?ὴν δὲ Φερεκύ
δηÏ?</foreign>  
>  <note id="n8â??9" n="8â??9" lang="ger"> <foreign 
>  lang="grk">Ï?ρῶÏ?οÏ?â??νοθεύεÏ?αι</foreign> wiederholt s. <foreign 
>  lang="grk">á½¶Ï?Ï?ορá¿?Ï?αι</foreign>, s. <foreign 
>  lang="grk">Ï?Ï?γγραÏ?εá¿?Ï?</foreign>.</note>
>  <lb/>(I 3). <foreign lang="grk">Ï?á½° γὰρ á¾½Î?κοÏ?Ï?ιλάοÏ?</foreign> 
>  (<link type="boj" targets="a002" n="BOJTEXT002_T_7">2 T 
>  7</link>) <foreign lang="grk">νοθεύεÏ?αι.</foreign> <note 
>  id="n9" n="9" lang="ger"> <foreign 
>  lang="grk">á¾½Î?κοÏ?Ï?ιλάοÏ?</foreign> Vossius <foreign 
>  lang="grk">á¾½Î?γηÏ?ιλάοÏ?</foreign> Suid</note> </ab>
>  <!-- following::node() of ab -->
>  
>  Note: all ab nodes appear at the same level (same depth) throughout.
>  
>  Any suggestions are welcome.
>  
>  Thanks,
>  --
>  Jeff
>  
>  --~------------------------------------------------------------------
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>  To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
>  or e-mail: <mailto:xsl-list-unsubscribe@[...].com>
>  --~--
>  

