xsl-list
[xsl] A Functional Tokenizer (Was: Re: Looping over a CSV in XSL)
by Dimitre Novatchev
Nov 20 2001 5:03AM

>  Now in XSL land I want to iterate over a
>  nodelist and compare some attribute of the current node to each value in the
>  CSV for equality.

You have a CSV string (a list of characters). You need to inspect every character
and gradually accumulate the result: a list of words that were delimited by
special characters (in this particular case by comma and/or white space).

A "generic accumulator" function over the elements of a list is the "foldl" function,
the classic king of generic list processing. We pass to "foldl", as a parameter, a
function that will be called with two arguments: the result accumulated so far
(the list of tokens) and the next character in the input string.
Based on these two arguments, this function updates the accumulated result
appropriately: it either appends the character to the last token, or "cuts" the
last token and starts a new one.
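
For readers who find it easier to see the idea in a general-purpose language, the
shape of such a fold over a string can be sketched in Python. This is only an
illustration: the name str_foldl is chosen here to echo the template name, and the
body is an assumption about what a left fold over characters does in general, not a
transcription of the imported str-foldl.xsl, which is not shown in this message.

```python
def str_foldl(func, acc, s):
    """Left fold over the characters of s.

    func(acc, ch) receives the result accumulated so far and the next
    character, and returns the updated result.
    """
    for ch in s:
        acc = func(acc, ch)
    return acc


# Example: count characters with a function that ignores the character itself.
length = str_foldl(lambda acc, ch: acc + 1, 0, "a,b")
print(length)  # 3
```

The tokenizer is then just a particular choice of func whose accumulator is the
list of tokens.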

And here's the solution:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:str-split2words-func="f:str-split2words-func"
exclude-result-prefixes="xsl msxsl str-split2words-func"
> 

   <xsl:import href="str-foldl.xsl"/> 

   <str-split2words-func:str-split2words-func/> 

   <xsl:param name="pDelimiters" select="', &#9;&#10;&#13;'"/> 

   <xsl:output indent="yes" omit-xml-declaration="yes"/> 
   
    <xsl:template match="/"> 
      <xsl:call-template name="str-split-to-words"> 
        <xsl:with-param name="pStr" select="/*/*"/> 
      </xsl:call-template> 
    </xsl:template> 

    <xsl:template name="str-split-to-words"> 
      <xsl:param name="pStr" select="dummy"/> 
      
      <xsl:variable name="vsplit2wordsFun"
                    select="document('')/*/str-split2words-func:*[1]"/> 

      <xsl:call-template name="str-foldl"> 
        <xsl:with-param name="pFunc" select="$vsplit2wordsFun"/> 
        <xsl:with-param name="pStr" select="$pStr"/> 
        <xsl:with-param name="pA0" select="/.."/> 
      </xsl:call-template> 

    </xsl:template> 

    <xsl:template match="str-split2words-func:*"> 
      <xsl:param name="arg1" select="/.."/> 
      <xsl:param name="arg2"/> 
         
      <xsl:choose> 
        <xsl:when test="contains($pDelimiters, $arg2)"> 
            <xsl:copy-of select="$arg1/*"/> 
            <xsl:if test="string($arg1/*[last()])"> 
              <word/> 
            </xsl:if> 
        </xsl:when> 
        <xsl:otherwise> 
          <xsl:copy-of select="$arg1/*[position() &lt; last()]"/> 
          <word><xsl:value-of select="concat($arg1/*[last()], $arg2)"/></word>
        </xsl:otherwise> 
      </xsl:choose> 
    </xsl:template> 

</xsl:stylesheet> 

When applied to the following XML document:

<contents>
  <csv>Fredrick, Aaron, john, peter</csv>
</contents>

The result is:

<word>Fredrick</word><word>Aaron</word><word>john</word><word>peter</word>
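
To trace the logic of the function template without an XSLT processor at hand, here
is a rough Python analogue. It is a sketch under assumptions: the names split_step
and split_to_words are made up for this illustration, a Python list of strings stands
in for the sequence of word elements, and functools.reduce plays the role of
str-foldl.

```python
from functools import reduce

DELIMITERS = ", \t\n\r"  # the same characters as the pDelimiters parameter


def split_step(words, ch):
    """One fold step: words is the token list so far, ch the next character."""
    if ch in DELIMITERS:
        # Delimiter: "cut" the last token, but only if it is non-empty,
        # so that a run of delimiters does not create extra empty words.
        if words and words[-1]:
            return words + [""]
        return words
    # Ordinary character: append it to the last token (or start the first one).
    last = words[-1] if words else ""
    return words[:-1] + [last + ch]


def split_to_words(s):
    return reduce(split_step, s, [])


print(split_to_words("Fredrick, Aaron, john, peter"))
# ['Fredrick', 'Aaron', 'john', 'peter']
```

One consequence of this logic is that an input ending in a delimiter leaves an empty
last word; a caller who minds can drop it afterwards.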

We need just one more small step to obtain the ultimate tokenizer: if we can pass
the list of delimiters to the accumulating function that str-foldl receives as its
parameter, then we have the most general tokenizer function. You'll never need to
code your own tokenizer again; just call this one with your parameters.

Since the function receives only the accumulated result and the next character, the
solution is to always carry the list of delimiters as the first element of the
"accumulator" list:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:str-split2words-func="f:str-split2words-func"
exclude-result-prefixes="xsl msxsl str-split2words-func"
> 

   <xsl:import href="str-foldl.xsl"/> 

   <str-split2words-func:str-split2words-func/> 

   <xsl:output indent="yes" omit-xml-declaration="yes"/> 
   
    <xsl:template match="/"> 
      <xsl:call-template name="str-split-to-words"> 
        <xsl:with-param name="pStr" select="/*/*"/> 
        <xsl:with-param name="pDelimiters" select="', &#9;&#10;&#13;'"/> 
      </xsl:call-template> 
    </xsl:template> 

    <xsl:template name="str-split-to-words"> 
      <xsl:param name="pStr"/> 
      <xsl:param name="pDelimiters"/> 
      
      <xsl:variable name="vsplit2wordsFun"
                    select="document('')/*/str-split2words-func:*[1]"/> 
                    
      <xsl:variable name="vrtfParams"> 
       <delimiters><xsl:value-of select="$pDelimiters"/></delimiters>
      </xsl:variable> 

      <xsl:variable name="vResult"> 
        <xsl:call-template name="str-foldl"> 
          <xsl:with-param name="pFunc" select="$vsplit2wordsFun"/> 
          <xsl:with-param name="pStr" select="$pStr"/> 
          <xsl:with-param name="pA0" select="msxsl:node-set($vrtfParams)"/> 
        </xsl:call-template> 
      </xsl:variable> 
      
      <xsl:copy-of select="msxsl:node-set($vResult)/word"/> 

    </xsl:template> 

    <xsl:template match="str-split2words-func:*"> 
      <xsl:param name="arg1" select="/.."/> 
      <xsl:param name="arg2"/> 
         
      <xsl:copy-of select="$arg1/*[1]"/> 
      <xsl:copy-of select="$arg1/word[position() != last()]"/> 
      
      <xsl:choose> 
        <xsl:when test="contains($arg1/*[1], $arg2)"> 
          <xsl:if test="string($arg1/word[last()])"> 
             <xsl:copy-of select="$arg1/word[last()]"/> 
          </xsl:if> 
          <word/> 
        </xsl:when> 
        <xsl:otherwise> 
          <word><xsl:value-of select="concat($arg1/word[last()], $arg2)"/></word>
        </xsl:otherwise> 
      </xsl:choose> 
    </xsl:template> 

</xsl:stylesheet> 

And with the same XML source document, here's the result:

<word>Fredrick</word>
<word>Aaron</word>
<word>john</word>
<word>peter</word>
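
The trick of carrying the delimiters inside the accumulator can also be sketched in
Python, for anyone who wants to step through it. Again this is an illustration under
assumptions, not the stylesheet itself: the names split_step and tokenize are invented
here, a list whose first element is the delimiter string stands in for the
delimiters element followed by the word elements, and functools.reduce stands in for
str-foldl.

```python
from functools import reduce


def split_step(acc, ch):
    """acc is [delimiters, word1, word2, ...]; slot 0 is passed along unchanged."""
    delimiters, words = acc[0], acc[1:]
    finished = words[:-1]                 # every word except the last one
    last = words[-1] if words else ""     # the word currently being built
    if ch in delimiters:
        # Keep the last word only if it is non-empty, then open a new empty word.
        kept = [last] if last else []
        return [delimiters] + finished + kept + [""]
    # Ordinary character: extend the word being built.
    return [delimiters] + finished + [last + ch]


def tokenize(s, delimiters):
    acc = reduce(split_step, s, [delimiters])
    return acc[1:]  # drop the delimiters slot, like selecting only the words


print(tokenize("Fredrick, Aaron, john, peter", ", \t\n\r"))
# ['Fredrick', 'Aaron', 'john', 'peter']
```

Because the step function now reads the delimiters from its first argument, the same
function works for any delimiter set, for example tokenize("a;b", ";").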

Hope this helped.

Cheers,
Dimitre Novatchev.




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
