ASPN ActiveState Programmer Network  
ActiveState, a division of Sophos
Title: Tokenize a string
Submitter: Mats Kindahl
Last Updated: 2002/09/11
Version no: 1.0
Category: String manipulation

 



Description:

How to tokenize a string by splitting it at any of several separator characters and processing each token individually.

Source:

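<xsl:template name="tokenize">
  <xsl:param name="str"/><!-- The string to process -->
  <xsl:param name="sep"/><!-- String containing legal token separators -->

  <!-- Build a run of ':' as long as $sep, so that translate() below maps
       every separator character to ':'. The repeat-string template is
       not part of this recipe and must be supplied separately. -->
  <xsl:variable name="rss">
    <xsl:call-template name="repeat-string">
      <xsl:with-param name="str" select="':'"/>
      <xsl:with-param name="cnt" select="string-length($sep)"/>
    </xsl:call-template>
  </xsl:variable>

  <!-- Note: a ':' already present in $str is also treated as a separator -->
  <xsl:call-template name="tokenize-1">
    <xsl:with-param name="pat" select="translate($str,$sep,$rss)"/>
  </xsl:call-template>
</xsl:template>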

<xsl:template name="tokenize-1">
  <xsl:param name="pat"/><!-- String with record separators inserted -->
  <xsl:choose>
    <xsl:when test="contains($pat,':')">
      <xsl:call-template name="process-token">
        <xsl:with-param name="token" select="substring-before($pat,':')"/>
      </xsl:call-template>
      <xsl:call-template name="tokenize-1">
        <xsl:with-param name="pat" select="substring-after($pat,':')"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="process-token">
        <xsl:with-param name="token" select="$pat"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- Example of how to use the template above -->

<xsl:template name="process-token">
  <xsl:param name="token"/>
  <xsl:text>Token: &quot;</xsl:text>
  <xsl:value-of select="$token"/>
  <xsl:text>&quot;&#10;</xsl:text>
</xsl:template>

<xsl:template match="split">
  <xsl:call-template name="tokenize">
    <xsl:with-param name="str" select="."/>
    <xsl:with-param name="sep" select="@at"/>
  </xsl:call-template>
</xsl:template>
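
The "tokenize" template calls a template named "repeat-string", which the recipe source does not include. The following is a minimal sketch of such a helper, assuming it is only expected to emit $str $cnt times; the body is an assumption and not the submitter's original code. It would be placed in the same stylesheet as the templates above.

```xml
<!-- Assumed helper: outputs $str repeated $cnt times (nothing if $cnt is 0 or less) -->
<xsl:template name="repeat-string">
  <xsl:param name="str"/><!-- The string to repeat -->
  <xsl:param name="cnt"/><!-- Number of repetitions -->
  <xsl:if test="$cnt &gt; 0">
    <xsl:value-of select="$str"/>
    <!-- Recurse with the counter decremented until it reaches zero -->
    <xsl:call-template name="repeat-string">
      <xsl:with-param name="str" select="$str"/>
      <xsl:with-param name="cnt" select="$cnt - 1"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
```

The recursion depth equals the number of separator characters, so it stays small for typical separator lists.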


Discussion:

Given the XML element

<split at=",;">foo,bar,whatever;;yes</split>

the code above will produce the text

Token: "foo"
Token: "bar"
Token: "whatever"
Token: ""
Token: "yes"

Observe that there are several different separator characters. If there is only one separator character, a simple solution using substring-before and substring-after suffices. The behaviour is similar to the C function 'strtok' or Perl's 'split' function (with a character class as first argument), except that adjacent separators yield an empty token here, as in the output above, whereas 'strtok' skips them.
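
The multi-separator case works because translate() first maps every separator character to the single character ':'. The small stand-alone stylesheet below illustrates that step in isolation; the input string and separators are hard-coded for illustration only, so it can be run against any XML document.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Each ',' and each ';' is replaced by the ':' at the same position
       in the third argument -->
  <xsl:template match="/">
    <xsl:value-of select="translate('foo,bar,whatever;;yes', ',;', '::')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This prints foo:bar:whatever::yes. The replacement string has to be as long as the separator list because translate() deletes any character of its second argument that has no counterpart in the third.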

This solution has the drawback that only one kind of processing is possible within a stylesheet: the "name" attribute of "call-template" must be a literal qualified name. It does not accept an attribute value template (e.g., "{$call}"), so the template to call cannot be chosen at run time.
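
One possible way around this limitation, sketched here under assumptions, is to keep a single "process-token" template and let an extra parameter select the processing with xsl:choose. The parameter name "kind" and its values 'quote' and 'length' are hypothetical and not part of the recipe; "tokenize" and "tokenize-1" would each have to declare the same parameter and pass it along.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical variant of process-token: "kind" selects the processing -->
  <xsl:template name="process-token">
    <xsl:param name="token"/>
    <xsl:param name="kind" select="'quote'"/>
    <xsl:choose>
      <xsl:when test="$kind = 'length'">
        <!-- Print only the length of the token -->
        <xsl:value-of select="string-length($token)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <!-- Default: same output as the recipe's process-token -->
        <xsl:text>Token: "</xsl:text>
        <xsl:value-of select="$token"/>
        <xsl:text>"&#10;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demonstration: call the template once per kind with a fixed token -->
  <xsl:template match="/">
    <xsl:call-template name="process-token">
      <xsl:with-param name="token" select="'foo'"/>
    </xsl:call-template>
    <xsl:call-template name="process-token">
      <xsl:with-param name="token" select="'foo'"/>
      <xsl:with-param name="kind" select="'length'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

The set of processing kinds is still fixed when the stylesheet is written, but several of them can coexist in one stylesheet.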




Number of comments: 1

exslt, Wolfgang Werner, 2004/09/30
There is also an EXSLT function providing the same feature: str:tokenize (http://www.exslt.org/str/functions/tokenize/str.tokenize.html)
Be sure to use an EXSLT-compliant processor, such as xsltproc or 4XSLT. Here's an example of how I used it:

<!-- XSLT -->
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
  xmlns:str="http://exslt.org/strings"
  extension-element-prefixes="str"
  exclude-result-prefixes="str">

  <xsl:template match='/foo'>
    <tokens>
      <xsl:for-each select='str:tokenize(@tokens, " ")'>
        <token><xsl:value-of select='.'/></token>
      </xsl:for-each>
    </tokens>
  </xsl:template>

</xsl:stylesheet>

<!-- XML -->
<?xml version="1.0"?>
<foo tokens='bar baz and some more'/>

<!-- Output -->
<?xml version="1.0"?>
<tokens>
  <token>bar</token>
  <token>baz</token>
  <token>and</token>
  <token>some</token>
  <token>more</token>
</tokens>

© 2006 ActiveState Software Inc. All rights reserved.