Title: Convert a flat xml file into a structured
Submitter: Friedhelm Duesterhoeft (other recipes)
Last Updated: 2002/09/25
Version no: 1.0
Category: Miscellaneous

 

Description:

Most CSV-to-XML converters produce flat XML, which often needs to be transformed into a more usable (i.e. structured) representation or made to match your existing stylesheets. Tags have to be renamed; others have to be inserted or dropped. This stylesheet does it all. You do not even have to be an XML programmer to customize its behavior.

Source:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">
<xsl:variable name="Version">0.2.0</xsl:variable>

<xsl:output indent="yes"/>

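<!-- Returns, as a string, the XPath that addresses the input field (or field
     attribute) identified by $which, e.g. row[$pos]/field1. With
     previousrow=true() the path addresses the preceding row ($pos - 1). -->
<xsl:template name="getassocpath">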
<xsl:param name="which"/>
<xsl:param name="previousrow" select="false()"/>
  <xsl:choose>
    <xsl:when test="/dsd/input/element/element/element[text()=$which]">
       <xsl:for-each select="/dsd/input/element/element/element[text()=$which]">
         <xsl:variable name="path">
             <xsl:choose>
                <xsl:when test="$previousrow=true()">
                   <xsl:value-of select="concat(../@name,'[$pos - 1]/')"/>
                </xsl:when>
                <xsl:otherwise>
                   <xsl:value-of select="concat(../@name,'[$pos]/')"/>
                </xsl:otherwise>
             </xsl:choose>
         </xsl:variable>
         <xsl:variable name="node">
            <xsl:value-of select="@name"/>
         </xsl:variable>
         <xsl:variable name="nodecount">
            <xsl:value-of select="count(/dsd/input/element/element/element[@name=$node])"/>
         </xsl:variable>
         <xsl:choose>
            <xsl:when test="$nodecount=1">
              <xsl:value-of select="concat($path,$node)"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:variable name="pos">
                 <xsl:for-each select="/dsd/input/element/element/element[@name=$node]">
                    <xsl:if test="text()=$which">
                       <xsl:value-of select="position()"/>
                    </xsl:if>
                 </xsl:for-each>
              </xsl:variable>
              <xsl:value-of select="concat($path,$node,'[',$pos,']')"/>
            </xsl:otherwise>
         </xsl:choose>
       </xsl:for-each>
    </xsl:when>
    <xsl:otherwise>
       <xsl:for-each select="/dsd/input/element/element/element[attribute=$which]">
          <xsl:variable name="attribute">
             <xsl:value-of select="attribute/@name"/>
          </xsl:variable>
          <xsl:variable name="path">
             <xsl:choose>
                <xsl:when test="$previousrow=true()">
                   <xsl:value-of select="concat(../@name,'[$pos - 1]/')"/>
                </xsl:when>
                <xsl:otherwise>
                   <xsl:value-of select="concat(../@name,'[$pos]/')"/>
                </xsl:otherwise>
             </xsl:choose>
          </xsl:variable>
          <xsl:variable name="node">
             <xsl:value-of select="@name"/>
          </xsl:variable>
         <xsl:variable name="nodecount">
            <xsl:value-of select="count(/dsd/input/element/element/element[@name=$node])"/>
         </xsl:variable>
          <xsl:choose>
             <xsl:when test="$nodecount=1">
                <xsl:value-of select="concat($path,$node,'/@',$attribute)"/>
             </xsl:when>
             <xsl:otherwise>
                <xsl:for-each select="../element/attribute[@name=$attribute]">
                   <xsl:variable name="pos" select="position()"/>
                   <xsl:if test=".=$which">
                          <xsl:value-of select="concat($path,$node,'[',$pos,']','/@',$attribute)"/>
                   </xsl:if>
                </xsl:for-each>
             </xsl:otherwise>
          </xsl:choose>
       </xsl:for-each>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

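<!-- Returns $ogroupstring extended (comma-separated) by the path of the current
     element's grouping field when grouping is on and a field is given;
     otherwise returns $ogroupstring unchanged. -->
<xsl:template name="ngroupstring">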
<xsl:param name="ogroupstring"/>
  <xsl:choose>
     <xsl:when test="not(grouping/@on = 'yes')">
        <xsl:value-of select="$ogroupstring"/>
     </xsl:when>
     <xsl:otherwise>
        <xsl:choose>
           <xsl:when test="grouping/@field">
              <xsl:choose>
                 <xsl:when test="$ogroupstring=''">
                    <xsl:call-template name="getassocpath">
                       <xsl:with-param name="which" select="grouping/@field"/>
                    </xsl:call-template>
                 </xsl:when>
                 <xsl:otherwise>
                    <xsl:variable name="ngroupstring">
                       <xsl:call-template name="getassocpath">
                          <xsl:with-param name="which" select="grouping/@field"/>
                       </xsl:call-template>
                    </xsl:variable>
                    <xsl:value-of select="concat($ogroupstring,',',$ngroupstring)"/>
                 </xsl:otherwise>
              </xsl:choose>
           </xsl:when>
           <xsl:otherwise>
              <xsl:value-of select="$ogroupstring"/>
           </xsl:otherwise>
        </xsl:choose>
     </xsl:otherwise>
  </xsl:choose>
</xsl:template>

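<!-- Copies the attribute definitions of the current output element. An attribute
     with text/@DataSource becomes an attribute value template of the form {path}. -->
<xsl:template name="attributes">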
  <xsl:for-each select="attribute">
     <xsl:choose>
       <xsl:when test="text/@DataSource">
          <xsl:attribute name="{@name}">
                    <xsl:text>{</xsl:text>
                    <xsl:call-template name="getassocpath">
                       <xsl:with-param name="which" select="text/@DataSource"/>
                    </xsl:call-template>
                    <xsl:text>}</xsl:text>
          </xsl:attribute>
       </xsl:when>
       <xsl:otherwise>
          <xsl:attribute name="{@name}"><xsl:value-of select="."/></xsl:attribute>
       </xsl:otherwise>
     </xsl:choose>
  </xsl:for-each>
</xsl:template>

<xsl:template name="element-call">
<xsl:param name="groupstring"/>
   <xsl:variable name="ngroupstring">
      <xsl:choose>
         <xsl:when test="string-length($groupstring) &gt; 0">
            <xsl:choose>
               <xsl:when test="starts-with($groupstring,',')">
                   <xsl:value-of select="substring-after($groupstring,',')"/>
               </xsl:when>
               <xsl:when test="contains($groupstring,',')">
                  <xsl:value-of select="concat('concat(',$groupstring,')')"/>
               </xsl:when>
               <xsl:otherwise>
                  <xsl:value-of select="$groupstring"/>
               </xsl:otherwise>
            </xsl:choose>
         </xsl:when>
         <xsl:otherwise>
            <xsl:value-of select="'$oldgroup'"/>
         </xsl:otherwise>
      </xsl:choose>
   </xsl:variable>
   <xsl:element name="xsl:call-template">
      <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
      <xsl:element name="xsl:with-param">
         <xsl:attribute name="name">pos</xsl:attribute>
         <xsl:attribute name="select">$pos</xsl:attribute>
      </xsl:element>
      <xsl:element name="xsl:with-param">
         <xsl:attribute name="name">eof</xsl:attribute>
         <xsl:attribute name="select">$eof</xsl:attribute>
      </xsl:element>
      <xsl:element name="xsl:with-param">
      <xsl:attribute name="name">oldgroup</xsl:attribute>
          <xsl:attribute name="select"><xsl:value-of select="$ngroupstring"/></xsl:attribute>
      </xsl:element>
    </xsl:element>  
</xsl:template>

<xsl:template name="element-calls">
<xsl:param name="grouping"/>
<xsl:param name="groupstring"/>
   <xsl:for-each select="element">
      <xsl:choose>
         <xsl:when test="$grouping=true()">
            <xsl:element name="{@name}">
               <xsl:call-template name="attributes"/>
               <xsl:call-template name="element-call">
                  <xsl:with-param name="groupstring" select="$groupstring"/>
               </xsl:call-template>
            </xsl:element>
         </xsl:when>
         <xsl:otherwise>
            <xsl:call-template name="element-call">
               <xsl:with-param name="groupstring" select="$groupstring"/>
            </xsl:call-template>
         </xsl:otherwise> 
       </xsl:choose>
   </xsl:for-each>
</xsl:template>

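<!-- Emits the xsl:call-template that re-invokes the current element's template
     for the next input row ($pos + 1). -->
<xsl:template name="next-element">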
   <xsl:element name="xsl:call-template">
      <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
      <xsl:element name="xsl:with-param">
          <xsl:attribute name="name">pos</xsl:attribute>
          <xsl:attribute name="select">$pos + 1</xsl:attribute>
      </xsl:element>
      <xsl:element name="xsl:with-param">
          <xsl:attribute name="name">eof</xsl:attribute>
          <xsl:attribute name="select">$eof</xsl:attribute>
      </xsl:element>
      <xsl:element name="xsl:with-param">
          <xsl:attribute name="name">oldgroup</xsl:attribute>
          <xsl:attribute name="select">$oldgroup</xsl:attribute>
      </xsl:element>
   </xsl:element>
</xsl:template>

<xsl:template name="element-skip">
<xsl:param name="self"/>
<xsl:param name="subelements"/>
<xsl:param name="groupstring"/>
   <xsl:choose>
      <xsl:when test="$self=true()">
          <xsl:element name="{@name}">
              <xsl:call-template name="attributes"/>
              <xsl:call-template name="element-calls">
                 <xsl:with-param name="grouping" select="$subelements"/>
                 <xsl:with-param name="groupstring" select="$groupstring"/>
              </xsl:call-template>
           </xsl:element>
       </xsl:when>
       <xsl:otherwise>
           <xsl:call-template name="element-calls">
               <xsl:with-param name="grouping" select="$subelements"/>
               <xsl:with-param name="groupstring" select="$groupstring"/>
           </xsl:call-template>
       </xsl:otherwise>
   </xsl:choose>
</xsl:template>

<xsl:template name="element-template">
<xsl:param name="skip" select="true()"/>
<xsl:param name="self"/>
<xsl:param name="subelements"/>
<xsl:param name="groupstring"/>
   <xsl:choose>
      <xsl:when test="$skip=false()">
         <xsl:choose>
            <xsl:when test="$self=true()">
               <xsl:element name="{@name}">
                  <xsl:call-template name="attributes"/>
                  <xsl:call-template name="element-calls">
                     <xsl:with-param name="grouping" select="$subelements"/>
                  </xsl:call-template>
               </xsl:element>
            </xsl:when>
            <xsl:otherwise>
               <xsl:call-template name="element-calls">
                  <xsl:with-param name="grouping" select="$subelements"/>
                </xsl:call-template>
            </xsl:otherwise>
         </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
         <xsl:element name="xsl:if">
           <xsl:attribute name="test">$pos &lt;= $eof</xsl:attribute>
           <xsl:element name="xsl:variable">
              <xsl:attribute name="name">newgroup</xsl:attribute>
              <xsl:if test="string-length($groupstring) &gt; 0">
              <xsl:attribute name="select">
                 <xsl:choose>
                    <xsl:when test="contains($groupstring,',')">
                       <xsl:value-of select="concat('concat(',$groupstring,')')"/>
                    </xsl:when>
                    <xsl:otherwise>
                       <xsl:value-of select="$groupstring"/>
                    </xsl:otherwise>
                 </xsl:choose>
              </xsl:attribute>
              </xsl:if>
           </xsl:element>
           <xsl:element name="xsl:if">
              <xsl:attribute name="test">$oldgroup = $newgroup</xsl:attribute>
              <xsl:choose>
                  <xsl:when test="grouping/@field">
                     <xsl:element name="xsl:variable">
                         <xsl:attribute name="name">old</xsl:attribute>
                         <xsl:attribute name="select">
                            <xsl:call-template name="getassocpath">
                               <xsl:with-param name="which" select="grouping/@field"/>
                               <xsl:with-param name="previousrow" select="true()"/>
                            </xsl:call-template>
                         </xsl:attribute>
                     </xsl:element>
                     <xsl:element name="xsl:variable">
                        <xsl:attribute name="name">new</xsl:attribute>
                        <xsl:attribute name="select">
                            <xsl:call-template name="getassocpath">
                               <xsl:with-param name="which" select="grouping/@field"/>
                            </xsl:call-template>
                        </xsl:attribute>
                     </xsl:element>
                     <xsl:element name="xsl:if">
                        <xsl:attribute name="test">not($old = $new)</xsl:attribute>
                        <xsl:call-template name="element-skip">
                           <xsl:with-param name="self" select="$self"/>
                           <xsl:with-param name="subelements" select="$subelements"/>
                           <xsl:with-param name="groupstring">
                               <xsl:call-template name="ngroupstring">
                                  <xsl:with-param name="ogroupstring" select="$groupstring"/>
                               </xsl:call-template>
                           </xsl:with-param>
                        </xsl:call-template>
                     </xsl:element>
                  </xsl:when>
                  <xsl:otherwise>
                     <xsl:call-template name="element-skip">
                        <xsl:with-param name="self" select="$self"/>
                        <xsl:with-param name="subelements" select="$subelements"/>
                     </xsl:call-template>
                  </xsl:otherwise>
              </xsl:choose>
              <xsl:call-template name="next-element"/>
           </xsl:element>
        </xsl:element>
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>

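<!-- Emits the named template for the current output element, then processes
     its child elements recursively. -->
<xsl:template name="element">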
<xsl:param name="grouping"/>
<xsl:param name="self"/>
<xsl:param name="skip"/>
<xsl:param name="subelements"/>
<xsl:param name="groupstring"/>
   <xsl:element name="xsl:template">
      <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
      <xsl:element name="xsl:param">
          <xsl:attribute name="name">pos</xsl:attribute>
      </xsl:element>
      <xsl:element name="xsl:param">
          <xsl:attribute name="name">eof</xsl:attribute>
      </xsl:element>
      <xsl:element name="xsl:param">
          <xsl:attribute name="name">oldgroup</xsl:attribute>
      </xsl:element>
      <xsl:call-template name="element-template">
         <xsl:with-param name="self" select="$self"/>
         <xsl:with-param name="subelements" select="$subelements"/>
         <xsl:with-param name="skip" select="$skip"/>
         <xsl:with-param name="groupstring" select="$groupstring"/>
      </xsl:call-template>
   </xsl:element>
   <xsl:variable name="ngroupstring">
      <xsl:call-template name="ngroupstring">
         <xsl:with-param name="ogroupstring" select="$groupstring"/>
      </xsl:call-template>
   </xsl:variable>
   <xsl:for-each select="element">
       <xsl:variable name="nself" select="$subelements=false()"/>
       <xsl:variable name="nsubelements" select="$grouping=true() and (grouping/@on='yes' and element/grouping/@on='yes' and not(element/grouping/@field))"/>
       <xsl:variable name="nskip" select="$grouping=true() and ((grouping/@on='yes' and not(element/grouping/@on='yes')) or (not(grouping/@on='yes') and ../grouping/@field) or (grouping/@field))"/>
       <xsl:call-template name="element">
          <xsl:with-param name="grouping" select="$grouping=true() and grouping/@on"/>
          <xsl:with-param name="self" select="$nself"/>
          <xsl:with-param name="subelements" select="$nsubelements"/>
          <xsl:with-param name="skip" select="$nskip"/>
          <xsl:with-param name="groupstring" select="$ngroupstring"/>
       </xsl:call-template>
   </xsl:for-each>
</xsl:template>


<xsl:template match="/">
  <xsl:element name="xsl:stylesheet" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <!-- No xsl:attribute for xmlns:xsl is needed (or allowed): the namespace declaration is written automatically because the element name is in the xsl namespace. -->
     <xsl:attribute name="version">1.0</xsl:attribute>
     <xsl:comment>

    XSL Flat File Conversion V<xsl:value-of select="$Version"/>
    Copyright (C) 2002 by Friedhelm Duesterhoeft
    Generated with fx2sx.xsl

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

</xsl:comment>

   <xsl:element name="xsl:output"><xsl:attribute name="indent">yes</xsl:attribute></xsl:element>
   <xsl:variable name="root" select="/dsd/input/element[1]/@name"/>
   <xsl:variable name="row" select="/dsd/input/element[1]/element[1]/@name"/>
   <xsl:for-each select="/dsd/output/element">
     <xsl:call-template name="element">
        <xsl:with-param name="grouping" select="grouping/@on='yes'"/>
        <xsl:with-param name="self" select="(grouping/@on='yes' and grouping/@field) or not(grouping/@on)"/>
        <xsl:with-param name="subelements" select="grouping/@on='yes' and element/grouping/@on='yes' and not(grouping/@field) and not(element/grouping/@field)"/>
        <xsl:with-param name="skip" select="not(grouping/@on='yes') or grouping/@field or not(element/grouping/@on='yes')"/>
     </xsl:call-template>
   </xsl:for-each>

  <xsl:element name="xsl:template">
       <xsl:attribute name="match"><xsl:value-of select="$root"/></xsl:attribute>
       <xsl:for-each select="/dsd/output/element">
         <xsl:choose>
            <xsl:when test="grouping/@on and not(grouping/@field)">
               <xsl:element name="{@name}">
                 <xsl:call-template name="attributes"/>
                 <xsl:element name="xsl:call-template">
                    <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
                    <xsl:element name="xsl:with-param">
                       <xsl:attribute name="name">pos</xsl:attribute>
                       <xsl:attribute name="select">1</xsl:attribute>
                    </xsl:element>
                    <xsl:element name="xsl:with-param">
                       <xsl:attribute name="name">eof</xsl:attribute>
                       <xsl:attribute name="select">count(<xsl:value-of select="$row"/>)</xsl:attribute>
                    </xsl:element>
                 </xsl:element>
               </xsl:element>
            </xsl:when>
            <xsl:otherwise>
               <xsl:element name="xsl:call-template">
                <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
                <xsl:element name="xsl:with-param">
                    <xsl:attribute name="name">pos</xsl:attribute>
                    <xsl:attribute name="select">1</xsl:attribute>
                </xsl:element>
                <xsl:element name="xsl:with-param">
                    <xsl:attribute name="name">eof</xsl:attribute>
                    <xsl:attribute name="select">count(<xsl:value-of select="$row"/>)</xsl:attribute>
                </xsl:element>
              </xsl:element>
            </xsl:otherwise>
          </xsl:choose>
       </xsl:for-each>
   </xsl:element>

   <xsl:element name="xsl:template">
      <xsl:attribute name="match">/</xsl:attribute>
      <xsl:element name="xsl:apply-templates">
         <xsl:attribute name="select"><xsl:value-of select="$root"/></xsl:attribute>
      </xsl:element>
   </xsl:element>

  </xsl:element>
</xsl:template>

</xsl:stylesheet>


Discussion:

The output produced by CSV-to-XML converters or XSQL processors usually looks like:

value1
value2
value3

However, you would like to have it as

value-of-field3

The creation of a suitable XSL stylesheet for this job can be automated with this recipe's stylesheet.
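
For a sense of what such a stylesheet involves when written by hand, here is a minimal sketch. It is not output of the recipe; it only uses the same idea of walking the rows by position with recursive named templates. The element names customers, customer and order, the meaning of field1 and field2, and the assumption that the rows arrive sorted by field1 are all hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hand-written sketch: groups rows/row records by field1 (input assumed sorted by field1). -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <xsl:apply-templates select="rows"/>
  </xsl:template>

  <!-- Entry point: start at row 1 and stop after the last row. -->
  <xsl:template match="rows">
    <customers>
      <xsl:call-template name="customer">
        <xsl:with-param name="pos" select="1"/>
        <xsl:with-param name="eof" select="count(row)"/>
      </xsl:call-template>
    </customers>
  </xsl:template>

  <!-- Visits every row; opens a new customer element whenever field1
       differs from the previous row. -->
  <xsl:template name="customer">
    <xsl:param name="pos"/>
    <xsl:param name="eof"/>
    <xsl:if test="$pos &lt;= $eof">
      <xsl:variable name="old" select="row[$pos - 1]/field1"/>
      <xsl:variable name="new" select="row[$pos]/field1"/>
      <!-- For the first row $old is an empty node-set, so the comparison
           is false and a group is opened. -->
      <xsl:if test="not($old = $new)">
        <customer name="{$new}">
          <xsl:call-template name="order">
            <xsl:with-param name="pos" select="$pos"/>
            <xsl:with-param name="eof" select="$eof"/>
            <xsl:with-param name="group" select="string($new)"/>
          </xsl:call-template>
        </customer>
      </xsl:if>
      <xsl:call-template name="customer">
        <xsl:with-param name="pos" select="$pos + 1"/>
        <xsl:with-param name="eof" select="$eof"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Emits one order element per consecutive row that still belongs to $group. -->
  <xsl:template name="order">
    <xsl:param name="pos"/>
    <xsl:param name="eof"/>
    <xsl:param name="group"/>
    <xsl:if test="$pos &lt;= $eof and row[$pos]/field1 = $group">
      <order product="{row[$pos]/field2}"/>
      <xsl:call-template name="order">
        <xsl:with-param name="pos" select="$pos + 1"/>
        <xsl:with-param name="eof" select="$eof"/>
        <xsl:with-param name="group" select="$group"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Recursion stands in for a loop counter here because XSLT 1.0 variables cannot be reassigned. Every additional grouping level needs another template of this kind, which is the repetitive part worth generating.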

First, create an XML file that describes your data structures (input and output). The DTD for this file is available at http://www.msdd.net/fx2sx/sx.dtd (in fact, you must download it for the process to run).
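
As a rough illustration, a description file might look like the sketch below. Its structure is inferred only from the XPath expressions in the stylesheet above (/dsd/input/element/element/element, /dsd/output/element, grouping/@on, grouping/@field, attribute/text/@DataSource); it has not been checked against sx.dtd, which remains the authoritative definition and may require more than is shown. The names customers, customer, order, name and product, and the data-source ids custname and prodname, are hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical description file; not validated against sx.dtd. -->
<dsd>
  <input>
    <!-- Three nesting levels of the flat input: root, row, fields. -->
    <element name="rows">
      <element name="row">
        <!-- The text content is the id that the output section refers to. -->
        <element name="field1">custname</element>
        <element name="field2">prodname</element>
      </element>
    </element>
  </input>
  <output>
    <!-- Wrapper element, written once. -->
    <element name="customers">
      <grouping on="yes"/>
      <!-- A new customer element starts whenever custname changes. -->
      <element name="customer">
        <grouping on="yes" field="custname"/>
        <attribute name="name"><text DataSource="custname"/></attribute>
        <!-- One order element per input row inside the current group. -->
        <element name="order">
          <attribute name="product"><text DataSource="prodname"/></attribute>
        </element>
      </element>
    </element>
  </output>
</dsd>
```

Read against the stylesheet, an attribute with a DataSource is turned into an attribute value template such as {row[$pos]/field1}, while an attribute without one is copied literally.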

Second, apply the above stylesheet to this XML file. The result is another XSL stylesheet.
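
The overall shape of the generated stylesheet can be read off the match="/" template above. The following is only a skeleton under assumed names (customers for the top output element, rows and row for the first two input levels); the template bodies are omitted because they depend on the grouping settings.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- The generated version and licence comment appears here. -->
  <xsl:output indent="yes"/>

  <!-- One named template per element of the output description. -->
  <xsl:template name="customers">
    <xsl:param name="pos"/>
    <xsl:param name="eof"/>
    <xsl:param name="oldgroup"/>
    <!-- Body omitted: it depends on the grouping settings of this element. -->
  </xsl:template>

  <!-- Entry point for the root input element: start at row 1,
       stop after the last row. -->
  <xsl:template match="rows">
    <xsl:call-template name="customers">
      <xsl:with-param name="pos" select="1"/>
      <xsl:with-param name="eof" select="count(row)"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template match="/">
    <xsl:apply-templates select="rows"/>
  </xsl:template>
</xsl:stylesheet>
```

When the top output element has grouping switched on without a grouping field, the call inside the root template is additionally wrapped in a literal element of that name.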

Third, apply the generated stylesheet to your flat XML input data; your data is transformed as you described.

It is easy to wrap these steps in a shell script.
One limitation: your input data must be structured into exactly three nesting levels (rows, row, field in my example). However, the names of the levels can be specified in the input description file.
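
To make the three-level requirement concrete, input of the following shape would qualify. The field names and values are made up for illustration; only the nesting depth matters.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Level 1: root element; level 2: one element per record; level 3: the fields. -->
<rows>
  <row>
    <field1>Smith</field1>
    <field2>apples</field2>
  </row>
  <row>
    <field1>Smith</field1>
    <field2>pears</field2>
  </row>
  <row>
    <field1>Jones</field1>
    <field2>plums</field2>
  </row>
</rows>
```

A file that nests further elements inside a field, or that puts fields directly under the root, would fall outside this shape and would presumably need flattening or wrapping first.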

For more information and a sample see http://www.msdd.net/fx2sx.


