xsl-list
Re: [xsl] Word Highlighting
by Mike Brown
Nov 5 2002 10:36PM
Paul Terray wrote:
>  >Which processor are you using? Entities will not generate separate text 
>  >nodes in the data model, i.e. a text node never has an immediately 
>  >following or preceding sibling that is a text node - see 
>  ><http://www.w3.org/TR/xpath#section-Text-Nodes>.
>  
>  MSXML 3.0 and 4.0 exhibit this behavior. Perhaps it is linked to my entity 
>  definition:
>  <!ENTITY eacute "&#38;#x00E9;">

Their XPath implementation is broken since it doesn't treat sibling text nodes
as if they were merged. IIRC, there's a normalize method you can call on the
document node to merge all the text nodes. Someone more familiar with MSXML
will have to comment. Note that a Google search for "msxml merge text nodes"
turned up
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk30/htm/xmmthnormalize.asp
pretty quickly. For MSXML questions, always check the docs at MS first! :)
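
To illustrate what a DOM normalize call does, here is a minimal sketch. It uses
Python's xml.dom.minidom rather than MSXML, so it only shows the general DOM
behavior, not MSXML's own API. The element name and the text pieces are made up
for the example.

```python
from xml.dom import minidom

# Build a tiny document and append three adjacent text nodes by hand,
# mimicking a tree in which one run of text ended up split into pieces.
doc = minidom.parseString("<p/>")
p = doc.documentElement
for piece in ("caf", "\u00e9", " au lait"):
    p.appendChild(doc.createTextNode(piece))

before = len(p.childNodes)   # 3 sibling text nodes

# normalize() merges adjacent text nodes throughout the subtree.
doc.normalize()

after = len(p.childNodes)    # 1 merged text node
merged = p.firstChild.data   # "caf\u00e9 au lait"
```

After normalization, an XPath expression such as text() sees a single text node
for the whole run, which is what the XPath data model expects.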

Your entity definition is not the cause of the problem, but it will cause
problems of its own. It just says that "&eacute;" in your document should be
replaced with the string "&#x00E9;" (8 characters). That may be what you want
in the serialized output, but entities only apply to input. To get something
close to what you want in the output, you should define eacute as being the
single character "&#xE9;" and then let the serializer part of the XSLT
processor take care of emitting the right reference automatically. <xsl:output
method="xml" encoding="us-ascii"/>  will help in this regard; Unicode character
E9 can't be represented in ASCII, so it will be serialized as something like
"&#233;", most likely. And make sure you aren't capturing the output in a
16-bit String object, or it'll be UTF-16, regardless of what encoding you
asked for in xsl:output.
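
A rough sketch of the same idea outside XSLT, assuming Python's
xml.etree.ElementTree as a stand-in for the parser and serializer (it is not an
XSLT processor, and the doc element is invented for the example): eacute is
declared as the single character, and the ASCII serializer emits the numeric
reference by itself.

```python
import xml.etree.ElementTree as ET

# eacute is declared as the single character U+00E9 in the internal subset.
SOURCE = """<!DOCTYPE doc [
  <!ENTITY eacute "&#xE9;">
]>
<doc>caf&eacute;</doc>"""

root = ET.fromstring(SOURCE)
text = root.text  # the parsed tree holds the real character: "caf\u00e9"

# Serializing as US-ASCII forces a character reference for U+00E9,
# much as encoding="us-ascii" on xsl:output would.
ascii_bytes = ET.tostring(root, encoding="us-ascii")

# Asking for a string instead of bytes bypasses the requested byte encoding,
# which is the trap mentioned above about capturing output in a String object.
as_string = ET.tostring(root, encoding="unicode")
```

Here ascii_bytes contains the reference &#233; while as_string still contains
the raw character, so whichever code writes that string out decides the final
encoding.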

   - Mike
____________________________________________________________________________
  mike j. brown                   |  xml/xslt: http://skew.org/xml/
  denver/boulder, colorado, usa   |  resume: http://skew.org/~mike/resume/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list