apache-soap-dev
[jira] Commented: (SOAP-159) Axis mis-encodes Strings w/ invalid characters for SOAP transport
by Jeremy Kleier
Jul 14 2006 8:55AM
    [ http://issues.apache.org/jira/browse/SOAP-159?page=comments#action_12421139 ]
 
            
Jeremy Kleier commented on SOAP-159:
------------------------------------

I have a related issue with this piece of code:
            case '\r' : strBuf.append("&#xd;");
                        break;

I don't believe escaping the carriage return is the proper thing to do here. Carriage
returns *are* valid XML chars, and all that we should be doing in this piece of code is
cleaning the message to be valid XML.
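
As a minimal sketch of the point above, the following check follows the Char production of XML 1.0 (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The class name XmlChars and the method name isXmlChar are hypothetical and are not part of Apache SOAP or Axis.

```java
// Hypothetical helper, not part of Apache SOAP or Axis.
final class XmlChars {

    // Returns true if the code point is allowed by the XML 1.0 Char production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    public static void main(String[] args) {
        System.out.println(isXmlChar('\r')); // true: carriage return is a legal XML char
        System.out.println(isXmlChar(0));    // false: the null character is not
    }
}
```

The check takes an int code point rather than a char so that the supplementary range can be expressed; whether the encoder should handle surrogate pairs is a separate question from the one raised here.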



Jeremy Kleier
jhereg9333@[...].com

>  Axis mis-encodes Strings w/ invalid characters for SOAP transport
>  -----------------------------------------------------------------
> 
>                  Key: SOAP-159
>                  URL: http://issues.apache.org/jira/browse/SOAP-159
>              Project: SOAP
>           Issue Type: Bug
>           Components: All
>     Affects Versions: 2.2
>          Environment: Operating System: Windows XP
>  Platform: PC
>             Reporter: Ryan Choi
>          Assigned To: Matthew J. Duftler
> 
>  Axis doesn't seem to be properly XML-encoding string values in SOAP 
>  requests/responses. More specifically, org.apache.axis.utils.XmlUtils isn't 
>  stripping out invalid characters before sending them across the wire. An 
>  example of such an invalid string is:
>  [string of non-ASCII and control characters, garbled beyond recovery in this archive]
>  In this case, there is a definite null character, which is not legal XML, being 
>  sent over the wire. An Axis client receiving this response chokes in parsing 
>  the XML.
>  It looks like the problem may be in org.apache.axis.utils.XmlUtils. The 
>  xmlEncodeString() method only encodes the string if either '&', '"', '\'', '<' 
>  or '>' are found. If none are found, it just returns the original string (even 
>  if it has OTHER invalid characters) and writes it as-is.
>  I've included the XmlUtils.xmlEncodeString() method below, as well as a 
>  suggested fix for it.
>  I'm using the following:
>  Implementation-Title: Apache Axis
>  Implementation-Version: 1.1 1021 June 13 2003
>  Implementation-Vendor: Apache Web Services
>  Java: JDK 1.4.1_02
>  OS: Windows XP Professional Version 2002 SP1
>  CPU: Intel Xeon 3.06GHz, 1.00 GB RAM
>  Any help/suggestions/recommendations would be helpful. Thanks!
>  Ryan Choi
>  rchoi@[...].com
>  ----------------------------------------------
>  Original from XmlUtils:
>      public static String xmlEncodeString(String orig)
>      {
>          if (orig == null)
>          {
>              return "";
>          }
>          char[] chars = orig.toCharArray();
>          // if the string doesn't have any of the magic characters, leave
>          // it alone.
>          boolean needsEncoding = false;
>          search:
>          for(int i = 0; i < chars.length; i++) {
>              switch(chars[i]) {
>              case '&': case '"': case '\'': case '<': case '>':
>                  needsEncoding = true;
>                  break search;
>              }
>          }
>          if (!needsEncoding) return orig;
>          StringBuffer strBuf = new StringBuffer();
>          for (int i = 0; i < chars.length; i++)
>          {
>              switch (chars[i])
>              {
>              case '&'  : strBuf.append("&amp;");
>                          break;
>              case '\"' : strBuf.append("&quot;");
>                          break;
>              case '\'' : strBuf.append("&apos;");
>                          break;
>              case '<'  : strBuf.append("&lt;");
>                          break;
>              case '\r' : strBuf.append("&#xd;");
>                          break;
>              case '>'  : strBuf.append("&gt;");
>                          break;
>              default   : 
>                  if (((int)chars[i]) > 127) {
>                          strBuf.append("&#");
>                          strBuf.append((int)chars[i]);
>                          strBuf.append(";");
>                  } else {
>                          strBuf.append(chars[i]);
>                  }
>              }
>          }
>          return strBuf.toString();
>      }
>  Suggested fix for XmlUtils:
>      public static String xmlEncodeString(String orig)
>      {
>          if (orig == null)
>          {
>              return "";
>          }
>          char[] chars = orig.toCharArray();
>          StringBuffer strBuf = new StringBuffer();
>          for (int i = 0; i < chars.length; i++)
>          {
>              char c = chars[i];
>              switch (c)
>              {
>              case '&'  : strBuf.append("&amp;");
>                          break;
>              case '\"' : strBuf.append("&quot;");
>                          break;
>              case '\'' : strBuf.append("&apos;");
>                          break;
>              case '<'  : strBuf.append("&lt;");
>                          break;
>              case '>'  : strBuf.append("&gt;");
>                          break;
>              case '\n' : // Line Feed is OK
>              case '\r' : // Carriage Return is OK
>              case '\t' : // Tab is OK
>                          // These characters are specifically OK, as exceptions to
>                          // the general rule below:
>                          strBuf.append(c);
>                          break;
>              default   :
>                  if (((c >= 0x20) && (c <= 0xD7FF)) ||
>                      ((c >= 0xE000) && (c <= 0xFFFD))) {
>                      strBuf.append(c);
>                  }
>                  // For chars outside these ranges (such as control chars),
>                  // do nothing; it's not legal XML to print these chars,
>                  // even escaped
>              }
>          }
>          return strBuf.toString();
>      }
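
To illustrate the report quoted above, here is a small self-contained sketch. needsEncoding mirrors the early-exit scan in the quoted original, and stripInvalid applies the same character ranges as the suggested fix. The class name EncodeDemo, both method names, and the sample string are hypothetical and chosen only for illustration.

```java
// Hypothetical demo class; names are illustrative, not from XmlUtils.
final class EncodeDemo {

    // Mirrors the early-exit scan in the quoted original: only the five
    // markup characters trigger encoding, so control chars go unnoticed.
    static boolean needsEncoding(String s) {
        for (int i = 0; i < s.length(); i++) {
            switch (s.charAt(i)) {
                case '&': case '"': case '\'': case '<': case '>':
                    return true;
                default:
                    break;
            }
        }
        return false;
    }

    // Drops every char outside the ranges used by the suggested fix.
    static String stripInvalid(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\n' || c == '\r' || c == '\t'
                    || (c >= 0x20 && c <= 0xD7FF)
                    || (c >= 0xE000 && c <= 0xFFFD)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String bad = "2002\u0000N2";              // assumed sample with an embedded null
        System.out.println(needsEncoding(bad));   // false: the original would return it as-is
        System.out.println(stripInvalid(bad));    // 2002N2
    }
}
```

Because the scan reports false for a string whose only problem is a control character, the original method returns it unchanged, which matches the behavior described in the report.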

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



