FW: UTF-8 BOM (was RE: [soapbuilders] Follow-up UTF-8 test)
Apr 2 2001 10:39PM
We found that we had a problem with this in the Apache implementation when
we used a Reader, rather than a raw InputStream, to create the InputSource
for the XML parser. The Reader apparently got confused by the BOM, and ate
some of it but not all, so the parser couldn't cope with what was left. When we switched to
sending the InputStream directly to the parser (in our case, Xerces), all
was well. Just an FYI, in case this might have something to do with the
problem you're seeing.
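
For anyone who wants to try the same change, here is a minimal sketch of the "give the parser the raw InputStream" approach using plain JAXP. This is not the actual Apache SOAP code; the class and method names (RawStreamParseSketch, parse, sampleWithBom) are made up for illustration, and it assumes whatever parser JAXP picks up does its own encoding detection on byte streams.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative, not from Apache SOAP.
class RawStreamParseSketch {

    // Hand the raw bytes to the parser so it can detect the encoding itself
    // (BOM first, then the XML declaration), instead of decoding the bytes
    // into characters beforehand with a Reader.
    static Document parse(InputStream in) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(in));
    }

    // Builds a small UTF-8 document preceded by the three BOM bytes EF BB BF.
    static byte[] sampleWithBom() throws Exception {
        byte[] body =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><greeting>hi</greeting>"
                .getBytes("UTF-8");
        byte[] out = new byte[body.length + 3];
        out[0] = (byte) 0xEF;
        out[1] = (byte) 0xBB;
        out[2] = (byte) 0xBF;
        System.arraycopy(body, 0, out, 3, body.length);
        return out;
    }
}
```

Calling RawStreamParseSketch.parse(new ByteArrayInputStream(RawStreamParseSketch.sampleWithBom())) is a quick way to check whether a given parser accepts the BOM when it sees the bytes directly.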
--Glen
-----Original Message-----
From: Michael Brennan [mailto:michael_brennan@[...]..]
Sent: Monday, April 02, 2001 6:07 PM
To: 'soapbuilders@yahoogroups.com'
Subject: UTF-8 BOM (was RE: [soapbuilders] Follow-up UTF-8 test)
Thanks for the reference. I had overlooked that (and had thought that UTF-8 never
includes a BOM).
Interestingly, the XML parser Sun ships with JAXP chokes on this. Now there
is one more thing to test for conformance: XML parsers.
Looks like this one is a bug in Sun's parser. :-(
I wonder how many other XML parsers have problems with this.
-----Original Message-----
From: Fredrik Lundh [mailto:fredrik@[...]..]
Sent: Saturday, March 31, 2001 1:04 AM
To: soapbuilders@[...].com
Subject: Re: [soapbuilders] Follow-up UTF-8 test
michael wrote:
> However, I am still seeing one odd problem: the returned message seemed to
> have some garbage bytes preceding the XML prolog. It appears to be 3 bytes
> whose hex values are: EFBBBF.
>
> I've seen this same sequence of bytes when I save a file in UTF-8 format
> using Notepad.
>
> Any idea what's happening here?
it's a unicode BOM (byte order mark). it's not necessary for
UTF-8, but your parser shouldn't choke on it.
more info here:
http://www.unicode.org/unicode/faq/utf_bom.html
also see appendix F of the XML spec.
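
if a parser does choke on the BOM, one possible workaround is to strip the three bytes (EF BB BF) before the parser sees the stream. a rough sketch, assuming the input is already known to be UTF-8; the class and method names (Utf8BomSketch, skipBom) are invented for this example:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical helper; not part of any parser or SOAP toolkit.
class Utf8BomSketch {

    // Returns a stream positioned just after a leading UTF-8 BOM if one is
    // present; otherwise the returned stream yields the original bytes.
    static InputStream skipBom(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // read() may return fewer bytes than requested, so loop up to 3.
        while (n < 3) {
            int r = pb.read(head, n, 3 - n);
            if (r < 0) {
                break; // end of stream before 3 bytes were available
            }
            n += r;
        }
        boolean bom = n == 3
            && (head[0] & 0xFF) == 0xEF
            && (head[1] & 0xFF) == 0xBB
            && (head[2] & 0xFF) == 0xBF;
        if (!bom && n > 0) {
            pb.unread(head, 0, n); // not a BOM: put the bytes back
        }
        return pb;
    }
}
```

the returned stream can then be handed to the parser as usual. this only covers the UTF-8 case; it does not try to recognize the UTF-16 byte order marks.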
Cheers /F