parse_balanced_chunk with document context
by Nicolas Mendoza
Jul 10 2007 11:52PM
Hi,
I have just started using XML::LibXML 1.63 and have run into a problem
when parsing chunks of XML that contain HTML entities.
There is no problem when parsing an entire document where I can include
definitions of various (X)HTML entities that XML itself doesn't define, but once I
use the parse_balanced_chunk function that makes a document fragment out
of a balanced chunk of XML, then I can't tell it to use a set of
definitions from a document it could belong to.
I solved this partly by hacking into the C parts of LibXML allowing for an
additional optional parameter that could be a context document (as the
libxml2 function that is used by parse_balanced_chunk allows this).
However, I have little experience making C modules for perl, and I might
be doing something wrong, as when I later use the resulting DOM fragment
with $dom_frag->parentNode()->replaceChild($dom_frag,$inc_ele) it
segfaults while reconciling things. I suspect a) the code is not tested in
the case of a DOM fragment having a context document, or b) the pointer to
the document might be incorrectly altered, or simply wrong, inside the C code.
The diff for my changes is currently: http://utilitybase.com/paste/4826
Example code using this: http://utilitybase.com/paste/4827
However, I was told that it might not be possible to hack in this
functionality at all, so if it isn't possible, does anyone have any hints
on how I could parse a string of XML containing (X)HTML entities so that I
can insert it into an XML document containing the right definitions?
(My temporary solution is to convert the (X)HTML entities to numeric
character references right before using parse_balanced_chunk. I _could_
also wrap the code in CDATA sections, but I don't really like either
solution.)
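
For reference, the temporary workaround could look roughly like the sketch
below. It is only an illustration, not the code I actually run: the
%ENTITY_CODEPOINT table and the named_to_numeric function are hypothetical
names, and the table is deliberately tiny (a real one would need the full
(X)HTML entity set). The five entities that XML predefines are left alone,
and unknown names pass through unchanged.

```perl
use strict;
use warnings;

# Hypothetical, deliberately small entity table (assumption: a real one
# would cover every entity declared by the (X)HTML DTDs).
my %ENTITY_CODEPOINT = (
    nbsp   => 160,
    copy   => 169,
    eacute => 233,
    mdash  => 8212,
);

# The five entities predefined by XML must stay as they are.
my %XML_PREDEFINED = map { $_ => 1 } qw(amp lt gt quot apos);

# Replace known named entities with numeric character references.
# Unknown names are left untouched, so the parser still reports them.
sub named_to_numeric {
    my ($chunk) = @_;
    $chunk =~ s{&([A-Za-z][A-Za-z0-9]*);}{
        my $name = $1;
        (!$XML_PREDEFINED{$name} && exists $ENTITY_CODEPOINT{$name})
            ? sprintf('&#%d;', $ENTITY_CODEPOINT{$name})
            : "&$name;"
    }ge;
    return $chunk;
}

my $chunk = '<p>caf&eacute;&nbsp;&amp;&nbsp;bar &copy; 2007</p>';
my $safe  = named_to_numeric($chunk);
print "$safe\n";
# The converted string is what would then be handed to
# parse_balanced_chunk.
```

One caveat with this sketch: the substitution is purely textual, so it
would also rewrite entity-like text inside CDATA sections and comments.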
Apologies upfront if I failed to meet any criteria required when posting
here. Feel free to enlighten me on how to do so correctly if that is the
case.
--
Thanks,
Nicolas Mendoza
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
_______________________________________________
Perl-XML mailing list
Perl-XML@[...].com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs