Re: A whitespace issue in XML::LibXML
by Richard E. Rathmann
Jul 20 2007 4:04PM
Birgit Kellner wrote:
> Vaclav Barta schrieb:
>
> > On Friday 20 July 2007 20:27, Birgit Kellner wrote:
> >
> >
> >> Petr Pajas schrieb:
> >>
> >>
> >>> First of all, why do you do that "by hand"? To get all text nodes
> >>> from a subtree nicely concatenated, you can use e.g.
> >>>
> >>> $text = $node->findvalue('string(.)')
> >>>
> >>>
> >> I should have been more specific on that. I'm not interested in all text
> >> nodes, but in text node children of the element <seg>, and in text node
> >>
> >>
> > seg/text()
> >
> >
> >
> >> children of the element <span> that can be contained within <seg>. <seg>
> >>
> >>
> > seg/span/text()
> >
> > XPath may not be very perlish, but it's quite useful...
> >
> > Bye
> > Vasek
> >
> >
> >
> Yes, but that wouldn't get me the proper sequence, no?
>
> Consider:
>
> <seg>This is <span>a cold breeze</span><note>Oh, and here's some text
> which deals with a completely different subject-matter, say, bunny
> rabbits on a balcony.</note> on an unbearably hot summer evening.</seg>
>
Assuming that $seg contains a reference to the "seg" node in the example
above:
    my $seg_text = '';
    for my $node ($seg->findnodes('text() | span'))
    {
        $seg_text .= $node->findvalue('string(.)');
    }
will produce "This is a cold breeze on an unbearably hot summer
evening." Is this what you are trying to achieve? If not, I hope it at
least steers you in the right direction.
FYI, the 'text() | span' XPath expression above selects just the text
node children and the "span" element children of the "seg" element,
skipping the "note" element children (or any other child nodes, for that
matter). The "|" is XPath's union operator, not a boolean "or": it
combines the node sets selected by the two relative XPaths "text()" and
"span" into one, and findnodes() returns those nodes in document order,
which is the sequence you want. Then, as Petr suggested,
findvalue('string(.)') returns the string value of each node, i.e. the
concatenated text of the node itself and all of its descendants.
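
In case it helps to try this outside your own script, here is a minimal
self-contained sketch. The sample document is a shortened version of
your example, and the names $xml and $doc are just mine for
illustration; I am assuming the input is parsed from a string with
parse_string(), so adapt that part to however you already load your file.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened version of the sample from the thread (assumed input).
my $xml = '<seg>This is <span>a cold breeze</span>'
        . '<note>Some text about bunny rabbits on a balcony.</note>'
        . ' on an unbearably hot summer evening.</seg>';

my $doc = XML::LibXML->new->parse_string($xml);
my $seg = $doc->documentElement;

# Union of text-node children and <span> children of <seg>;
# <note> children are not selected, so their text is skipped.
my $seg_text = '';
for my $node ($seg->findnodes('text() | span'))
{
    # string(.) gives the text of the node plus all its descendants.
    $seg_text .= $node->findvalue('string(.)');
}

print "$seg_text\n";
```

Note that the whitespace in the result comes straight from the text
nodes ("This is " ends with a space, " on an ..." starts with one), so
nothing has to be added or trimmed by hand for this input.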
HTH,
Richard
_______________________________________________
Perl-XML mailing list
Perl-XML@[...].com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs