RE: PerlSax to parse/search large (~350 MB) file
by Sterin, Ilya
Jul 11 2001 1:50PM
OK, now we understand. The only other problem is that you are thinking in
terms of lines, which XML does not use: an element can sit on one line or
span twenty, and it will parse the same way. When you set up your handlers,
start keeping track of things at the <conDef> start element and keep
appending until you reach the </conDef> end element.
When you encounter the <code> content, check it. If it does not match, set
a flag so that the string collected for that <conDef> is discarded later. If
it matches, set a flag, and when you finally reach the </conDef> end element,
abort parsing and use the string. Remember that if you want the full string,
elements included, you will have to append the tags as well as the content
to the string. I haven't used PerlSax before, but this is easily
accomplished with XML::Parser.
The above should give you an overview of how to approach this.
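
The approach above could be sketched roughly as follows with XML::Parser.
This is only a minimal sketch, not a tested solution for the real file: the
helper name find_condef is made up for illustration, it assumes <conDef>
records are not nested and that each one carries its <code> as a child
element, and it stops at the first match by dying inside a handler and
catching that in an eval.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: return the first <conDef>...</conDef> record whose
# <code> content equals $wanted, or undef if no record matches.
# $source may be an XML string or an open filehandle.
sub find_condef {
    my ($source, $wanted) = @_;

    my ($record, $code, $found);
    my ($in_record, $in_code, $matched) = (0, 0, 0);

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $element) = @_;
                if ($element eq 'conDef') {
                    # New record: reset the buffer and the match flag.
                    ($record, $matched, $in_record) = ('', 0, 1);
                }
                return unless $in_record;
                ($code, $in_code) = ('', 1) if $element eq 'code';
                # original_string is the raw markup that triggered this event.
                $record .= $expat->original_string;
            },
            Char => sub {
                my ($expat, $text) = @_;
                return unless $in_record;
                $record .= $expat->original_string;
                $code   .= $text if $in_code;
            },
            End => sub {
                my ($expat, $element) = @_;
                return unless $in_record;
                $record .= $expat->original_string;
                if ($element eq 'code') {
                    $in_code = 0;
                    $matched = 1 if $code eq $wanted;
                }
                elsif ($element eq 'conDef') {
                    $in_record = 0;
                    if ($matched) {
                        $found = $record;
                        die "FOUND\n";    # abort parsing; caught below
                    }
                    # No match: the buffer is simply overwritten by the
                    # next <conDef> start element.
                }
            },
        },
    );

    # Re-throw anything that is not our own "stop now" signal.
    eval { $parser->parse($source); 1 }
        or do { die $@ unless $@ =~ /^FOUND/ };
    return $found;
}
```

For a large file you would pass an open filehandle rather than a string, so
the document is read as a stream instead of being loaded whole. To write the
matching record to one file and every other record to a second file, the End
handler could print $record to one of two output handles instead of dying.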
Ilya
-----Original Message-----
From: Corey Smith (s)
To: Sterin, Ilya; 'perl-xml@listserv.ActiveState.com'
Sent: 7/11/01 7:01 AM
Subject: RE: PerlSax to parse/search large (~350 MB) file
Let me try this again. Here's a sample line from the xml file I'm working
with:
<conDef> <name>Influenza</name><code>C12345</code><id>637</id>..............</conDef>
I would like to search the file for the content of the <code> tag. Once the
code is located, the entire line (everything from <conDef> to </conDef>)
containing that code will be output. Because the file is large,
speed/efficiency is important.
Thanks for the response.
Corey
> -----Original Message-----
> From: Sterin, Ilya [SMTP:Isterin@[...].com]
> Sent: Tuesday, July 10, 2001 11:10 PM
> To: Corey Smith (s); 'perl-xml@listserv.ActiveState.com'
> Subject: RE: PerlSax to parse/search large (~350 MB) file
>
> I'm a little confused as to what you are trying to do. Give us a better
> example, unless someone here can understand your problem. Are you looking
> for a specific tag <...> or the content of a tag? Once you find it, are you
> asking how you can extract the content?
>
> Ilya
>
> > -----Original Message-----
> > From: perl-xml-admin@[...].com
> > [mailto:perl-xml-admin@[...].com]On Behalf Of Corey Smith (s)
> > Sent: Tuesday, July 10, 2001 6:02 PM
> > To: 'perl-xml@listserv.ActiveState.com'
> > Subject: PerlSax to parse/search large (~350 MB) file
> >
> >
> > The task:
> > Search a large xml file for an identifier contained in an element.
> > Having located the line associated with the desired identifier,
> > output line from source file to file. Output all other lines to another
> > file.
> >
> > The problem:
> > Once there is a match on the identifier, how can I identify the line
> > from the input file so that I can output it to a file?
> >
> > Any help would be greatly appreciated. Thanks.
> >
> >
> >
> >
> > _______________________________________________
> > Perl-XML mailing list
> > Perl-XML@[...].com
> > http://listserv.ActiveState.com/mailman/listinfo/perl-xml