perl-xml
Re: RDF::Notation3::PerlSAX2, anyone?
by Bjoern Hoehrmann
Sep 27 2001 8:56PM
* Alberto Reggiori wrote:
> Bjoern Hoehrmann wrote:
> > 
> > Hi,
> > 
> >    Anyone going to write a Notation3 PerlSAX2 parser?
> 
> yes, I planned to build a N3 and N-triples [1] parser in RDFStore but I
> am not done yet; I just hacked an rdf2n3p.pl util using SiRPAC for the
> moment. Any help/contribution is appreciated :)

N-Triples, as currently defined (please see my comments on
www-rdf-comments about the current draft), are quite easy to parse:
  
  #!perl -w
  use strict;
  use warnings;

  # ...
  
  while (<HANDLE>) {
    chomp;              # remove trailing newline
    s/^[\x20\x09]+//;   # remove leading white space
    s/[\x20\x09]+$//;   # remove trailing white space
    next if /^#/;       # skip comments
    next unless /\S/;   # skip empty lines
  
    # syntax checks
    if (/[^\x20-\x7e\x0d\x0a\x09]/)
    {
      # invalid character(s) found
    }
  
    unless (s/\.$//)
    {
      # syntax error: missing trailing full stop
    }
  
    # parse subject
    if (s/^<([^> ]*)>[\x20\x09]+//)
    {
      # uriref
    }
    elsif (s/^_:([A-Za-z][A-Za-z0-9]*)[\x20\x09]+//)
    {
      # bNode
    }
    else
    {
      # syntax error in <subject> token
    }
  
    # parse predicate
    if (s/^<([^> ]*)>[\x20\x09]+//)
    {
      # uriref
    }
    else
    {
      # syntax error in predicate
    }
  
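    # parse object; white space before the full stop is optional,
    # so trailing white space is not required here
    if (s/^<([^> ]*)>[\x20\x09]*//)
    {
      # uriref
    }
    elsif (s/^_:([A-Za-z][A-Za-z0-9]*)[\x20\x09]*//)
    {
      # bNode
    }
    elsif (s/^"((?:[^"\\]|\\.)*)"[\x20\x09]*//)
    {
      # literal (anchored at the start; backslash escapes such as \"
      # are allowed inside the string and are still escaped in $1)
    }
    else
    {
      # syntax error in <object> token
    }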
  
    if (length)
    {
      # trash found after <object> token
    }
  }
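
To show how the empty branches above might be filled in, here is a minimal
sketch that wraps the same checks in a function. The name
parse_ntriples_line and the [type, value] return shape are my own
assumptions for illustration, not part of any existing module. It reports
errors with die and returns nothing for comments and blank lines.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one N-Triples line.
# Returns an array ref of three [type, value] pairs, where type is
# 'uri', 'bnode' or 'literal'. Returns nothing for comments and blank
# lines, and dies with a short message on a syntax error.
sub parse_ntriples_line {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^[\x20\x09]+//;              # remove leading white space
    $line =~ s/[\x20\x09]+$//;              # remove trailing white space
    return if $line =~ /^#/ or $line !~ /\S/;

    die "invalid character\n" if $line =~ /[^\x20-\x7e\x09]/;
    die "missing full stop\n" unless $line =~ s/\.$//;

    my @terms;
    for my $pos (qw(subject predicate object)) {
        # white space is required after subject and predicate,
        # but optional between the object and the full stop
        my $ws = $pos eq 'object' ? qr/[\x20\x09]*/ : qr/[\x20\x09]+/;

        if ($line =~ s/^<([^>\x20]*)>$ws//) {
            push @terms, [uri => $1];
        }
        elsif ($pos ne 'predicate'
               and $line =~ s/^_:([A-Za-z][A-Za-z0-9]*)$ws//) {
            push @terms, [bnode => $1];
        }
        elsif ($pos eq 'object'
               and $line =~ s/^"((?:[^"\\]|\\.)*)"$ws//) {
            push @terms, [literal => $1];   # value is still escaped
        }
        else {
            die "syntax error in $pos\n";
        }
    }

    die "trash found after object\n" if length $line;
    return \@terms;
}
```

A caller would loop over the file handle, wrap each call in eval, and
skip lines for which nothing is returned.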

You still have to unescape the escapes, but in general you are done with
this. Not very beautiful, but it works.
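
For the unescaping step, something like the following should do. It is a
sketch assuming the escapes N-Triples defines for strings (\\, \", \n, \r,
\t, \uXXXX and \UXXXXXXXX). The names unescape_ntriples and _decode_escape
are made up for this example, and it accepts lower-case hex digits, which
may be more lenient than the draft.

```perl
use strict;
use warnings;

# Single-character escapes and the characters they stand for.
my %simple = (
    't'  => "\t",
    'n'  => "\n",
    'r'  => "\r",
    '"'  => '"',
    '\\' => '\\',
);

# Hypothetical helper: turn one matched escape into its character.
sub _decode_escape {
    my ($u4, $u8, $char) = @_;
    return chr(hex($u4)) if defined $u4;        # \uXXXX
    return chr(hex($u8)) if defined $u8;        # \UXXXXXXXX
    return $simple{$char} if exists $simple{$char};
    die "unknown escape sequence \\$char\n";
}

# Hypothetical helper: unescape the contents of an N-Triples string
# (the text between the quotes, without the quotes themselves).
sub unescape_ntriples {
    my ($s) = @_;
    $s =~ s/\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.))/_decode_escape($1, $2, $3)/ge;
    return $s;
}
```

Handling every escape in one left-to-right pass matters: running separate
substitutions for each escape would misread the input \\n (an escaped
backslash followed by the letter n) as a newline.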

> [1] http://www.w3.org/2001/sw/RDFCore/ntriples/

This document is AFAICT obsoleted by the RDF Test Cases WD.
-- 
Björn Höhrmann { mailto:bjoern@[...].de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/