Re: Rule Design Benchmark/Resource Question
by Rocky Olsen
Mar 31 2005 3:57PM
Thanks
On Thu, Mar 31, 2005 at 05:16:25PM -0500, Matt Kettler wrote:
> Rocky Olsen wrote:
>
> >Before I pull my hair out doing bench/resource tests, I was wondering if
> >anyone out there knew whether there is much of a speed/resource usage
> >difference between the following two ways of writing the same rule.
> >
> >
> >Method A:
> >body rule_a /(?:feh|meh|bleh)/i
> >
> >vs.
> >
> >Method B:
> >
> >body __rule_a /(?:feh)/i
> >body __rule_b /(?:meh)/i
> >body __rule_c /(?:bleh)/i
> >
> >meta rule_d (__rule_a || __rule_b || __rule_c)
> >
> >
> >There probably isn't much difference using just 3 rules, but I'm thinking
> >more along the lines of large (500+) lists, and it isn't limited to just body
> >stuff. So if anyone has some real-world benchmarks/experience with what is
> >preferred, or if the developers know which is faster for SA, I would love
> >the input.
> >
> >
>
> To start with, use perl's regex debugger as your friend:
>
> $ perl -Mre=debug -e "/(?:feh|meh|bleh)/i"
> size 11 Got 92 bytes for offset annotations.
>
> $ perl -Mre=debug -e "/(?:feh)/i"
> Freeing REx: `","'
> Compiling REx `(?:feh)'
> size 3 Got 28 bytes for offset annotations.
>
> (repeat 2 times)
>
> However, this only covers part of the story: the cost of the regex
> itself. It does not cover the per-rule overhead in SA.
>
> In general I'd favor the combined approach, unless for some reason your
> combined rule is considerably larger than the sum of its parts. Bigevil
> ran much better once Chris S did some combining and common subexpression
> elimination.
>
> Also, I'd suggest eliminating the (?:) for the single-text-matches. It
> does nothing of use, and doesn't change the evaluation of the regex any
> for a simple single text match. All it does is waste 4 bytes of disk
> space per rule.
>
> body __RULE_A /feh/i
>
> instead of:
> body __RULE_A /(?:feh)/i
>
> I leave comparing the two using re=debug as an exercise for the student.
> Also compare to /(feh)/i and /(feh)\1/i to see how capturing and backreferences work.
>
>
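
For anyone who wants to time the two approaches from the quoted message rather than guess, here is a rough sketch of the comparison. It is not from the thread and rests on loud assumptions: it is written in Python, so the numbers reflect Python's `re` engine and say nothing certain about Perl's engine or SpamAssassin's per-rule overhead. The names `match_combined`, `match_separate`, and `bench`, and the sample message body, are all made up for illustration.

```python
import re
import timeit

WORDS = ["feh", "meh", "bleh"]  # the three alternatives from the thread

# Method A: one combined alternation.
combined = re.compile("(?:" + "|".join(WORDS) + ")", re.IGNORECASE)

# Method B: one pattern per word, OR-ed together like the meta rule.
separate = [re.compile(w, re.IGNORECASE) for w in WORDS]


def match_combined(text):
    """True if the single combined pattern hits anywhere in text."""
    return combined.search(text) is not None


def match_separate(text):
    """True if any of the per-word patterns hits anywhere in text."""
    return any(p.search(text) for p in separate)


def bench(text, number=2000):
    """Return (seconds for method A, seconds for method B)."""
    t_a = timeit.timeit(lambda: match_combined(text), number=number)
    t_b = timeit.timeit(lambda: match_separate(text), number=number)
    return t_a, t_b


if __name__ == "__main__":
    # Hypothetical message body: filler text with one hit at the very end.
    body = "lorem ipsum dolor sit amet " * 200 + "BLEH"
    t_a, t_b = bench(body)
    print(f"combined: {t_a:.4f}s  separate: {t_b:.4f}s")
```

To get closer to the 500+ case, grow `WORDS` and feed it real message bodies. Timings vary by machine and input, so treat any result as a hint and rerun the same comparison in Perl before changing real rules.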
--
______________________________________________________________________
what's with today, today?
Email: rocky@[...].org
PGP: http://rocky.mindphone.org/rocky_mindphone.org.gpg