Advance warning of tweaks to Encode API.
by Nick Ing-Simmons
Jan 30 2002 6:02PM
Nick Ing-Simmons <nick@[...].net> writes:
> > You can use t/table.euc under Jcode module for instance. table.utf8
> > in my code example is just a utf8 version thereof. That's a data which
> > contains all characters defined in EUC (well, actually JISX0212 is not
> > included but very few environments can display JISX0212).
>
> It is really great to have some valid data!
>
> For a start it has found a bug in :encoding layer - knew there must be some...
> (I think I have rediscovered the multi-byte char spanning buffer boundary
> bug ... which I could not reproduce before)
That is it - :encoding needs some serious re-work for any encoding
which will whinge about partial characters (8-bit never does, and 16-bit
is unlikely to with even-length buffers, but multi-byte encodings can).
But since layers are much more stable now it can be recoded in a
better manner anyway.
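
To make the buffer-boundary problem concrete, here is a minimal sketch of the
usual carry-over technique. It assumes the Encode module as later released
(its decode() function and the Encode::FB_QUIET constant), not the API as it
stood when this was posted. The helper name decode_in_chunks and the chunk
size are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode a byte string chunk by chunk, carrying any
# trailing partial character over into the next chunk.
sub decode_in_chunks {
    my ($encoding, $octets, $chunk_size) = @_;
    my ($pending, $string) = ('', '');
    for (my $pos = 0; $pos < length $octets; $pos += $chunk_size) {
        $pending .= substr($octets, $pos, $chunk_size);
        # FB_QUIET stops at the first sequence it cannot decode and
        # leaves the unprocessed octets in $pending.
        $string .= decode($encoding, $pending, Encode::FB_QUIET);
    }
    # Anything still in $pending was never decodable (truncated or bad input).
    return ($string, $pending);
}

# Three 3-byte UTF-8 characters followed by ASCII, split every 4 octets
# so that characters straddle the chunk boundaries.
my $text   = "\x{65E5}\x{672C}\x{8A9E} text";
my $octets = encode('UTF-8', $text);
my ($decoded, $leftover) = decode_in_chunks('UTF-8', $octets, 4);
```

Note that this sketch cannot tell a partial character from a genuinely
malformed one: bad octets in mid-stream would simply pile up in $pending,
which is exactly the "fail" versus "pause" distinction discussed next.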
To do that it needs to know why encode/decode stopped - did they "fail"
or just "pause"? So the ->decode and ->encode methods are going to get tweaked
as hinted at in the existing pod.
I am currently leaning towards allowing "check" to be a reference,
something like:
    $uni = $enc->decode($octets);        # best attempt + replacement chars
    $uni = $enc->decode($octets,0);      # croak on error ?
    $uni = $enc->decode($octets,1);      # stop on error
    $uni = $enc->decode($octets,\$err);  # stop on error, reason code in $err
    $uni = $enc->decode($octets,\&foo);  # Call foo on error - protocol TBD
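
One way to picture the dispatch on the kind of "check" argument is the sketch
below. The wrapper decode_checked and its reason string are hypothetical. The
mapping of 0 and 1 onto Encode::FB_CROAK and Encode::FB_QUIET is an assumption
based on the comments above, and those constants (like the code-reference form
of check) come from Encode as later released, not from the API being proposed
here.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of Encode): picks a behaviour from the
# kind of "check" argument, following the five cases listed above.
sub decode_checked {
    my ($name, $octets, @check) = @_;

    # No check argument: best attempt, malformed input becomes U+FFFD.
    return Encode::decode($name, $octets, Encode::FB_DEFAULT) unless @check;

    my $check = $check[0];

    # Code reference: handed straight to Encode, which calls it with the
    # ordinal of each offending octet and uses whatever it returns.
    return Encode::decode($name, $octets, $check) if ref $check eq 'CODE';

    # Scalar reference: stop on error and report a reason through it.
    if (ref $check eq 'SCALAR') {
        my $uni = Encode::decode($name, $octets, Encode::FB_QUIET);
        # FB_QUIET leaves the unconsumed octets in $octets.
        $$check = length($octets)
            ? 'stopped with ' . length($octets) . ' octet(s) left'
            : '';
        return $uni;
    }

    # Plain numbers: 0 croaks, anything else stops quietly (assumed mapping).
    return Encode::decode($name, $octets,
        $check ? Encode::FB_QUIET : Encode::FB_CROAK);
}

my $bad = "ab\x80cd";   # 0x80 can never start a UTF-8 sequence
my $err;
my $replaced = decode_checked('UTF-8', $bad);
my $stopped  = decode_checked('UTF-8', $bad, 1);
my $reported = decode_checked('UTF-8', $bad, \$err);
my $called   = decode_checked('UTF-8', $bad, sub { sprintf '<%02X>', shift });
my $croaked  = !defined eval { decode_checked('UTF-8', $bad, 0); 1 };
```

A single string reason is only a stand-in here; a real reason code would need
to distinguish at least "no representation" from "partial character".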
I need to think through a sane set of "numeric" check options, perhaps
a "mask" of which errors are croak/replace/stop/ignored?
I think you can deduce something from the return value as well,
e.g. if it returns a +ve length but does not consume the whole string,
then that is the result so far. To find out why, call it again:
 - undef means no representation
 - defined but zero length means partial char
 - +ve length means we had run out of room
   (does not occur at perl level as SV can grow...)
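
Spelled out as code, the "call it again" rule might look like the sketch
below. The function why_stopped is purely hypothetical and only classifies the
value a second call would return under the convention just described. It does
not reflect how any released Encode behaves.

```perl
use strict;
use warnings;

# Hypothetical helper: interprets the result of the second call, made on
# the input that the first call left unconsumed.
sub why_stopped {
    my ($retry) = @_;
    return 'no representation' unless defined $retry;   # undef
    return 'partial character' if length($retry) == 0;  # defined, zero length
    return 'out of room';                                # +ve length
}
```

The caller still has to notice that the first call did not consume the whole
string before asking why.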
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/