Re: String API
by Leopold Toetsch
Aug 22 2003 8:01AM
Benjamin Goldberg <ben.goldberg@[...].com> wrote:
> Leopold Toetsch wrote:
> > I have problems imagining such kinds of STRINGs.
> You lack sufficient imagination -- Larry's suggested that Perl6 strings
> may consist of a list of chunks. I can easily imagine each of those
> "chunks" being full-fledged STRING* objects.
Did Larry speak of PerlString or STRING?
> A foolish question: can you imagine strings which are lazily read from a
> file?
Sure.
> ... If we could have str->strstart as a pointer to a
> vector of STRING*s, we wouldn't need any PMC to contain the chunks. And
> the str->encoding api is (already) sufficient for doing the work. The
> only lack is a custom mark, to keep the sub-strings alive.
So you have everything that a string *PMC* has: a list of chunks
(hanging off some pointer), custom mark, one or 2 vtables (encoding
stuff) ...
> If we have it in a PerlString derived class, and do not make it part of
> STRING*, then we cannot pass such strings to C functions defined to
> accept strings in STRING* parameters,
Such C functions must be aware of the string API anyway; they can't
assume they get a char * something, they have to call the iterator
interface.
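To illustrate the point about the iterator interface, here is a minimal sketch of a C consumer that never sees a char *. All names here (str_iter, chunk_state, chunk_next, count_chars) are hypothetical and made up for this example; they are not Parrot's actual string API. The backing store is assumed to be a list of NUL-terminated ASCII chunks, purely to keep the sketch short.

```c
#include <stddef.h>

/* Hypothetical iterator; names are illustrative, not Parrot's API. */
typedef struct str_iter {
    /* Returns the next character code, or -1 at end of string. */
    long (*next)(struct str_iter *self);
    void *state;                /* representation-specific data */
} str_iter;

/* A consumer that only uses the iterator, never the raw bytes. */
static size_t count_chars(str_iter *it, long wanted)
{
    size_t n = 0;
    long c;
    while ((c = it->next(it)) != -1) {
        if (c == wanted)
            n++;
    }
    return n;
}

/* One assumed backing representation: a list of NUL-terminated
   ASCII chunks (an assumption of this sketch only). */
typedef struct chunk_state {
    const char **chunks;
    size_t nchunks;
    size_t cur;                 /* index of the current chunk */
    size_t off;                 /* offset inside the current chunk */
} chunk_state;

static long chunk_next(str_iter *self)
{
    chunk_state *s = (chunk_state *)self->state;
    while (s->cur < s->nchunks) {
        char c = s->chunks[s->cur][s->off];
        if (c != '\0') {
            s->off++;
            return (unsigned char)c;
        }
        s->cur++;               /* chunk exhausted, move to the next one */
        s->off = 0;
    }
    return -1;
}
```

The consumer works the same way whether the iterator walks one flat buffer or a chunk list, which is why the representation can stay hidden behind the interface.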
> Well, except that when a PerlInt loses magic going to an INTVAL, the
> resulting integer generally takes *less* memory than it did as a PMC,
> whereas losing magic by changing from a PMC to a STRING could very
> easily result in using *more* memory. (And doing lots of work, which we
> wouldn't need if our string kept its magic).
That's right. But your (or Larry's) proposed list of chunks with a custom
mark is effectively a PMC; whether you call it STRING or not doesn't matter.
It's a string PMC with a special vtable. The chunk list contains STRING*
buffers. That's it.
> my str $slurp = File.new($filename).slurp(); # =
> File.slurp($filename)?
> Sure, we could have this read in the whole file, but wouldn't it be
> nicer if it would *lazily* fill in $slurp?
Isn't there a big fat warning in $doc to avoid this kind of code?
Anyway, either the string iterator calls the file iterator to get the
string, or the above code is as illegal as tie()ing an "int".
> > Do you really want to slow down all string access, just for one very
> > special corner case?
> I don't believe that it *would* slow down all string access.
2 more indirections for the chunk buffer: it's variable-sized, so it's a
buffer header + buffer memory. And we are creating new strings all over
the place, which already hurts a lot now.
> For the current string code, we already take O(n) to get a void* pointer
> into an appropriate part of a utf8 string, for each character-index.
Dan said we don't do operations on strings in such encodings. OTOH,
if the chunks all have a character count, we can quickly locate a
certain position inside such strings.
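A rough sketch of that lookup, assuming each chunk caches its own character count. The chunk struct and locate_chunk are hypothetical names for illustration only, not the real STRING layout:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk: a byte buffer plus a cached character count.
   Illustrative only; this is not Parrot's real STRING layout. */
typedef struct chunk {
    const char *bytes;          /* encoded data, e.g. utf8 */
    size_t      nbytes;         /* length in bytes */
    size_t      nchars;         /* length in characters, cached */
} chunk;

/* Find the chunk holding character index pos.
   Returns the chunk index and stores the character offset inside that
   chunk in *local. Returns -1 if pos is past the end of the string. */
static long locate_chunk(const chunk *chunks, size_t nchunks,
                         size_t pos, size_t *local)
{
    size_t i;
    assert(chunks != NULL && local != NULL);
    for (i = 0; i < nchunks; i++) {
        if (pos < chunks[i].nchars) {
            *local = pos;
            return (long)i;
        }
        /* Skip the whole chunk without decoding any of its bytes. */
        pos -= chunks[i].nchars;
    }
    return -1;
}
```

Only the chunk that finally holds the position needs to be scanned character by character, so the cost is one step per skipped chunk plus a scan bounded by the chunk size, rather than a scan of the whole string.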
leo