Re: Lexical variables, scratchpads, closures, ...
by Jerome Vouillon other posts by this author
Aug 4 2002 4:31PM messages near this date
Re: Lexical variables, scratchpads, closures, ...
|
Re: Lexical variables, scratchpads, closures, ...
On Fri, Aug 02, 2002 at 09:06:31PM +0100, Nicholas Clark wrote:
> On Fri, Aug 02, 2002 at 06:43:49PM +0200, Jerome Vouillon wrote:
> > Allocating a hash will certainly remain a lot more expensive than
> > allocating an array. And we are going to allocate pads pretty
> > often...
>
> Are we? Or are we just going to copy them a lot at runtime?
> [but actually allocate the beasts only at compile time, or when someone
> mucks around with them using string eval or %MY]
When we enter a block we need to:
- create a new pad
- copy the current pad array and add the new pad to the copy
- allocate a PMC for each lexical variable introduced in the block and
insert the PMCs in the new pad
The creation of the pad could be done either by copying a template
pad. This would not make any difference compared to allocating from
scratch for an array, but this must be faster for a hash. Still, I
think using an array will remain faster.
> > I really hope %MY and caller.MY will not be able to extend the lexical
> > scope: we would get a really huge speed penalty for a feature which
> > would hardly ever be used.
>
> I think we need to be aware of this, but I don't think it has to be that
> painful. If all scopes have a pad pointer, but lexical-free scopes start
> with it null, then there won't be a really huge speed penalty.
> [it might actually be faster, as this way all pads *are* equal, so there
> doesn't need to be special case code distinguishing between scopes with
> lexicals and scopes without.]
I think this make a large difference for variable access. Here is a
comparison.
Consider the following example:
$x = 1;
my $y = 1;
sub foo () {
bar ();
print "$x $y\n";
}
We consider the two possibilities:
* The lexical scopes cannot be extended
We can implement the lexical scopes with an array of arrays. For this
example, we don't need to allocate a pad for the subroutine foo, as it
has no lexical variable. So, in subroutine foo, the layout of the
lexical scopes is the following :
outer
array pad
+--+ +--+
| -+---> | -+--> $y PMC
+--+ +--+
So, accessing the $y variable require 5 memory reads (5 assembly
instructions on a i386 machine):
- one read to get the address of the outer array from a parrot register
- two reads to get the address of the pad
- two reads to get the address of the PMC.
This could be further improved:
- the address of the outer array could probably be placed in a machine
register, which would save one memory read
- we may be able to save one memory read per array look-up if we can
avoid wrapping them in a PMC.
This would get us down to 2 memory reads.
On architectures which are not register starved (about anything but
the i386 architecture), it may be worthwhile to store some of the pad
addresses in machine register: this would save yet another memory
read.
I believe accessing a global variable such as $y can be made hardly
more costly: just one ore two more memory reads. In particular, we
can avoid a hash look-up.
* The lexical scopes can be extended
I think we need to implement pads as hashes, not arrays, in this case.
Also, we need a pad even for blocks that does not introduce any
lexical variable, because a variable may be inserted latter. So, for
the example, the layout of the lexical scopes looks like:
Pads
(hashes)
+--+ +--+
| -+---> | | (empty pad for subroutine foo)
+--+ +--+
| -+-
+--+ \ +--+
-> | -+--> $y PMC
+--+
Now, if we want to access $y, we first need to perform a look-up in
the first pad, usually empty, but which may contain a lexical variable
$y if bar is defined as:
sub bar () {
caller.MY{$y} = 2;
}
Then, if this fails, we perform a look-up in the outer scope.
Most of the time, lexical variables will be defined in one of the
first pads, so I think we can expect the cost of accessing a lexical
variable to be around one or two hash lookup. This is already a lot
more costly than a few memory reads.
For global variables ($x and &print in this example), I think this is
worse: we must look into all the hashes, just in case they become
shadowed by a lexical variable, for instance if bar is defined as:
sub bar () {
caller.MY{$x} = 2;
}
or as:
sub bar () {
caller.OUTER.MY{$x} = 2;
}
-- Jerome
Thread:
Jerome Vouillon
Melvin Smith
Jerome Vouillon
Sean O'Rourke
Jerome Vouillon
Dan Sugalski
Dan Sugalski
Simon Cozens
Nicholas Clark
Sean O'Rourke
Jerome Vouillon
Nicholas Clark
Jerome Vouillon
Jonathan Sillito
Dave Mitchell
Melvin Smith
Sean O'Rourke
Dan Sugalski
Dan Sugalski
Jonathan Sillito
Jerome Vouillon
Melvin Smith
Jerome Vouillon
Melvin Smith
Jerome Vouillon
Jonathan Sillito
Sean O'Rourke
|