tcl-core
Re: [TCLCORE] Variable access (was Re: [Tcl9-cloverfield] Parser)
by Neil Madden
May 8 2008 3:32PM

Hi Frédéric,

I'm not entirely sure of the context of this discussion, or whether  
it is supposed to apply to Tcl itself, or just to Cloverfield (with  
which I have only a passing familiarity). Therefore, please take my  
comments with the appropriate grain of salt -- I may be missing the  
point entirely!

On 7 May 2008, at 14:04, Frédéric Bonnet wrote:
>  ...
>  Variable access semantics.
>  ==========================
> 
>  Variables can be accessed either by value or by reference. Regarding
>  the latter, there are two main kinds of references:
> 
>  - Weak references, e.g. access-by-name.
>  - Strong references, such as pointers.

I'm not sure whether I understand the distinction being made here. In  
any system of reference there are 3 parts: (i) the reference (name,  
pointer), (ii) the thing referred to, and (iii) some means of mapping  
from (i)s to (ii)s. For instance, in a hashtable the three parts are:  
(i) the hashtable keys, (ii) the objects contained in the hashtable  
entries, (iii) the hashtable itself. Pointers are not special in this  
regard, where the three parts are (i) the pointer (an integer,  
basically), (ii) some object at some location in memory, (iii) the  
(virtual) memory system which is conceptually like a very large  
array. So from this point of view there is nothing strong or weak  
about names vs pointers -- a pointer is just one kind of name in one  
kind of naming system. The problems are basically the same for each  
kind of system: e.g. having references that don't correspond to a  
valid object (invalid pointer, non-existent variable, etc.), or  
looking up a valid reference but in the wrong context (hashtable,  
namespace, etc). One difference is that the "map" from pointers to  
objects (i.e. the memory system) is usually fixed and global:  
dereferencing a pointer doesn't usually depend on context, whereas a  
string variable name obviously does. I'm not sure if this is the  
distinction you are making between strong and weak references  
however. To me, the distinction seems better applied to referencing  
systems as a whole. For instance, some statically-typed languages  
have a very strong notion of references: as with pointers, the thing  
referred to is fixed and not dependent on context, but further it is  
usually impossible to manufacture a reference independently of the  
reference system (the type system prevents this), and it is possible  
to accurately track what refers to what in the system and manage  
lifecycles effectively. String names as references are obviously  
"weaker" in this sense: it is possible to manufacture strings at  
will, and perform all sorts of manipulations on them, which makes it  
very hard to track what refers to what, or e.g. if there are still  
some references left to some object.
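
To make the context-dependence concrete, here is a minimal sketch in Tcl. The namespaces `a` and `b` and the helper `lookup` are made up purely for illustration; the point is only that the same string resolves to different variables depending on where it is looked up, and that a name can be assembled from arbitrary string pieces.

```tcl
# Two contexts, each with its own variable called "x".
namespace eval ::a { variable x "from a" }
namespace eval ::b { variable x "from b" }

# Hypothetical helper: dereference a name in a given namespace.
proc lookup {ns name} {
    return [namespace eval $ns [list set $name]]
}

set resultA [lookup ::a x]   ;# "from a"
set resultB [lookup ::b x]   ;# "from b"

# A name can also be manufactured by ordinary string operations,
# which is what makes it hard to track what refers to what.
set name "::a::"
append name [string index "xyz" 0]
set resultC [set $name]      ;# reads the same variable as ::a::x
```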

[...]
>  Strong references are bound to the internal value held by the variable
>  at the time it is defined, and may be shared and passed around contexts.
>  So a referenced value remains valid as long as a strong reference points
>  to it. When the last one disappears the underlying value is garbage
>  collected. Variables can be seen as named strong references.

OK, just to be clear: a variable (Var structure in terms of  
implementation) holds a "strong" reference to a value (Tcl_Obj). That  
variable is then named in one (or more) contexts/namespaces at the  
script level -- i.e., there may exist several "weak" references to  
the variable. Thus there are two levels of reference here: the  
Var->Tcl_Obj level and the Name (string)->Var level.
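
A small script-level sketch of the two levels, as I understand them (the proc name `twoLevels` is just for illustration): [upvar 0] adds a second name for the same variable, whereas a plain [set] from `$a` creates a separate variable that merely starts out holding the same value.

```tcl
proc twoLevels {} {
    set a "hello"
    upvar 0 a b      ;# Name->Var level: a and b now name the same variable
    set c $a         ;# Var->value level: c is a separate variable with the same value
    set b "world"    ;# write through one name...
    return [list $a $b $c]   ;# ...and a sees it, but c does not
}

set levels [twoLevels]   ;# world world hello
```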

>  Tcl
>  ===
> 
>  At the script level, Tcl only provides weak references using variable
>  names. Commands such as [set], [lappend] or [incr] access variable by
>  name, and may create the variable if it doesn't exist (this is a recent
>  feature of [incr]). Moreover, [global] and [upvar] can access variables
>  by name from other contexts.

Clarification: [set] etc may create *a* variable if the name given  
doesn't map to an existing variable in the current context (the  
intended variable may exist in a different context).
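
For example (a sketch with made-up names `setLocal` and `setGlobal`), the name "counter" below maps to a global variable in one proc only because of the explicit [global]; in the other proc, [set] quietly creates a fresh local variable instead.

```tcl
set ::counter 10          ;# the "intended" variable, in the global context

proc setLocal {} {
    # No [global] here: "counter" maps to nothing in this context,
    # so [set] creates a new local variable.
    set counter 1
    return $counter
}

proc setGlobal {} {
    global counter        ;# link the local name to the global variable
    incr counter
    return $counter
}

set r1 [setLocal]         ;# 1
set afterLocal $::counter ;# still 10: the global was not touched
set r2 [setGlobal]        ;# 11
```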

> 
>  Tcl provides no way to create strong references at the script level,
>  however it uses strong references internally using refcounted Tcl_Obj
>  structures. Exposing strong references at the script level involves
>  hacks with object types that are prone to failure because of shimmering.

Yes. "Everything is a string" is fundamentally incompatible with the  
idea of strong references, as far as I can tell (i.e. it is  
impossible to create a properly abstracted system of reference using  
strings as references). Tcl_Obj internal rep hacks are not just prone  
to failure, but also break EIAS.

>  What Cloverfield needs
>  ======================
> 
>  Cloverfield needs both kinds of references. The original
>  Tridekalogue introduces $& as a syntax for strong references. But it
>  must also keep existing Tcl semantics regarding variable names, as they
>  are an essential part of its philosophy. Variable names have no
>  alternative when using introspection or designing mini-languages,
>  a field where Tcl shines.

I need some clarification about what $& notation introduces exactly.  
As mentioned earlier, there are two levels to Tcl (and presumably  
Cloverfield) variable references: the "strong" reference of a Var to  
the value it refers to, and the "weak" references of any string names  
that refer to that var in various contexts. From this, my expectation  
would be that $&foo syntax is a way of denoting the var structure  
rather than the value that it refers to. So e.g. while [foo $bar]  
means 'call foo passing the value contained in the variable referred  
to by the name "bar" in the current context', the new syntax [foo  
$&bar] means something like 'call foo passing the variable referred to  
by the name "bar" in the current context' -- where "passing" a  
variable would presumably mean linking the first parameter of foo to  
the same variable that bar refers to, i.e. that:

     proc foo v { ... }
     foo $&bar

is roughly equivalent to:

     proc foo vName { upvar 1 $vName v; ... }
     foo bar

Is that correct?
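
To be concrete about the second form, here is a filled-in sketch of the existing [upvar] idiom in today's Tcl (the names `bump`, `caller` and `counter` are made up for illustration; this says nothing about how $& would actually be implemented):

```tcl
proc bump {vName} {
    upvar 1 $vName v      ;# link local v to the caller's variable
    incr v
}

proc caller {} {
    set counter 41
    bump counter          ;# pass the name, not the value
    return $counter
}

set bumped [caller]       ;# 42
```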
>  [...]
>  Proposals
>  =========
> 
>  Andy Goth proposed to allow references to non-existing variables. This
>  has the effect of delaying the resolution of the variable until the
>  first access. This means that they are weak references. Alternatively,
>  Andy also proposed a new syntax for weak references using the @ prefix.

What then is the difference between these weak references and just  
names? Are they looked up in a different context/namespace?

>  I proposed to keep references strong, and extend grouping rules of {}
>  and () so that variable syntax doesn't clash with other rules. But this
>  raises other questions, notably on the order of substitution, so it
>  doesn't solve the whole issue.
> 
>  Alexandre Ferrieux raised a good point about implementation
>  considerations, namely: "explicit reference syntax is a good thing ...
>  for the bytecode (or whatever) compiler". Indeed, access-by-name is a
>  performance killer in the sense that the compiler has no clue whether a
>  given argument can be used as a variable name by the called command. So
>  this requires ugly hardwiring of commands such as [set] into the
>  bytecode.

For this particular application, the best solution then would seem to  
be having the syntax be part of the declaration of the proc itself so  
that the byte-code compiler knows the intention by just looking at  
that rather than looking at usage. i.e.:

     proc foo {&bar ..} { ... }

I believe there is code on the wiki that implements this as sugar for  
[upvar]. If you adopted this convention in Cloverfield instead of  
explicit upvar (or perhaps in addition to), then that would seem to  
solve the compiler optimisation issue, wouldn't it?
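
A rough sketch of what such sugar might look like in plain Tcl follows. This is not the wiki code, and `refproc` is a hypothetical name of my own: it rewrites each parameter written as &name into an ordinary parameter plus an [upvar 1] line prepended to the body. It assumes simple alphanumeric parameter names.

```tcl
# Hypothetical helper: like [proc], but parameters written as &name
# are linked to the caller's variable of the given name via [upvar 1].
proc refproc {name params body} {
    set plain {}
    set prelude ""
    foreach p $params {
        if {[string index $p 0] eq "&"} {
            set var [string range $p 1 end]
            lappend plain __ref_$var
            # Generates e.g.: upvar 1 $__ref_total total
            append prelude "upvar 1 \$__ref_$var $var\n"
        } else {
            lappend plain $p
        }
    }
    uplevel 1 [list proc $name $plain $prelude$body]
}

refproc addTo {&total amount} {
    incr total $amount
}

proc demoRef {} {
    set sum 10
    addTo sum 5           ;# the caller still passes a name
    return $sum
}

set refResult [demoRef]   ;# 15
```

Since the & marker is in the declaration, a compiler could in principle see from the proc definition alone which arguments are variable names, which is the property I am suggesting matters here.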

[...]

To summarise, my point is basically that Tcl already has "strong"  
references in the form of the Var structures that underlie variables,  
and it has script-accessible means of manipulating these and linking  
them via [set], [upvar] etc. I don't therefore think that adding a  
new notion of reference at the script level would clarify things, but  
rather just complicate them. Some things I think could be done:

1.) Provide some syntactic support for linking variables such as that  
outlined above, which would perhaps make a common idiom (upvar 1 ..)  
slightly easier to grasp and use, and possibly allow for further  
optimisations.
2.) Generalise the notion of what can be referred to by a Var to that  
of a general "resource" -- i.e., not just Tcl_Objs, but also "opaque"  
entities like commands, channels, objects, etc. This would allow for  
finer control over the lifetime/scope of these resources (e.g.  
proc-local commands or channels), and could be extended to encompass  
things like general reference-counting of resources and perhaps even  
a generalised serialisation framework (i.e. $foo means "attempt to  
serialise the resource referred to by the var named foo").

Anyway, I hope some of that is of interest or help to you. I'm not  
sure if this is particularly on-topic for TCLCORE, but I'm not a member  
of the Cloverfield list -- feel free to follow up to my personal  
email if you want to discuss anything further.

-- Neil