tcl-core
[TCLCORE] Variable access (was Re: [Tcl9-cloverfield] Parser)
by fbonnet
May 7 2008 6:02AM
[Given the importance of this discussion, I'm crossposting on TCLCORE so 
that people outside of the original discussion may express their 
preferences]

There are some very good points on all sides of this discussion, so I'll
try to summarize it a bit.


Variable access semantics.
==========================

Variables can be accessed either by value or by reference. Regarding the
latter, there are two main kinds of references:

- Weak references, e.g. access-by-name.
- Strong references, such as pointers.

Variable substitution using the $ syntax requires that the variable
exists at the time of substitution. Weak and strong references, however,
differ on when the variable must exist.

Weak references are not tied to an internal value but to its externally
accessible representation, i.e. its name. The actual variable may not
exist when the reference is created, and it may be changed or deleted at
any moment. Variable resolution only occurs on access.

Strong references are bound to the internal value held by the variable
at the time it is defined, and may be shared and passed around contexts. 
So a referenced value remains valid as long as a strong reference points 
to it. When the last one disappears the underlying value is garbage 
collected. Variables can be seen as named strong references.

The same distinction exists between hard and symbolic links on 
file systems.


Tcl
===

At the script level, Tcl only provides weak references using variable 
names. Commands such as [set], [lappend] or [incr] access variables by 
name, and may create the variable if it doesn't exist (this is a recent 
feature of [incr], added in Tcl 8.5). Moreover, [global] and [upvar] can 
access variables by name from other contexts.
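
To make the access-by-name behavior concrete, here is a minimal sketch in
current Tcl (8.5 or later). The variable names and the [addOne] proc are
made up for the illustration:

```tcl
unset -nocomplain counter items

# A name is just a string; the variable need not exist yet.
set ref counter
set existedBefore [info exists counter]

# [incr] (Tcl 8.5+) and [lappend] create the variable on first access.
incr $ref
lappend items a b

# Resolution happens on each access: after unset and re-creation,
# the same name designates the new variable.
unset counter
set counter 10
set viaName [set $ref]

# [upvar] resolves a name in another context (here, the caller's).
proc addOne {varName} {
    upvar 1 $varName v
    incr v
}
addOne counter
```

Nothing ties $ref to a particular variable: it only holds the string
"counter", which is looked up again on every access.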

Tcl provides no way to create strong references at the script level.
However, it uses strong references internally, in the form of refcounted
Tcl_Obj structures. Exposing strong references at the script level
involves hacks with object types that are prone to failure because of
shimmering.


What Cloverfield needs
======================

Cloverfield needs both kinds of references. The original
Tridekalogue introduces $& as a syntax for strong references. But it
must also keep existing Tcl semantics regarding variable names, as they
are an essential part of its philosophy. Variable names have no
alternative when using introspection or designing mini-languages,
a field where Tcl shines.

Cloverfield obviously already allows weak references in the form of
variable names, since any word can hold such a name. However the new
indexing syntax sometimes clashes with the other rules, especially
regarding word boundaries. The $ notation of course takes precedence
over the other rules, but it involves either substitution or strong
reference, which both require the variable to exist. Consequently, it
can't be used with commands using declarative syntax, such as [set]. So
using complex variable name patterns may involve quoting, which breaks
the visual consistency of the language.


Proposals
=========

Andy Goth proposed allowing references to non-existing variables. This 
has the effect of delaying the resolution of the variable until the 
first access, which makes them weak references. Alternatively, Andy 
also proposed a new syntax for weak references using the @ prefix.

I proposed to keep references strong, and extend grouping rules of {} 
and () so that variable syntax doesn't clash with other rules. But this 
raises other questions, notably on the order of substitution, so it 
doesn't solve the whole issue.

Alexandre Ferrieux raised a good point about implementation 
considerations, namely: "explicit reference syntax is a good thing ... 
for the bytecode (or whatever) compiler". Indeed, access-by-name is a 
performance killer in the sense that the compiler has no clue whether a 
given argument can be used as a variable name by the called command. So 
this requires ugly hardwiring of commands such as [set] into the bytecode.
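
A small sketch of the problem as I understand it, in current Tcl. The two
procs are hypothetical; the point is only that both call sites look the
same to a compiler:

```tcl
# Both procs are called with the literal word "x".
proc bump {varName} {
    # Treats its argument as a variable name in the caller's frame.
    upvar 1 $varName v
    incr v
}
proc shout {word} {
    # Treats its argument as a plain value.
    string toupper $word
}

set x 1
set r1 [bump x]    ;# modifies the caller's variable x
set r2 [shout x]   ;# only uses the string "x"
```

Only the body of the called command tells whether "x" designates a
variable, so the call site alone gives the compiler nothing to work with.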

It becomes clear that Cloverfield needs explicit syntax for both weak
and strong references, but must also allow the traditional
access-by-name semantics for a wide range of applications, such as
declarative syntax in mini languages or object systems. So explicit weak
reference syntax should not be compulsory.

Therefore, let me express the following proposal:

1. Commands may accept variable names, weak references or strong
references.

2. Access-by-name and weak references are semantically identical, and
are resolved in the current state of the application. IOW, a weak
reference isn't bound to a given variable, the latter may be destroyed
and recreated between accesses (typical case: local variables).

3. Strong references resolve in the creation context but persist even
if the variable is destroyed or recreated. In the latter case the
reference still points to the original value.

4. Strong references use the already existing syntax $&.

5. Weak references use the new syntax $@. I'm against using a bare @,
as it creates a new substitution rule.

6. Best practices should limit access-by-name to simple cases, e.g.
variable names without index part, in order to avoid clashes with other
syntax rules. However, an existing index part should be correctly
interpreted (this is the way Tcl handles array elements).
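
Points 2 and 6 describe what access-by-name already does in Tcl today. A
minimal sketch (the procs and variable names are invented for the
example; the $& and $@ syntaxes are not shown since they don't exist in
Tcl):

```tcl
# Hypothetical helper: reads whatever variable the name designates
# in the caller's frame at the moment of the call.
proc peek {varName} {
    upvar 1 $varName v
    return $v
}

# The same name "tmp" designates a different local variable each time;
# each one is destroyed when its proc returns.
proc first {} {
    set tmp one
    peek tmp
}
proc second {} {
    set tmp two
    peek tmp
}
set results [list [first] [second]]

# A name with an index part is interpreted as an array element.
unset -nocomplain cfg
set cfg(mode) fast
set elemName cfg(mode)
set viaElem [set $elemName]
```

A strong reference as described in point 3 would behave differently: it
would keep pointing at the value held by the first "tmp" even after that
local variable is gone.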


An alternative to #6 is to limit access-by-name to the base name of the
variable, which totally makes sense in a declarative context. In this
case, indexed access would require an explicit substitution or
reference. This has the added benefit of resolving all existing
ambiguities in Tcl regarding variable names and array elements (e.g. the
name1/name2 part in traces or in the API), as "arrays" in Cloverfield
are no longer collections of variables but keyed access semantics to a
single dictionary variable. However, code like "set a(b) c" would no 
longer be accepted, as it would designate the variable named "a(b)" and 
not the element b of dictionary variable a. Setting the element would 
instead require "set $@a(b) c". This is a problem since "set a(b) c" is 
a very idiomatic piece of code. So either the full variable access 
syntax must be accepted, or it may be split into name and index parts.
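
For reference, this is how current Tcl splits such a name into the
name1/name2 parts mentioned above. A short sketch; the [record] proc and
the ::log variable are made up for the illustration:

```tcl
unset -nocomplain a
set a(b) c                     ;# creates array a with element b
set isArray [array exists a]

# Variable traces receive the name split into two parts plus the operation.
set ::log {}
proc record {name1 name2 op} {
    lappend ::log [list $name1 $name2 $op]
}
trace add variable a write record
set a(b) d                     ;# the trace sees "b" as name2
```

Here "a(b)" is never a single variable name: [set] parses it into an
array name and an element key, which is the split the alternative above
would make explicit.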


What this means at the script level
===================================

References blur the existing name/value boundary that led to the 
duplication of whole sets of operations (for example lappend vs. 
concat), some using pass-by-name, others using pass-by-value:

% # Tcl
% set d [dict create 1 a 2 b]
% dict remove $d 1 ; # Pass-by-value.
2 b
% set d
1 a 2 b
% dict unset d 1 ; # Pass-by-name
2 b
% set d
2 b


% # Using Cloverfield references.
% set d [dict create 1 a 2 b]
% dict remove $d 1 ; # Pass-by-value.
2 b
% set d
1 a 2 b
% dict remove $&d 1 ; # Pass-by-reference.
2 b
% set d
2 b


Moreover, the distinction between weak and strong references means that 
the user is in control of variable initialization. For example, Tcl 
8.4's [incr] requires variables to exist before incrementing them. Tcl 
8.5 implicitly initializes non-existing variables to zero. With weak vs. 
strong references, this behavior can be more explicit. This can avoid 
silent bugs where variables are implicitly created when they shouldn't be.
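
The kind of silent bug meant here can be sketched in current Tcl (8.5 or
later). The [strictIncr] proc is a hypothetical workaround, not an
existing command, and only approximates what explicit reference syntax
could express directly:

```tcl
unset -nocomplain hits hist
set hits 5
incr hist          ;# typo for "hits": Tcl 8.5+ silently creates hist

# Hypothetical stricter variant: refuses to create the variable.
proc strictIncr {varName {amount 1}} {
    upvar 1 $varName v
    if {![info exists v]} {
        error "can't read \"$varName\": no such variable"
    }
    incr v $amount
}

unset hist
set failed [catch {strictIncr hist} msg]   ;# the typo is now reported
strictIncr hits
```

With the strict variant the misspelled name raises an error instead of
creating a new variable, which is the Tcl 8.4 behavior made opt-in.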
