php-dev
RE: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP
by Andi Gutmans
Jun 17 2008 11:02PM
Hi Christian,

This is a very nice piece of work. Definitely addresses a lot of the issues
we have raised in the past.
I would like to see such a solution make its way into PHP (see below re:
timing).

There are some things I'd like to consider:
1) I am not sure that the current semantics of the "lexical" keyword are
great in all cases. Is the reason why you don't allow by-value binding so
that we don't have to manage more than one lambda instance per declaration?
2) [minor curiosity - do we want to consider reusing "parent" instead of
"lexical"? I guess that could be confusing but it's not the first time we
reuse a keyword when it's clear that the usage is in two different places
(this is minor and I don't mind much either way although lexical doesn't
mean too much to me).]
3) I am concerned about binding to classes. First of all we need to look in
more detail at what the implications are for bytecode caches when changing
class entries at run-time. We may want to also consider an option where the
lambda binds to the object and only has public access although I realize
that may be considered by some as too limiting. We'll review these two
things in the coming days.

Re: timing, I think the biggest issue we have right now with PHP 5.3 is that
we are not making a clear cut on features. There's always pressure on
release managers to include more (I went through the same with 5.0) but at
some point you just have to stop at some place or things will never go out
as there are always good ideas flowing in. Unfortunately with 5.3 that cut
isn't happening and it seems to drag out longer than needed. I prefer having
this discussion in the context of a hard date for a beta release after which
we'll be especially strict with accepting new features. Each new feature
will drag out the beta/RC cycle as they need enough time for
testing/feedback/tweaks.

Andi	

>  -----Original Message-----
>  From: Christian Seiler [mailto:chris_se@[...].net]
>  Sent: Monday, June 16, 2008 10:39 AM
>  To: php-dev List
>  Subject: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP
>  
>  Hi,
>  
>  As a followup to the discussion in January, I'd like to post a revised
>  patch to this list that implements closures and anonymous functions in PHP.
>  
>  INTRODUCTION
>  ------------
>  
>  Closures and lambda functions can make programming much easier in
>  several ways:
>  
>    1. Lambda functions allow the quick definition of throw-away functions
>       that are not used elsewhere. Imagine for example a piece of code that
>       needs to call preg_replace_callback(). Currently, there are three
>       possibilities to achieve this:
>  
>        a. Define the callback function elsewhere. This distributes code that
>           belongs together throughout the file and decreases readability.
>  
>        b. Define the callback function in-place (but with a name). In that
>           case one has to use function_exists() to make sure the function is
>           only defined once. Example code:
>  
>            <?php
>               function replace_spaces ($text) {
>                 if (!function_exists ('replace_spaces_helper')) {
>                   function replace_spaces_helper ($matches) {
>                     return str_replace (' ', '&nbsp;', $matches[1]).' ';
>                   }
>                 }
>                 return preg_replace_callback ('/( +) /',
>                                               'replace_spaces_helper',
>                                               $text);
>               }
>            ?>
>  
>           Here, the additional if() around the function definition makes the
>           source code difficult to read.
>  
>        c. Use the present create_function() in order to create a function at
>           runtime. This approach has several disadvantages: First of all,
>           syntax highlighting does not work because a string is passed to the
>           function. It also compiles the function at run time and not at
>           compile time so opcode caches can't cache the function.
>  
>    2. Closures provide a very useful tool in order to make lambda functions
>       even more useful. Just imagine you want to replace 'hello' with
>       'goodbye' in all elements of an array. PHP provides the array_map()
>       function which accepts a callback. If you don't want to hard-code
>       'hello' and 'goodbye' into your source code, you have only four
>       choices:
>  
>        a. Use create_function(). But then you may only pass literal values
>           (strings, integers, floats) into the function, objects at best as
>           clones (if var_export() allows for it) and resources not at all.
>           And you have to worry about escaping everything correctly.
>           Especially when handling user input this can lead to all sorts of
>           security issues.
>  
>        b. Write a function that uses global variables. This is ugly,
>           non-reentrant and bad style.
>  
>        c. Create an entire class, instantiate it and pass the member function
>           as a callback. This is perhaps the cleanest solution for this
>           problem with current PHP but just think about it: Creating an
>           entire class for this extremely simple purpose and nothing else
>           seems overkill.
>  
>        d. Don't use array_map() but simply do it manually (foreach). In this
>           simple case it may not be that much of an issue (because one simply
>           wants to iterate over an array) but there are cases where doing
>           something manually that a function with a callback as parameter
>           does for you is quite tedious.
>  
>       [Yes, I know that str_replace also accepts arrays as a third parameter
>       so this example may be a bit useless. But imagine you want to do a more
>       complex operation than simple search and replace.]
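
For concreteness, a minimal sketch of what choice (c) above looks like without closures. This is not code from the original mail: the class name Replacer and the method name apply() are made up for illustration. It runs on PHP 5 and later.

```php
<?php
// Illustrative only: Replacer and apply() are hypothetical names.
class Replacer {
    private $search;
    private $replacement;

    public function __construct($search, $replacement) {
        $this->search = $search;
        $this->replacement = $replacement;
    }

    // The method that array_map() calls once per array element.
    public function apply($text) {
        return str_replace($this->search, $this->replacement, $text);
    }
}

// The values travel inside the object instead of being hard-coded.
$replacer = new Replacer('hello', 'goodbye');
$result = array_map(array($replacer, 'apply'), array('hello world', 'hi'));
echo implode(', ', $result), "\n"; // goodbye world, hi
```

A dozen lines of class scaffolding wrap a one-line operation, which is the overhead the proposal argues against.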

>  

>  PROPOSED PATCH
>  --------------
>  
>  I now propose a patch that implements compile-time lambda functions and
>  closures for PHP while keeping the patch as simple as possible. The patch is
>  based on a previous patch of mine which was based on ideas discussed here
>  at the end of December / start of January.
>  
>  Userland perspective
>  --------------------
>  
>  1. The patch adds the following syntax as a valid expression:
>  
>  function & (parameters) { body }
>  
>  (The & is optional and indicates - just as with normal functions - that the
>  anonymous function returns a reference instead of a value)
>  
>  Example usage:
>  
>  $lambda = function () { echo "Hello World!\n"; };
>  
>  The variable $lambda then contains a callable resource that may be called
>  through different means:
>  
>  $lambda ();
>  call_user_func ($lambda);
>  call_user_func_array ($lambda, array ());
>  
>  This allows for simple lambda functions, for example:
>  
>  function replace_spaces ($text) {
>     $replacement = function ($matches) {
>       return str_replace (' ', '&nbsp;', $matches[1]).' ';
>     };
>     return preg_replace_callback ('/( +) /', $replacement, $text);
>  }
>  
>  2. The patch implements closures by defining an additional keyword 'lexical'
>  that allows a lambda function (and *only* a lambda function) to import
>  a variable from the "parent scope" to the lambda function scope. Example:
>  
>  function replace_in_array ($search, $replacement, $array) {
>     $map = function ($text) {
>       lexical $search, $replacement;
>       if (strpos ($text, $search) > 50) {
>         return str_replace ($search, $replacement, $text);
>       } else {
>         return $text;
>       }
>     };
>     return array_map ($map, $array);
>  }
>  
>  The variables $search and $replacement are variables in the scope of the
>  function replace_in_array() and the lexical keyword imports these variables
>  into the scope of the closure. The variables are imported as a reference,
>  so any change in the closure will result in a change in the variable of the
>  function itself.
>  
>  3. If a closure is defined inside an object, the closure has full access
>  to the current object through $this (without the need to use 'lexical' to
>  import it separately) and all private and protected methods of that class.
>  This also applies to nested closures. Essentially, closures inside methods
>  are added as public methods to the class that contains the original method.
>  
>  4. Closures may live longer than the methods that declared them. It is
>  perfectly possible to have something like this:
>  
>  function getAdder($x) {
>     return function ($y) {
>       lexical $x;
>       return $x + $y;
>     };
>  }
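
The four points above can be tried out in a current PHP, with one important caveat: the 'lexical' keyword from this patch never shipped. The sketch below assumes the `use` clause that PHP 5.3 eventually released instead, which imports by value unless a variable is prefixed with &. Automatic access to $this inside closures arrived in PHP 5.4. The Counter class is a made-up illustration, and the strpos() check from the original example is dropped for brevity.

```php
<?php
// Sketch only: uses the released `use` clause, not the proposed `lexical`.

// Point 1: a throw-away lambda handed straight to a callback-taking function.
function replace_spaces($text) {
    $replacement = function ($matches) {
        // Turn the captured run of spaces into &nbsp; and keep one real space.
        return str_replace(' ', '&nbsp;', $matches[1]) . ' ';
    };
    return preg_replace_callback('/( +) /', $replacement, $text);
}

// Point 2: import variables from the enclosing function scope (by value here).
function replace_in_array($search, $replacement, array $array) {
    $map = function ($text) use ($search, $replacement) {
        return str_replace($search, $replacement, $text);
    };
    return array_map($map, $array);
}

// Point 3 (PHP 5.4 or later): a closure created in a method can use $this,
// including private members. Counter is a hypothetical example class.
class Counter {
    private $count = 0;

    public function incrementer() {
        return function () {
            return ++$this->count;
        };
    }
}

// Point 4: the closure outlives the call that created it.
function getAdder($x) {
    return function ($y) use ($x) {
        return $x + $y;
    };
}

$add5 = getAdder(5);
$counter = new Counter();
$inc = $counter->incrementer();
$inc();

echo replace_spaces("a   b"), "\n"; // a&nbsp;&nbsp; b
echo implode(', ', replace_in_array('hello', 'goodbye', array('hello world', 'hi'))), "\n"; // goodbye world, hi
echo $inc(), "\n";  // 2
echo $add5(3), "\n"; // 8
```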

>  

>  Zend internal perspective
>  -------------------------
>  
>  The patch basically changes the following in the Zend engine:
>  
>  When the compiler reaches a lambda function, it creates a unique name for
>  that function ("\0__compiled_lambda_FILENAME_N" where FILENAME is the name
>  of the file currently processed and N is a per-file counter). The use of the
>  filename in the function name ensures compatibility with opcode caches. The
>  lambda function is then immediately added to the function table (either the
>  global function table or that of the current class if declared inside a
>  class method). Instead of a normal ZEND_DECLARE_FUNCTION opcode the new
>  ZEND_DECLARE_LAMBDA_FUNC is used as an opcode at this point. The op_array
>  of the new function is initialized with is_lambda = 1 and is_closure = 0.
>  
>  When parsing a 'lexical' declaration inside an anonymous function the parser
>  saves the name of the variable that is to be imported in an array stored
>  as a member of the op_array structure (lexical_names).
>  
>  The opcode handler for ZEND_DECLARE_LAMBDA_FUNC does the following: First of
>  all it creates a new op_array and copies the entire memory structure of the
>  lambda function into it (the opcodes themselves are not copied since they
>  are only referenced in the op_array structure). Then it sets is_closure = 1
>  on the new op_array, and for each lexical variable name that the compiler
>  added to the original op_array it creates a reference to that variable from
>  the current scope into a HashTable member in the new op_array. It also saves
>  the current object pointer ($this) as a member of the op_array in order to
>  allow for the closure to access $this. Finally it registers the new op_array
>  as a resource and returns that resource.
>  
>  The opcode handler of the 'lexical' construct simply fetches the variable
>  from that HashTable and imports it into local scope of the inner function
>  (just like with 'global' only with a different hash table).
>  
>  Some hooks were added that allow the 'lambda function' resource to be
>  called. Also, there are several checks in place that make sure the lambda
>  function is not called directly, i.e. if someone explicitly tries to use the
>  internal function name instead of using the resource return value of the
>  declaration.
>  
>  The patch
>  ---------
>  
>  The patch is available here:
>  <http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-16-1.diff>
>  
>  Please note that I did NOT include the contents of zend_language_scanner.c
>  in the patch since that can easily be regenerated and just takes up enormous
>  amounts of space.
>  
>  The patch itself applies against the 5.3 branch of PHP.
>  
>  If I understand the discussion regarding PHP6 on this list correctly, some
>  people are currently undergoing the task of removing the unicode_semantics
>  switch and if (UG(unicode)). As soon as this task is finished I will also
>  provide a patch for CVS HEAD (it doesn't make much sense adapting the patch
>  now and then having to change it again completely afterwards).
>  
>  BC BREAKS
>  ---------
>  
>    * Introduction of a new keyword 'lexical'. Since it is very improbable
>      that someone should use it as a function, method, class or property
>      name, I think this is an acceptable break.
>  
>  Other than that, I can find no BC breaks in my patch.
>  
>  CAVEATS / POSSIBLE WTFS
>  -----------------------
>  
>    * When writing $func = function () { }; a semicolon is necessary. If left
>      out it will produce a compile error. Since any attempt to remove that
>      necessity would unnecessarily bloat the grammar, I suggest we simply
>      keep it the way it is. Also, Lukas Kahwe Smith pointed out that a single
>      trailing semicolon after a closing brace already exists:
>      do { } while ();
>  
>    * The fact that 'lexical' creates references may cause certain WTFs:
>  
>        for ($i = 0; $i < 10; $i++) {
>          $arr[$i] = function () { lexical $i; return $i; };
>        }
>  
>      This will not work as expected since $i is a reference and thus all
>      created closures would reference the same variable. In order to get this
>      right one has to do:
>  
>        for ($i = 0; $i < 10; $i++) {
>          $loopIndex = $i;
>          $arr[$i] = function () { lexical $loopIndex; return $loopIndex; };
>          unset ($loopIndex);
>        }
>  
>      This can be a WTF for people that don't expect lexical to create an
>      actual reference, especially since other languages such as JavaScript
>      don't do it. On the other hand, global and static both DO create
>      references so that behaviour is consistent with current PHP.
>  
>      But complex constructions such as this will probably not be used by
>      beginners so maintaining good documentation should solve this.
>  
>    * The fact that 'lexical' is needed at all may cause WTFs. Other languages
>      such as JavaScript implicitly have the entire scope visible to child
>      functions. But since PHP does the same thing with global variables, I
>      find a keyword like 'lexical' much more consistent than importing the
>      entire scope (and always importing the entire scope costs unnecessary
>      performance).
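
The by-reference caveat above, and the by-value binding asked about in the reply, can be compared side by side. This is a sketch under one assumption: it uses the `use` clause that PHP 5.3 later shipped (by value by default, by reference with &), since the 'lexical' keyword itself was never released. The array names are illustrative.

```php
<?php
// By-reference import: every closure shares the single loop variable,
// which is the behaviour the 'lexical' proposal describes.
$byRef = array();
for ($i = 0; $i < 3; $i++) {
    $byRef[$i] = function () use (&$i) { return $i; };
}

// By-value import: each closure keeps the value $j had when it was created.
$byVal = array();
for ($j = 0; $j < 3; $j++) {
    $byVal[$j] = function () use ($j) { return $j; };
}

// After the first loop $i is 3, so every by-reference closure returns 3.
echo $byRef[0](), $byRef[1](), $byRef[2](), "\n"; // 333
echo $byVal[0](), $byVal[1](), $byVal[2](), "\n"; // 012
```

With by-value import the $loopIndex workaround from the caveat is not needed, at the cost of each closure carrying its own copy of the imported values.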

>  

>  FINAL THOUGHTS
>  --------------
>  
>  My now proposed patch addresses the two main problems of my previous patch:
>  Support for closures in objects (with access to $this) and opcode caches. My
>  patch applies against PHP_5_3 and does not break any tests. It adds a
>  valuable new language feature which I'd like to see in PHP.
>  
>  Regards,
>  Christian
>  
>  --
>  PHP Internals - PHP Runtime Development Mailing List
>  To unsubscribe, visit: http://www.php.net/unsub.php
