ActivePerl 5.10 documentation

perlothrtut - old tutorial on threads in Perl


NAME

perlothrtut - old tutorial on threads in Perl


DESCRIPTION

WARNING: This tutorial describes the old-style thread model that was introduced in release 5.005. This model is deprecated, and was removed in version 5.10. The interfaces described here were considered experimental, and are likely to be buggy.

For information about the new interpreter threads ("ithreads") model, see the perlthrtut tutorial and the manpages for the threads and threads::shared modules.

You are strongly encouraged to migrate any existing threads code to the new model as soon as possible.


What Is A Thread Anyway?

A thread is a flow of control through a program with a single execution point.

Sounds an awful lot like a process, doesn't it? Well, it should. Threads are one of the pieces of a process. Every process has at least one thread and, up until now, every process running Perl had only one thread. With 5.005, though, you can create extra threads. We're going to show you how, when, and why.


Threaded Program Models

There are three basic ways that you can structure a threaded program. Which model you choose depends on what you need your program to do. For many non-trivial threaded programs you'll need to choose different models for different pieces of your program.

Boss/Worker

The boss/worker model usually has one 'boss' thread and one or more 'worker' threads. The boss thread gathers or generates tasks that need to be done, then parcels those tasks out to the appropriate worker thread.

This model is common in GUI and server programs, where a main thread waits for some event and then passes that event to the appropriate worker threads for processing. Once the event has been passed on, the boss thread goes back to waiting for another event.

The boss thread does relatively little work. While tasks aren't necessarily performed faster than with any other method, this model tends to have the best user-response times.

Work Crew

In the work crew model, several threads are created that do essentially the same thing to different pieces of data. It closely mirrors classical parallel processing and vector processors, where a large array of processors do the exact same thing to many pieces of data.

This model is particularly useful if the system running the program will distribute multiple threads across different processors. It can also be useful in ray tracing or rendering engines, where the individual threads can pass on interim results to give the user visual feedback.

Pipeline

The pipeline model divides up a task into a series of steps, and passes the results of one step on to the thread processing the next. Each thread does one thing to each piece of data and passes the results to the next thread in line.

This model makes the most sense if you have multiple processors so two or more threads will be executing in parallel, though it can often make sense in other contexts as well. It tends to keep the individual tasks small and simple, as well as allowing some parts of the pipeline to block (on I/O or system calls, for example) while other parts keep going. If you're running different parts of the pipeline on different processors you may also take advantage of the caches on each processor.

This model is also handy for a form of recursive programming where, rather than having a subroutine call itself, it instead creates another thread. Prime and Fibonacci generators both map well to this form of the pipeline model. (A version of a prime number generator is presented later on.)


Native threads

There are several different ways to implement threads on a system. How threads are implemented depends both on the vendor and, in some cases, the version of the operating system. Often the first implementation will be relatively simple, but later versions of the OS will be more sophisticated.

While the information in this section is useful, it's not necessary, so you can skip it if you don't feel up to it.

There are three basic categories of threads: user-mode threads, kernel threads, and multiprocessor kernel threads.

User-mode threads are threads that live entirely within a program and its libraries. In this model, the OS knows nothing about threads. As far as it's concerned, your process is just a process.

This is the easiest way to implement threads, and the way most OSes start. The big disadvantage is that, since the OS knows nothing about threads, if one thread blocks they all do. Typical blocking activities include most system calls, most I/O, and things like sleep().

Kernel threads are the next step in thread evolution. The OS knows about kernel threads, and makes allowances for them. The main difference between a kernel thread and a user-mode thread is blocking. With kernel threads, things that block a single thread don't block other threads. This is not the case with user-mode threads, where the kernel blocks at the process level and not the thread level.

This is a big step forward, and can give a threaded program quite a performance boost over non-threaded programs. Threads that block performing I/O, for example, won't block threads that are doing other things. Each process still has only one thread running at once, though, regardless of how many CPUs a system might have.

Since kernel threading can interrupt a thread at any time, it will uncover some of the implicit locking assumptions you may make in your program. For example, something as simple as $a = $a + 2 can behave unpredictably with kernel threads if $a is visible to other threads, as another thread may have changed $a between the time it was fetched on the right hand side and the time the new value is stored.
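
The lost update described above can be reproduced deterministically, without any threads at all, by writing the fetch and the store as separate steps and interleaving them by hand. This is only an illustrative sketch in plain Perl: the names interleaved_update, serial_update, $shared, $t1_fetched and $t2_fetched are made up for the example, and a real scheduler would pick the interleaving unpredictably rather than in this fixed order.

```perl
use strict;
use warnings;

# Simulate two threads that each run "$shared = $shared + 2" as two
# separate steps (fetch, then store), interleaved by hand.
sub interleaved_update {
    my $shared = 0;

    my $t1_fetched = $shared;      # thread 1 fetches 0
    my $t2_fetched = $shared;      # thread 2 is scheduled and also fetches 0
    $shared = $t1_fetched + 2;     # thread 1 stores 2
    $shared = $t2_fetched + 2;     # thread 2 stores 2, overwriting thread 1's update

    return $shared;
}

# The same two updates with no interleaving.
sub serial_update {
    my $shared = 0;
    $shared = $shared + 2;         # thread 1 runs to completion
    $shared = $shared + 2;         # then thread 2 runs
    return $shared;
}

print "interleaved: ", interleaved_update(), "\n";   # prints 2
print "serial:      ", serial_update(), "\n";        # prints 4
```

Both versions perform the same two additions, but the interleaved one loses an update because each simulated thread works from a stale copy of the value.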

Multiprocessor Kernel Threads are the final step in thread support. With multiprocessor kernel threads on a machine with multiple CPUs, the OS may schedule two or more threads to run simultaneously on different CPUs.

This can give a serious performance boost to your threaded program, since more than one thread will be executing at the same time. As a tradeoff, though, any of those nagging synchronization issues that might not have shown with basic kernel threads will appear with a vengeance.

In addition to the different levels of OS involvement in threads, different OSes (and different thread implementations for a particular OS) allocate CPU cycles to threads in different ways.

Cooperative multitasking systems have running threads give up control if one of two things happens. If a thread calls a yield function, it gives up control. It also gives up control if the thread does something that would cause it to block, such as perform I/O. In a cooperative multitasking implementation, one thread can starve all the others for CPU time if it so chooses.

Preemptive multitasking systems interrupt threads at regular intervals while the system decides which thread should run next. In a preemptive multitasking system, one thread usually won't monopolize the CPU.

On some systems, there can be cooperative and preemptive threads running simultaneously. (Threads running with realtime priorities often behave cooperatively, for example, while threads running at normal priorities behave preemptively.)


What kind of threads are Perl threads?

If you have experience with other thread implementations, you might find that things aren't quite what you expect. It's very important to remember when dealing with Perl threads that Perl Threads Are Not X Threads, for all values of X. They aren't POSIX threads, or DecThreads, or Java's Green threads, or Win32 threads. There are similarities, and the broad concepts are the same, but if you start looking for implementation details you're going to be either disappointed or confused. Possibly both.

This is not to say that Perl threads are completely different from everything that's ever come before--they're not. Perl's threading model owes a lot to other thread models, especially POSIX. Just as Perl is not C, though, Perl threads are not POSIX threads. So if you find yourself looking for mutexes, or thread priorities, it's time to step back a bit and think about what you want to do and how Perl can do it.


Threadsafe Modules

The addition of threads has changed Perl's internals substantially. There are implications for people who write modules--especially modules with XS code or external libraries. While most modules won't encounter any problems, modules that aren't explicitly tagged as thread-safe should be tested before being used in production code.

Not all modules that you might use are thread-safe, and you should always assume a module is unsafe unless the documentation says otherwise. This includes modules that are distributed as part of the core. Threads are a beta feature, and even some of the standard modules aren't thread-safe.

If you're using a module that's not thread-safe for some reason, you can protect yourself by using semaphores and lots of programming discipline to control access to the module. Semaphores are covered later in the article.


Thread Basics

The core Thread module provides the basic functions you need to write threaded programs. In the following sections we'll cover the basics, showing you what you need to do to create a threaded program. After that, we'll go over some of the features of the Thread module that make threaded programming easier.

Basic Thread Support

Thread support is a Perl compile-time option: it's something that's turned on or off when Perl is built at your site, rather than when your programs are compiled. If your Perl wasn't compiled with thread support enabled, then any attempt to use threads will fail.

Remember that the threading support in 5.005 is in beta release, and should be treated as such. You should expect that it may not function entirely properly, and the thread interface may well change some before it is a fully supported, production release. The beta version shouldn't be used for mission-critical projects. Having said that, threaded Perl is pretty nifty, and worth a look.

Your programs can use the Config module to check whether threads are enabled. If your program can't run without them, you can say something like:

  use Config;
  $Config{usethreads} or die "Recompile Perl with threads to run this program.";

A possibly-threaded program using a possibly-threaded module might have code like this:

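    use Config;
    use MyMod;
    if ($Config{usethreads}) {
        # We have threads: load the thread-aware implementation
        require MyMod_threaded;
        MyMod_threaded->import;
    } else {
        # No thread support: fall back to the unthreaded implementation
        require MyMod_unthreaded;
        MyMod_unthreaded->import;
    }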

Since code that runs both with and without threads is usually pretty messy, it's best to isolate the thread-specific code in its own module. In our example above, that's what MyMod_threaded is, and it's only imported if we're running on a threaded Perl.
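
Where dying outright is too abrupt, the same Config lookup can be wrapped in a small predicate and used to choose a code path. The sketch below relies only on the core Config module and the usethreads key already used in this section; the helper name threads_available is hypothetical and exists only for this example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: returns 1 if this perl was built with
# thread support, 0 otherwise.
sub threads_available {
    return $Config{usethreads} ? 1 : 0;
}

if (threads_available()) {
    print "This perl was built with thread support\n";
} else {
    print "This perl was built without thread support\n";
}
```

Keeping the check in one place means the rest of the program never has to look at %Config directly.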

Creating Threads

The Thread package provides the tools you need to create new threads. As with any other module, you need to tell Perl you want to use it; use Thread imports all the pieces you need to create basic threads.

The simplest, most straightforward way to create a thread is with new():

    use Thread;
    $thr = Thread->new( \&sub1 );
    sub sub1 { 
        print "In the thread\n"; 
    }

The new() method takes a reference to a subroutine and creates a new thread, which starts executing in the referenced subroutine. Control then passes both to the subroutine and the caller.

If you need to, your program can pass parameters to the subroutine as part of the thread startup. Just include the list of parameters as part of the Thread::new call, like this:

    use Thread;
    $Param3 = "foo";
    @ParamList = ("Param 1", "Param 2", $Param3);
    $thr = Thread->new( \&sub1, "Param 1", "Param 2", $Param3 );
    $thr = Thread->new( \&sub1, @ParamList );
    # qw() does not interpolate, so $Param3 is passed outside it
    $thr = Thread->new( \&sub1, qw(Param1 Param2), $Param3 );
    sub sub1 {
        my @InboundParameters = @_;
        print "In the thread\n";
        print "got parameters >", join("<>", @InboundParameters), "<\n";
    }

The subroutine runs like a normal Perl subroutine. The call to Thread->new returns a thread object right away; whatever the subroutine returns is collected later by calling join() on that object.

The last example illustrates another feature of threads. You can spawn off several threads using the same subroutine. Each thread executes the same subroutine, but in a separate thread with a separate environment and potentially separate arguments.
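
One detail to watch when building a parameter list for a thread: qw() quotes its words literally and never interpolates variables, so a variable has to be listed outside the qw(). The thread-free sketch below shows the difference; the array names @literal and @interpolated are made up for the example.

```perl
use strict;
use warnings;

my $Param3 = "foo";

# qw() does not interpolate: the third element is the literal
# six-character string '$Param3', not the value "foo".
my @literal = qw(Param1 Param2 $Param3);

# Listing the variable outside qw() passes its value instead.
my @interpolated = (qw(Param1 Param2), $Param3);

print "literal:      @literal\n";
print "interpolated: @interpolated\n";
```

Either list can then be handed to Thread->new after the subroutine reference, exactly like @ParamList in the example above.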

The other way to spawn a new thread is with async(), which is a way to spin off a chunk of code like eval(), but into its own thread:

    use Thread qw(async);
    $LineCount = 0;
    $thr = async { 
        while(<>) {$LineCount++}         
        print "Got $LineCount lines\n";
    };
    print "Waiting for the linecount to end\n"; 
    $thr->join; 
    print "All done\n";

You'll notice we did a use Thread qw(async) in that example. async is not exported by default, so if you want it, you'll either need to import it before you use it or fully qualify it as Thread::async. You'll also note that there's a semicolon after the closing brace. That's because async() treats the following block as an anonymous subroutine, so the semicolon is necessary.

Like eval(), the code executes in the same context as it would if it weren't spun off. Since both the code inside the async block and the code after it start executing, you need to be careful with any shared resources. Locking and other synchronization techniques are covered later.

Giving up control

There are times when you may find it useful to have a thread explicitly give up the CPU to another thread. Your threading package might not support preemptive multitasking for threads, for example, or you may be doing something compute-intensive and want to make sure that the user-interface thread gets called frequently. Regardless, there are times that you might want a thread to give up the processor.

Perl's threading package provides the yield() function that does this. yield() is pretty straightforward, and works like this:

    use Thread qw(yield async); 
    async { 
        my $foo = 50; 
        while ($foo--) { print "first async\n" }
        yield; 
        $foo = 50; 
        while ($foo--) { print "first async\n" } 
    }; 
    async { 
        my $foo = 50; 
        while ($foo--) { print "second async\n" }
        yield; 
        $foo = 50; 
        while ($foo--) { print "second async\n" } 
    };

Waiting For A Thread To Exit

Since threads are also subroutines, they can return values. To wait for a thread to exit and extract any scalars it might return, you can use the join() method.

    use Thread; 
    $thr = Thread->new( \&sub1 );
    @ReturnData = $thr->join; 
    print "Thread returned @ReturnData";
    sub sub1 { return "Fifty-six", "foo", 2; }

In the example above, the join() method returns as soon as the thread ends. In addition to waiting for a thread to finish and gathering up any values that the thread might have returned, join() also performs any OS cleanup necessary for the thread. That cleanup might be important, especially for long-running programs that spawn lots of threads. If you don't want the return values and don't want to wait for the thread to finish, you should call the detach() method instead. detach() is covered later in the article.

Errors In Threads

So what happens when an error occurs in a thread? Any errors that could be caught with eval() are postponed until the thread is joined. If your program never joins, the errors appear when your program exits.

Errors deferred until a join() can be caught with eval():

    use Thread qw(async); 
    $thr = async {$b = 3/0};   # Divide by zero error
    $foo = eval {$thr->join}; 
    if ($@)