Multithreaded Programming: Theory and Practice

Prof. WilliamsHibbs, United States, Teacher
Published: 28-07-2017
CHAPTER 4
Multithreaded Programming

With Python you can start a thread, but you can't stop it. Sorry. You'll have to wait until it reaches the end of execution. So, just the same as comp.lang.python, then?
—Cliff Wells, Steve Holden (and Timothy Delaney), February 2002

In this chapter...
• Introduction/Motivation
• Threads and Processes
• Threads and Python
• The thread Module
• The threading Module
• Comparing Single vs. Multithreaded Execution
• Multithreading in Practice
• Producer-Consumer Problem and the Queue/queue Module
• Alternative Considerations to Threads
• Related Modules

In this chapter, we will explore the different ways by which you can achieve more parallelism in your code. We will begin by differentiating between processes and threads in the first few sections of this chapter. We will then introduce the notion of multithreaded programming and present some multithreaded programming features found in Python. (Those of you already familiar with multithreaded programming can skip directly to Section 4.3.5.) The final sections of this chapter present some examples of how to use the threading and Queue modules to accomplish multithreaded programming with Python.

4.1 Introduction/Motivation

Before the advent of multithreaded (MT) programming, the execution of computer programs consisted of a single sequence of steps that were executed in synchronous order by the host's CPU. This style of execution was the norm whether the task itself required the sequential ordering of steps or if the entire program was actually an aggregation of multiple subtasks. What if these subtasks were independent, having no causal relationship (meaning that results of subtasks do not affect other subtask outcomes)? Is it not logical, then, to want to run these independent tasks all at the same time? Such parallel processing could significantly improve the performance of the overall task. This is what MT programming is all about.
MT programming is ideal for programming tasks that are asynchronous in nature, require multiple concurrent activities, and where the processing of each activity might be nondeterministic, that is, random and unpredictable. Such programming tasks can be organized or partitioned into multiple streams of execution wherein each has a specific task to accomplish. Depending on the application, these subtasks might calculate intermediate results that could be merged into a final piece of output.

While CPU-bound tasks might be fairly straightforward to divide into subtasks and executed sequentially or in a multithreaded manner, the task of managing a single-threaded process with multiple external sources of input is not as trivial. To achieve such a programming task without multithreading, a sequential program must use one or more timers and implement a multiplexing scheme. A sequential program will need to sample each I/O terminal channel to check for user input; however, it is important that the program does not block when reading the I/O terminal channel, because the arrival of user input is nondeterministic, and blocking would prevent processing of other I/O channels. The sequential program must use non-blocked I/O or blocked I/O with a timer (so that blocking is only temporary).

Because the sequential program is a single thread of execution, it must juggle the multiple tasks that it needs to perform, making sure that it does not spend too much time on any one task, and it must ensure that user response time is appropriately distributed. The use of a sequential program for this type of task often results in a complicated flow of control that is difficult to understand and maintain.
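The multiplexing scheme just described can be sketched with Python's standard-library selectors module (an assumption on our part: the book predates selectors, which arrived in Python 3.4, but any event-multiplexing API such as select() works the same way). A single thread blocks until any one of several channels becomes readable, rather than blocking on one channel in particular:

```python
import selectors
import socket

# Two socket pairs stand in for two independent I/O terminal channels.
sel = selectors.DefaultSelector()
a, b = socket.socketpair()
c, d = socket.socketpair()
sel.register(b, selectors.EVENT_READ)
sel.register(d, selectors.EVENT_READ)

a.send(b'hello')  # input arrives on only one of the channels

received = []
# select() blocks only until *some* registered channel is ready,
# so the other (idle) channel never stalls the program.
for key, _ in sel.select(timeout=1.0):
    received.append(key.fileobj.recv(1024))

sel.close()
for s in (a, b, c, d):
    s.close()
# received now holds the single message that arrived
```

This is exactly the bookkeeping burden the text warns about: the single thread must remember which channel each event belongs to and return to the multiplexer quickly.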
Using an MT program with a shared data structure such as a Queue (a multithreaded queue data structure, discussed later in this chapter), this programming task can be organized with a few threads that have specific functions to perform:

• UserRequestThread: Responsible for reading client input, perhaps from an I/O channel. A number of threads would be created by the program, one for each current client, with requests being entered into the queue.
• RequestProcessor: A thread that is responsible for retrieving requests from the queue and processing them, providing output for yet a third thread.
• ReplyThread: Responsible for taking output destined for the user and either sending it back (if in a networked application) or writing data to the local file system or database.

Organizing this programming task with multiple threads reduces the complexity of the program and enables an implementation that is clean, efficient, and well organized. The logic in each thread is typically less complex because it has a specific job to do. For example, the UserRequestThread simply reads input from a user and places the data into a queue for further processing by another thread, etc. Each thread has its own job to do; you merely have to design each type of thread to do one thing and do it well. Use of threads for specific tasks is not unlike Henry Ford's assembly line model for manufacturing automobiles.

4.2 Threads and Processes

4.2.1 What Are Processes?

Computer programs are merely executables, binary (or otherwise), which reside on disk. They do not take on a life of their own until loaded into memory and invoked by the operating system. A process (sometimes called a heavyweight process) is a program in execution. Each process has its own address space, memory, a data stack, and other auxiliary data to keep track of execution. The operating system manages the execution of all processes on the system, dividing the time fairly between all processes.
Processes can also fork or spawn new processes to perform other tasks, but each new process has its own memory, data stack, etc., and cannot generally share information unless interprocess communication (IPC) is employed.

4.2.2 What Are Threads?

Threads (sometimes called lightweight processes) are similar to processes except that they all execute within the same process, and thus all share the same context. They can be thought of as "mini-processes" running in parallel within a main process or "main thread."

A thread has a beginning, an execution sequence, and a conclusion. It has an instruction pointer that keeps track of where within its context it is currently running. It can be preempted (interrupted) and temporarily put on hold (also known as sleeping) while other threads are running—this is called yielding.

Multiple threads within a process share the same data space with the main thread and can therefore share information or communicate with one another more easily than if they were separate processes. Threads are generally executed in a concurrent fashion, and it is this parallelism and data sharing that enable the coordination of multiple tasks. Naturally, it is impossible to run truly in a concurrent manner in a single CPU system, so threads are scheduled in such a way that they run for a little bit, then yield to other threads (going to the proverbial back of the line to await more CPU time again). Throughout the execution of the entire process, each thread performs its own, separate tasks, and communicates the results with other threads as necessary.

Of course, such sharing is not without its dangers. If two or more threads access the same piece of data, inconsistent results can arise because of the ordering of data access. This is commonly known as a race condition. Fortunately, most thread libraries come with some sort of synchronization primitives that allow the thread manager to control execution and access.
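The race condition just described, and the synchronization primitive that guards against it, can be sketched as follows (a minimal sketch in Python 3 syntax; the book's own listings use Python 2). Without the lock, each thread's read-modify-write of the shared counter could interleave with another's and lose updates:

```python
import threading

counter = 0
counter_lock = threading.Lock()

def safe_increment(n):
    """Increment the shared counter n times, one locked update at a time."""
    global counter
    for _ in range(n):
        # The lock serializes the read-modify-write so no update is lost.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(10000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter is exactly 4 * 10000 == 40000 because every update was serialized
```

Dropping the lock turns this into exactly the nondeterministic-ordering bug the text warns about: the final count may come up short, and differently on each run.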
Another caveat is that threads cannot be given equal and fair execution time. This is because some functions block until they have completed. If not written specifically to take threads into account, this skews the amount of CPU time in favor of such greedy functions.

4.3 Threads and Python

In this section, we discuss how to use threads in Python. This includes the limitations of threads due to the global interpreter lock and a quick demo script.

4.3.1 Global Interpreter Lock

Execution of Python code is controlled by the Python Virtual Machine (a.k.a. the interpreter main loop). Python was designed in such a way that only one thread of control may be executing in this main loop, similar to how multiple processes in a system share a single CPU. Many programs can be in memory, but only one is live on the CPU at any given moment. Likewise, although multiple threads can run within the Python interpreter, only one thread is being executed by the interpreter at any given time.

Access to the Python Virtual Machine is controlled by the global interpreter lock (GIL). This lock is what ensures that exactly one thread is running. The Python Virtual Machine executes in the following manner in an MT environment:

1. Set the GIL.
2. Switch in a thread to run.
3. Execute either of the following:
   a. For a specified number of bytecode instructions, or
   b. Until the thread voluntarily yields control (which can be accomplished with time.sleep(0)).
4. Put the thread back to sleep (switch out thread).
5. Unlock the GIL.
6. Do it all over again (lather, rinse, repeat).

When a call is made to external code—that is, any C/C++ extension or built-in function—the GIL will be locked until it has completed (because there are no Python bytecodes to count as the interval). Extension programmers do have the ability to unlock the GIL, however, so as the Python developer, you shouldn't have to worry about your Python code locking up in those situations.
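A side note on step 3a above: in Python 3, the bytecode-instruction count was replaced by a time-based "switch interval" that you can inspect and tune via the sys module. A minimal sketch (Python 3 only; the behavior described in the numbered steps is the Python 2 scheme):

```python
import sys

# Python 3 asks the running thread to drop the GIL every switch-interval
# seconds rather than after a fixed number of bytecode instructions.
default_interval = sys.getswitchinterval()   # typically 0.005 (5 ms)

sys.setswitchinterval(0.01)   # let each thread hold the GIL up to ~10 ms
tuned = sys.getswitchinterval()

sys.setswitchinterval(default_interval)      # restore the default
```

A longer interval reduces switching overhead for CPU-bound threads at the cost of responsiveness; a shorter one does the reverse.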
As an example, for any Python I/O-oriented routines (which invoke built-in operating system C code), the GIL is released before the I/O call is made, allowing other threads to run while the I/O is being performed. Code that doesn't have much I/O will tend to keep the processor (and GIL) for the full interval a thread is allowed before it yields. In other words, I/O-bound Python programs stand a much better chance of being able to take advantage of a multithreaded environment than CPU-bound code.

Those of you who are interested in the source code, the interpreter main loop, and the GIL can take a look at the Python/ceval.c file.

4.3.2 Exiting Threads

When a thread completes execution of the function it was created for, it exits. Threads can also quit by calling an exit function such as thread.exit(), or any of the standard ways of exiting a Python process such as sys.exit() or raising the SystemExit exception. You cannot, however, go and "kill" a thread.

We will discuss in detail the two Python modules related to threads in the next section, but of the two, the thread module is the one we do not recommend. There are many reasons for this, but an obvious one is that when the main thread exits, all other threads die without cleanup. The other module, threading, ensures that the whole process stays alive until all "important" child threads have exited. (For a clarification of what important means, read the upcoming Core Tip, "Avoid using the thread module.")

Main threads should always be good managers, though, and perform the task of knowing what needs to be executed by individual threads, what data or arguments each of the spawned threads requires, when they complete execution, and what results they provide. In so doing, those main threads can collate the individual results into a final, meaningful conclusion.
4.3.3 Accessing Threads from Python

Python supports multithreaded programming, depending on the operating system on which it's running. It is supported on most Unix-based platforms, such as Linux, Solaris, Mac OS X, BSD, as well as Windows-based PCs. Python uses POSIX-compliant threads, or pthreads, as they are commonly known.

By default, threads are enabled when building Python from source (since Python 2.0) or with the Win32 installed binary. To determine whether threads are available for your interpreter, simply attempt to import the thread module from the interactive interpreter, as shown here (no errors occur when threads are available):

>>> import thread
>>>

If your Python interpreter was not compiled with threads enabled, the module import fails:

>>> import thread
Traceback (innermost last):
  File "<stdin>", line 1, in ?
ImportError: No module named thread

In such cases, you might need to recompile your Python interpreter to get access to threads. This usually involves invoking the configure script with the --with-thread option. Check the README file for your distribution to obtain specific instructions on how to compile Python with threads for your system.

4.3.4 Life Without Threads

For our first set of examples, we are going to use the time.sleep() function to show how threads work. time.sleep() takes a floating point argument and "sleeps" for the given number of seconds, meaning that execution is temporarily halted for the amount of time specified.

Let's create two time loops: one that sleeps for 4 seconds (loop0()), and one that sleeps for 2 seconds (loop1()), respectively. (We use the names "loop0" and "loop1" as a hint that we will eventually have a sequence of loops.) If we were to execute loop0() and loop1() sequentially in a one-process or single-threaded program, as onethr.py does in Example 4-1, the total execution time would be at least 6 seconds.
There might or might not be a 1-second gap between the starting of loop0() and loop1(), as well as other execution overhead, which can cause the overall time to be bumped to 7 seconds.

Example 4-1 Loops Executed by a Single Thread (onethr.py)

This script executes two loops consecutively in a single-threaded program. One loop must complete before the other can begin. The total elapsed time is the sum of times taken by each loop.

1  #!/usr/bin/env python
2
3  from time import sleep, ctime
4
5  def loop0():
6      print 'start loop 0 at:', ctime()
7      sleep(4)
8      print 'loop 0 done at:', ctime()
9
10 def loop1():
11     print 'start loop 1 at:', ctime()
12     sleep(2)
13     print 'loop 1 done at:', ctime()
14
15 def main():
16     print 'starting at:', ctime()
17     loop0()
18     loop1()
19     print 'all DONE at:', ctime()
20
21 if __name__ == '__main__':
22     main()

We can verify this by executing onethr.py, which renders the following output:

$ onethr.py
starting at: Sun Aug 13 05:03:34 2006
start loop 0 at: Sun Aug 13 05:03:34 2006
loop 0 done at: Sun Aug 13 05:03:38 2006
start loop 1 at: Sun Aug 13 05:03:38 2006
loop 1 done at: Sun Aug 13 05:03:40 2006
all DONE at: Sun Aug 13 05:03:40 2006

Now, assume that rather than sleeping, loop0() and loop1() were separate functions that performed individual and independent computations, all working to arrive at a common solution. Wouldn't it be useful to have them run in parallel to cut down on the overall running time? That is the premise behind MT programming that we now introduce.

4.3.5 Python Threading Modules

Python provides several modules to support MT programming, including the thread, threading, and Queue modules. Programmers can use the thread and threading modules to create and manage threads. The thread module provides basic thread and locking support; threading provides higher-level, fully featured thread management. With the Queue module, users can create a queue data structure that can be shared across multiple threads.
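As a preview of how a shared queue ties threads together, here is a minimal sketch modeled on the UserRequestThread/RequestProcessor/ReplyThread roles described earlier. It uses the Python 3 module names (queue and threading; in Python 2 the module is Queue), and the sentinel-based shutdown is our assumption, not something from the book:

```python
import queue
import threading

requests = queue.Queue()   # client input flows in here
replies = queue.Queue()    # processed output flows out here

def user_request_thread(client_inputs):
    """Stand-in for UserRequestThread: enter each request into the queue."""
    for item in client_inputs:
        requests.put(item)
    requests.put(None)  # sentinel meaning "no more input"

def request_processor():
    """Stand-in for RequestProcessor: retrieve, process, hand off output."""
    while True:
        item = requests.get()
        if item is None:
            replies.put(None)
            break
        replies.put(item.upper())  # "processing" is just upper-casing here

def reply_thread(out):
    """Stand-in for ReplyThread: deliver output (here, append to a list)."""
    while True:
        item = replies.get()
        if item is None:
            break
        out.append(item)

out = []
workers = [
    threading.Thread(target=user_request_thread, args=(['ping', 'pong'],)),
    threading.Thread(target=request_processor),
    threading.Thread(target=reply_thread, args=(out,)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()
# out == ['PING', 'PONG']: each stage did one job and passed work along
```

Because Queue handles its own locking, none of the three threads needs explicit synchronization to share data safely.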
We will take a look at these modules individually and present examples and intermediate-sized applications.

CORE TIP: Avoid using the thread module

We recommend using the high-level threading module instead of the thread module for many reasons. threading is more contemporary, has better thread support, and some attributes in the thread module can conflict with those in the threading module. Another reason is that the lower-level thread module has few synchronization primitives (actually only one) while threading has many.

However, in the interest of learning Python and threading in general, we do present some code that uses the thread module. We present these for learning purposes only; hopefully they give you a much better insight as to why you would want to avoid using thread. We will also show you how to use more appropriate tools such as those available in the threading and Queue modules.

Another reason to avoid using thread is because there is no control of when your process exits. When the main thread finishes, any other threads will also die, without warning or proper cleanup. As mentioned earlier, at least threading allows the important child threads to finish first before exiting.

Use of the thread module is recommended only for experts desiring lower-level thread access. To emphasize this, it is renamed to _thread in Python 3. Any multithreaded application you create should utilize threading and perhaps other higher-level modules.

4.4 The thread Module

Let's take a look at what the thread module has to offer. In addition to being able to spawn threads, the thread module also provides a basic synchronization data structure called a lock object (a.k.a. primitive lock, simple lock, mutual exclusion lock, mutex, and binary semaphore). As we mentioned earlier, such synchronization primitives go hand in hand with thread management.
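The lock object just described can be exercised directly from the interpreter. A quick sketch, using Python 3's _thread (the renamed thread module; under Python 2, import thread instead):

```python
import _thread  # Python 3 name for the book's thread module

lock = _thread.allocate_lock()  # allocate a LockType lock object

acquired = lock.acquire()       # returns True once the lock is held
was_locked = lock.locked()      # True while some thread holds the lock

lock.release()                  # give the lock back
now_locked = lock.locked()      # False again after release
```

This acquire/release pairing is exactly what the upcoming examples use to let a child thread signal completion to the main thread.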
Table 4-1 lists the more commonly used thread functions and LockType lock object methods.

Table 4-1 thread Module and Lock Objects

Function/Method                        Description

thread Module Functions
start_new_thread(function,             Spawns a new thread and executes function
    args, kwargs=None)                 with the given args and optional kwargs
allocate_lock()                        Allocates a LockType lock object
exit()                                 Instructs a thread to exit

LockType Lock Object Methods
acquire(wait=None)                     Attempts to acquire the lock object
locked()                               Returns True if the lock is acquired,
                                       False otherwise
release()                              Releases the lock

The key function of the thread module is start_new_thread(). It takes a function (object) plus arguments and, optionally, keyword arguments. A new thread is spawned specifically to invoke the function.

Let's take our onethr.py example and integrate threading into it. By slightly changing the call to the loop() functions, we now present mtsleepA.py in Example 4-2:

Example 4-2 Using the thread Module (mtsleepA.py)

The same loops from onethr.py are executed, but this time using the simple multithreaded mechanism provided by the thread module. The two loops are executed concurrently (with the shorter one finishing first, obviously), and the total elapsed time is only as long as the slowest thread rather than the total time for each separately.
1  #!/usr/bin/env python
2
3  import thread
4  from time import sleep, ctime
5
6  def loop0():
7      print 'start loop 0 at:', ctime()
8      sleep(4)
9      print 'loop 0 done at:', ctime()
10
11 def loop1():
12     print 'start loop 1 at:', ctime()
13     sleep(2)
14     print 'loop 1 done at:', ctime()
15
16 def main():
17     print 'starting at:', ctime()
18     thread.start_new_thread(loop0, ())
19     thread.start_new_thread(loop1, ())
20     sleep(6)
21     print 'all DONE at:', ctime()
22
23 if __name__ == '__main__':
24     main()

start_new_thread() requires the first two arguments, so that is the reason for passing in an empty tuple even if the executing function requires no arguments.

Upon execution of this program, our output changes drastically. Rather than taking a full 6 or 7 seconds, our script now runs in 4 seconds, the length of time of our longest loop, plus any overhead.

$ mtsleepA.py
starting at: Sun Aug 13 05:04:50 2006
start loop 0 at: Sun Aug 13 05:04:50 2006
start loop 1 at: Sun Aug 13 05:04:50 2006
loop 1 done at: Sun Aug 13 05:04:52 2006
loop 0 done at: Sun Aug 13 05:04:54 2006
all DONE at: Sun Aug 13 05:04:56 2006

The pieces of code that sleep for 4 and 2 seconds now occur concurrently, contributing to the lower overall runtime. You can even see how loop 1 finishes before loop 0.

The only other major change to our application is the addition of the sleep(6) call. Why is this necessary? The reason is that if we did not stop the main thread from continuing, it would proceed to the next statement, displaying "all DONE" and exit, killing both threads running loop0() and loop1(). We did not have any code that directed the main thread to wait for the child threads to complete before continuing. This is what we mean by threads requiring some sort of synchronization. In our case, we used another sleep() call as our synchronization mechanism.
We used a value of 6 seconds because we know that both threads (which take 4 and 2 seconds) should have completed by the time the main thread has counted to 6.

You are probably thinking that there should be a better way of managing threads than creating that extra delay of 6 seconds in the main thread. Because of this delay, the overall runtime is no better than in our single-threaded version. Using sleep() for thread synchronization as we did is not reliable. What if our loops had independent and varying execution times? We could be exiting the main thread too early or too late. This is where locks come in.

Making yet another update to our code to include locks as well as getting rid of the separate loop functions, we get mtsleepB.py, which is presented in Example 4-3. Running it, we see that the output is similar to mtsleepA.py. The only difference is that we did not have to wait the extra time as mtsleepA.py did to conclude. By using locks, we were able to exit as soon as both threads had completed execution. This renders the following output:

$ mtsleepB.py
starting at: Sun Aug 13 16:34:41 2006
start loop 0 at: Sun Aug 13 16:34:41 2006
start loop 1 at: Sun Aug 13 16:34:41 2006
loop 1 done at: Sun Aug 13 16:34:43 2006
loop 0 done at: Sun Aug 13 16:34:45 2006
all DONE at: Sun Aug 13 16:34:45 2006

Example 4-3 Using thread and Locks (mtsleepB.py)

Rather than using a call to sleep() to hold up the main thread as in mtsleepA.py, the use of locks makes more sense.
1  #!/usr/bin/env python
2
3  import thread
4  from time import sleep, ctime
5
6  loops = [4, 2]
7
8  def loop(nloop, nsec, lock):
9      print 'start loop', nloop, 'at:', ctime()
10     sleep(nsec)
11     print 'loop', nloop, 'done at:', ctime()
12     lock.release()
13
14 def main():
15     print 'starting at:', ctime()
16     locks = []
17     nloops = range(len(loops))
18
19     for i in nloops:
20         lock = thread.allocate_lock()
21         lock.acquire()
22         locks.append(lock)
23
24     for i in nloops:
25         thread.start_new_thread(loop,
26             (i, loops[i], locks[i]))
27
28     for i in nloops:
29         while locks[i].locked(): pass
30
31     print 'all DONE at:', ctime()
32
33 if __name__ == '__main__':
34     main()

So how did we accomplish our task with locks? Let's take a look at the source code.

Line-by-Line Explanation

Lines 1–6
After the Unix startup line, we import the thread module and a few familiar attributes of the time module. Rather than hardcoding separate functions to count to 4 and 2 seconds, we use a single loop() function and place these constants in a list, loops.

Lines 8–12
The loop() function acts as a proxy for the deleted loop() functions from our earlier examples. We had to make some cosmetic changes to loop() so that it can now perform its duties using locks. The obvious changes are that we need to be told which loop number we are as well as the sleep duration. The last piece of new information is the lock itself. Each thread will be allocated an acquired lock. When the sleep() time has concluded, we release the corresponding lock, indicating to the main thread that this thread has completed.

Lines 14–34
The bulk of the work is done here in main(), using three separate for loops. We first create a list of locks, which we obtain by using the thread.allocate_lock() function, and acquire each lock with the acquire() method.
Acquiring a lock has the effect of "locking the lock." Once it is locked, we add the lock to the lock list, locks. The next loop actually spawns the threads, invoking the loop() function per thread, and for each thread, provides it with the loop number, the sleep duration, and the acquired lock for that thread. So why didn't we start the threads in the lock acquisition loop? There are two reasons. First, we wanted to synchronize the threads, so that all the horses started out the gate around the same time, and second, locks take a little bit of time to be acquired. If your thread executes too fast, it is possible that it completes before the lock has a chance to be acquired.

It is up to each thread to unlock its lock object when it has completed execution. The final loop just sits and spins (pausing the main thread) until both locks have been released before continuing execution. Because we are checking each lock sequentially, we might be at the mercy of all the slower loops if they are more toward the beginning of the set of loops. In such cases, the majority of the wait time may be for the first loop(s). When that lock is released, remaining locks may have already been unlocked (meaning that corresponding threads have completed execution). The result is that the main thread will fly through those lock checks without pause. Finally, you should be well aware that the final pair of lines will execute main() only if we are invoking this script directly.

As hinted in the earlier Core Tip, we presented the thread module only to introduce the reader to threaded programming. Your MT application should use higher-level modules such as the threading module, which we discuss in the next section.

4.5 The threading Module

We will now introduce the higher-level threading module, which gives you not only a Thread class but also a wide variety of synchronization mechanisms to use to your heart's content.
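As a small taste of those mechanisms before we survey them, here is a hedged sketch using threading.Event (Python 3 syntax): one thread blocks until another signals that shared data is ready, with no spin loop in sight.

```python
import threading

data_ready = threading.Event()
payload = []

def producer():
    payload.append('result')  # publish the data first...
    data_ready.set()          # ...then wake up anyone waiting on the event

def consumer(out):
    data_ready.wait()         # block (without spinning) until set() is called
    out.append(payload[0])

out = []
waiter = threading.Thread(target=consumer, args=(out,))
maker = threading.Thread(target=producer)
waiter.start()
maker.start()
waiter.join()
maker.join()
# out == ['result']: the consumer saw the data only after the signal
```

Compare this with the while locks[i].locked(): pass loop in mtsleepB.py: the event lets the waiting thread sleep inside the library instead of burning CPU.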
Table 4-2 presents a list of all the objects available in the threading module.

Table 4-2 threading Module Objects

Object              Description
Thread              Object that represents a single thread of execution
Lock                Primitive lock object (same lock as in the thread module)
RLock               Re-entrant lock object; provides the ability for a single
                    thread to (re)acquire an already-held lock (recursive
                    locking)
Condition           Condition variable object; causes one thread to wait until
                    a certain "condition" has been satisfied by another thread,
                    such as a change of state or of some data value
Event               General version of condition variables, whereby any number
                    of threads are waiting for some event to occur, and all
                    will awaken when the event happens
Semaphore           Provides a "counter" of finite resources shared between
                    threads; blocks when none are available
BoundedSemaphore    Similar to a Semaphore but ensures that it never exceeds
                    its initial value
Timer               Similar to Thread, except that it waits for an allotted
                    period of time before running
Barrier (a)         Creates a "barrier," at which a specified number of threads
                    must all arrive before they're all allowed to continue

a. New in Python 3.2.

In this section, we will examine how to use the Thread class to implement threading. Because we have already covered the basics of locking, we will not cover the locking primitives here. The Thread class also contains a form of synchronization, so explicit use of locking primitives is not necessary.

CORE TIP: Daemon threads

Another reason to avoid using the thread module is that it does not support the concept of daemon (or daemonic) threads. When the main thread exits, all child threads will be killed, regardless of whether they are doing work. The concept of daemon threads comes into play here if you do not desire this behavior.
Support for daemon threads is available in the threading module, and here is how they work: a daemon is typically a server that waits for client requests to service. If there is no client work to be done, the daemon sits idle. If you set the daemon flag for a thread, you are basically saying that it is non-critical, and it is okay for the process to exit without waiting for it to finish. As you have seen in Chapter 2, "Network Programming," server threads run in an infinite loop and do not exit in normal situations.

If your main thread is ready to exit and you do not care to wait for the child threads to finish, then set their daemon flags. A value of True denotes that a thread is not important or, more likely, not doing anything but waiting for a client. To set a thread as daemonic, make this assignment: thread.daemon = True before you start the thread. (The old-style way of calling thread.setDaemon(True) is deprecated.) The same is true for checking on a thread's daemonic status; just check that value (versus calling thread.isDaemon()). A new child thread inherits its daemonic flag from its parent. The entire Python program (read as: the main thread) will stay alive until all non-daemonic threads have exited—in other words, when no active non-daemonic threads are left.

4.5.1 The Thread Class

The Thread class of the threading module is your primary executive object. It has a variety of functions not available to the thread module. Table 4-3 presents a list of attributes and methods.

Table 4-3 Thread Object Attributes and Methods

Attribute/Method         Description

Thread object data attributes
name                     The name of a thread.
ident                    The identifier of a thread.
daemon                   Boolean flag indicating whether a thread is daemonic.

Thread object methods
__init__(group=None,     Instantiate a Thread object, taking a target callable
  target=None,           and any args or kwargs. A name or group can also be
  name=None, args=(),    passed, but the latter is unimplemented. A verbose
  kwargs={},             flag is also accepted. Any daemon value sets the
  verbose=None,          thread.daemon attribute/flag. (c)
  daemon=None)
start()                  Begin thread execution.
run()                    Method defining thread functionality (usually
                         overridden by the application writer in a subclass).
join(timeout=None)       Suspend until the started thread terminates; blocks
                         unless timeout (in seconds) is given.
getName() (a)            Return the name of a thread.
setName(name) (a)        Set the name of a thread.
isAlive()/is_alive() (b) Boolean flag indicating whether a thread is still
                         running.
isDaemon() (c)           Return True if a thread is daemonic, False otherwise.
setDaemon(daemonic) (c)  Set the daemon flag to the given Boolean daemonic
                         value (must be called before thread start()).

a. Deprecated by setting (or getting) the thread.name attribute, or by passing it in during instantiation.
b. CamelCase names deprecated and replaced starting in Python 2.6.
c. isDaemon()/setDaemon() deprecated by setting the thread.daemon attribute; thread.daemon can also be set during instantiation via the optional daemon value—new in Python 3.3.

There are a variety of ways by which you can create threads using the Thread class. We cover three of them here, all quite similar. Pick the one you feel most comfortable with, not to mention the most appropriate for your application and future scalability (we like the final choice the best):

• Create a Thread instance, passing in a function
• Create a Thread instance, passing in a callable class instance
• Subclass Thread and create a subclass instance

You'll discover that you will pick either the first or third option. The latter is chosen when a more object-oriented interface is desired, and the former otherwise. The second, honestly, is a bit more awkward and slightly harder to read, as you'll discover.

Create Thread Instance, Passing in Function

In our first example, we will just instantiate Thread, passing in our function (and its arguments) in a manner similar to our previous examples.
This function is what will be executed when we direct the thread to begin execution. Taking our mtsleepB.py script from Example 4-3 and tweaking it by adding the use of Thread objects, we get mtsleepC.py, as shown in Example 4-4.

Example 4-4 Using the threading Module (mtsleepC.py)

The Thread class from the threading module has a join() method that lets the main thread wait for thread completion.

#!/usr/bin/env python

import threading
from time import sleep, ctime

loops = [4, 2]

def loop(nloop, nsec):
    print 'start loop', nloop, 'at:', ctime()
    sleep(nsec)
    print 'loop', nloop, 'done at:', ctime()

def main():
    print 'starting at:', ctime()
    threads = []
    nloops = range(len(loops))

    for i in nloops:
        t = threading.Thread(target=loop,
            args=(i, loops[i]))
        threads.append(t)

    for i in nloops:            # start threads
        threads[i].start()

    for i in nloops:            # wait for all
        threads[i].join()       # threads to finish

    print 'all DONE at:', ctime()

if __name__ == '__main__':
    main()

When we run the script in Example 4-4, we see output similar to that of its predecessors:

$ mtsleepC.py
starting at: Sun Aug 13 18:16:38 2006
start loop 0 at: Sun Aug 13 18:16:38 2006
start loop 1 at: Sun Aug 13 18:16:38 2006
loop 1 done at: Sun Aug 13 18:16:40 2006
loop 0 done at: Sun Aug 13 18:16:42 2006
all DONE at: Sun Aug 13 18:16:42 2006

So what did change? Gone are the locks that we had to implement when using the thread module. Instead, we create a set of Thread objects. When each Thread is instantiated, we dutifully pass in the function (target) and arguments (args) and receive a Thread instance in return. The biggest difference between instantiating Thread (calling Thread()) and invoking thread.start_new_thread() is that the new thread does not begin execution right away.
This is a useful synchronization feature, especially when you don't want the threads to start immediately. Once all the threads have been allocated, we let them go off to the races by invoking each thread's start() method, but not a moment before that. And rather than having to manage a set of locks (allocating, acquiring, releasing, checking lock state, and so on), we simply call the join() method for each thread. join() will wait until a thread terminates or, if a timeout is provided, until the timeout occurs. Use of join() appears much cleaner than an infinite loop that waits for locks to be released (which is why these locks are sometimes known as spin locks).

One other important aspect of join() is that it does not need to be called at all. Once threads are started, they will execute until their given function completes, at which point they will exit. If your main thread has things to do other than wait for threads to complete (such as other processing or waiting for new client requests), it should do so. join() is useful only when you want to wait for thread completion.

Create Thread Instance, Passing in Callable Class Instance

A similar offshoot of passing in a function when creating a thread is passing in an instance of a callable class for execution; this is the more object-oriented approach to MT programming. Such a callable class embodies an execution environment that is much more flexible than a function or choosing from a set of functions. You now have the power of a class object behind you, as opposed to a single function or a list/tuple of functions.

Adding our new class ThreadFunc to the code and making other slight modifications to mtsleepC.py, we get mtsleepD.py, shown in Example 4-5.

Example 4-5 Using Callable Classes (mtsleepD.py)

In this example, we pass in a callable class (instance) as opposed to just a function. It presents more of an object-oriented approach than mtsleepC.py.
#!/usr/bin/env python

import threading
from time import sleep, ctime

loops = [4, 2]

class ThreadFunc(object):

    def __init__(self, func, args, name=''):
        self.name = name
        self.func = func
        self.args = args

    def __call__(self):
        self.func(*self.args)

(Continued)
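The mtsleepD.py listing above is written for Python 2 and is truncated at the page break. For readers working in Python 3, a minimal sketch of the same callable-class pattern might look like the following; the sleep intervals are shortened here, and the structure of main() is our reconstruction of the pattern rather than the book's exact listing:

```python
import threading
from time import sleep, ctime

loops = [1, 2]

class ThreadFunc(object):
    """Callable wrapper: stores a function and its arguments,
    and invokes them when the Thread calls this instance."""
    def __init__(self, func, args, name=''):
        self.name = name
        self.func = func
        self.args = args

    def __call__(self):
        # Unpack the stored argument tuple into the call.
        self.func(*self.args)

def loop(nloop, nsec):
    print('start loop', nloop, 'at:', ctime())
    sleep(nsec)
    print('loop', nloop, 'done at:', ctime())

def main():
    print('starting at:', ctime())
    threads = []
    for i in range(len(loops)):
        # Pass the callable instance itself as the thread's target.
        t = threading.Thread(
            target=ThreadFunc(loop, (i, loops[i]), loop.__name__))
        threads.append(t)

    for t in threads:       # start all threads
        t.start()
    for t in threads:       # wait for all threads to finish
        t.join()

    print('all DONE at:', ctime())

if __name__ == '__main__':
    main()
```

Because Thread calls its target with no arguments, the callable instance must carry the arguments itself, which is exactly what ThreadFunc.__init__() stores and __call__() unpacks.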
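The daemon flag discussed at the start of this section can also be seen in a short Python 3 sketch. The serve_forever() stand-in below is illustrative (it is not one of the book's examples); it mimics a server loop that never returns on its own:

```python
import threading
import time

def serve_forever():
    # Stands in for a server thread that waits for clients forever.
    while True:
        time.sleep(0.1)

t = threading.Thread(target=serve_forever)
t.daemon = True      # must be set before start()
t.start()

print(t.daemon)      # True
print(t.is_alive())  # True
# Because the thread is daemonic, the interpreter is free to exit
# here without waiting for serve_forever() to return.
```

Had t.daemon been left at its default of False, the program would hang at exit, since the main thread waits for all non-daemonic threads to finish.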
