Programming Python (35 page)

Authors: Mark Lutz

Starting Independent Programs

As we learned earlier, independent programs generally communicate with
system-global tools such as sockets and the fifo files we studied
earlier. Although processes spawned by multiprocessing can leverage
these tools, too, their closer relationship affords them the host of
additional IPC devices provided by this module.

Like threads, multiprocessing is designed to run function calls in
parallel, not to start entirely separate programs directly. Spawned
functions might use tools like os.system, os.popen, and subprocess to
start a program if such an operation might block the caller, but there’s
otherwise often no point in starting a process that just starts a
program (you might as well start the program and skip a step). In fact,
on Windows, multiprocessing today uses the same process creation call as
subprocess, so there’s little point in starting two processes to run
one.

It is, however, possible to start new programs in the child processes
spawned, using tools like the os.exec* calls we met earlier. By spawning
a process portably with multiprocessing and overlaying it with a new
program this way, we start a new independent program, and effectively
work around the lack of the os.fork call in standard Windows Python.

This generally assumes that the new program doesn’t require any
resources passed in by the Process API, of course (once a new program
starts, it erases that which was running), but it offers a portable
equivalent to the fork/exec combination on Unix. Furthermore, programs
started this way can still make use of the more traditional IPC tools,
such as sockets and fifos, that we met earlier in this chapter.
Example 5-33 illustrates the technique.

Example 5-33. PP4E\System\Processes\multi5.py

"Use multiprocessing to start independent programs, os.fork or not"
import os
from multiprocessing import Process

def runprogram(arg):
    os.execlp('python', 'python', 'child.py', str(arg))

if __name__ == '__main__':
    for i in range(5):
        Process(target=runprogram, args=(i,)).start()
    print('parent exit')

This script starts 5 instances of the child.py script we wrote in
Example 5-4 as independent processes, without waiting for them to
finish. Here’s this script at work on Windows, after deleting a
superfluous system prompt that shows up arbitrarily in the middle of its
output (it runs the same on Cygwin, but the output is not interleaved
there):

C:\...\PP4E\System\Processes> type child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])

C:\...\PP4E\System\Processes> multi5.py
parent exit
Hello from child 9844 2
Hello from child 8696 4
Hello from child 1840 0
Hello from child 6724 1
Hello from child 9368 3

This technique isn’t possible with threads, because all threads
run in the same process; overlaying it with a new program would kill all
its threads. Though this is unlikely to be as fast as a fork/exec
combination on Unix, it at least provides similar and portable
functionality on Windows when required.

And Much More

Finally, multiprocessing provides many more tools than these examples
deploy, including condition, event, and semaphore synchronization tools,
and local and remote managers that implement servers for shared objects.
For instance, Example 5-34 demonstrates its support for pools: spawned
children that work in concert on a given task.

Example 5-34. PP4E\System\Processes\multi6.py

"Plus much more: process pools, managers, locks, condition,..."
import os
from multiprocessing import Pool

def powers(x):
    #print(os.getpid())              # enable to watch children
    return 2 ** x

if __name__ == '__main__':
    workers = Pool(processes=5)

    results = workers.map(powers, [2]*100)
    print(results[:16])
    print(results[-2:])

    results = workers.map(powers, range(100))
    print(results[:16])
    print(results[-2:])

When run, Python arranges to delegate portions of the task to
workers run in parallel:

C:\...\PP4E\System\Processes> multi6.py
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
[4, 4]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768]
[316912650057057350374175801344, 633825300114114700748351602688]

And a little less…

To be fair, besides such additional features and tools, multiprocessing
also comes with additional constraints beyond those we’ve already
covered (pickleability, mutable state, and so on). For example, consider
the following sort of code:

from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)

if __name__ == '__main__':
    Process(target=action, args=('spam', 'eggs')).start()  # shell waits for child

This works as expected, but if we change the last line to the following
it fails on Windows because lambdas are not pickleable (really, not
importable):

Process(target=(lambda: action('spam', 'eggs'))).start()  # fails!-not pickleable

This precludes a common coding pattern that uses lambda to add data to
calls, which we’ll use often for callbacks in the GUI part of this book.
Moreover, this differs from the threading module that is the model for
this package. Calls like the following, which work for threads, must be
translated to a callable and arguments:

threading.Thread(target=(lambda: action(2, 4))).start()   # but lambdas work here
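The pickling restriction can be checked directly, without starting any
processes. The following is only a minimal sketch and not part of the
book's examples package: the helper name can_pickle is hypothetical, and
functools.partial is shown as one common way to attach data to a call
where a lambda cannot be used. The built-in print stands in for the
action function so that the sketch does not depend on how the file is
run.

```python
import pickle
from functools import partial

def can_pickle(obj):
    """Return True if obj survives pickle.dumps (hypothetical helper)."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A lambda has no importable name, so pickle cannot store a reference to it
with_lambda = lambda: print('spam', 'eggs')

# A partial stores an importable function plus the arguments to pass to it
with_partial = partial(print, 'spam', 'eggs')

if __name__ == '__main__':
    print('lambda pickles: ', can_pickle(with_lambda))
    print('partial pickles:', can_pickle(with_partial))
```

With multiprocessing itself, the simplest route is usually the one
already described: pass the function as target and its data as args.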

Conversely, some behavior of the threading module is mimicked by
multiprocessing, whether you wish it did or not. Because programs using
this package wait for child processes to end by default, we must mark
processes as daemon if we don’t want to block the shell where the
following sort of code is run (technically, parents attempt to terminate
daemonic children on exit, which means that the program can exit when
only daemonic children remain, much like threading):

import time
from multiprocessing import Process

def action(arg1, arg2):
    print(arg1, arg2)
    time.sleep(5)                    # normally prevents the parent from exiting

if __name__ == '__main__':
    p = Process(target=action, args=('spam', 'eggs'))
    p.daemon = True                  # don't wait for it
    p.start()

There’s more on some of these issues in the Python library manual; they
are not show-stoppers by any stretch, but special cases and potential
pitfalls to some. We’ll revisit the lambda and daemon issues in a more
realistic context in Chapter 8, where we’ll use multiprocessing to
launch GUI demos independently.

Why multiprocessing? The Conclusion

As this section’s examples suggest, multiprocessing provides a powerful
alternative that aims to combine the portability and much of the utility
of threads with the fully parallel potential of processes, and offers
additional solutions to IPC, exit status, and other parallel processing
goals.

Hopefully, this section has also given you a better understanding of the
tradeoffs of this module discussed at its beginning. In particular, its
separate process model precludes the freely shared mutable state of
threads, and bound methods and lambdas are prohibited by both the
pickleability requirements of its IPC pipes and queues and its process
action implementation on Windows. Moreover, its requirement of
pickleability for process arguments on Windows also precludes it as an
option for conversing with clients in socket servers portably.

While not a replacement for threading in all applications, though,
multiprocessing offers compelling solutions for many. Especially for
parallel-programming tasks which can be designed to avoid its
limitations, this module can offer both performance and portability that
Python’s more direct multitasking tools cannot.

Unfortunately, beyond this brief introduction, we don’t have space
for a more complete treatment of this module in this book. For more
details, refer to the Python library manual. Here, we turn next to a
handful of additional program launching tools and a wrap up of this
chapter.

Other Ways to Start Programs

We’ve seen a variety of ways to launch programs in this book so far,
from the os.fork/exec combination on Unix, to portable shell
command-line launchers like os.system, os.popen, and subprocess, to the
portable multiprocessing module options of the last section. There are
still other ways to start programs in the Python standard library, some
of which are more platform neutral or obscure than others. This section
wraps up this chapter with a quick tour through this set.

The os.spawn Calls

The os.spawnv and os.spawnve calls were originally introduced to launch
programs on Windows, much like a fork/exec call combination on Unix-like
platforms. Today, these calls work on both Windows and Unix-like
systems, and additional variants have been added to parrot os.exec.

In recent versions of Python, the portable subprocess module has started
to supersede these calls. In fact, Python’s library manual includes a
note stating that this module has more powerful and equivalent tools and
should be preferred to os.spawn calls. Moreover, the newer
multiprocessing module can achieve similarly portable results today when
combined with os.exec calls, as we saw earlier. Still, the os.spawn
calls continue to work as advertised and may appear in Python code you
encounter.

The os.spawn family of calls executes a program named by a command line
in a new process, on both Windows and Unix-like systems. In basic
operation, they are similar to the fork/exec call combination on Unix
and can be used as alternatives to the system and popen calls we’ve
already learned. In the following interaction, for instance, we start a
Python program with a command line in two traditional ways (the second
also reads its output):

C:\...\PP4E\System\Processes> python
>>> print(open('makewords.py').read())
print('spam')
print('eggs')
print('ham')
>>> import os
>>> os.system('python makewords.py')
spam
eggs
ham
0
>>> result = os.popen('python makewords.py').read()
>>> print(result)
spam
eggs
ham

The equivalent os.spawn calls achieve the same effect, with a slightly
more complex call signature that provides more control over the way the
program is launched:

>>> os.spawnv(os.P_WAIT, r'C:\Python31\python', ('python', 'makewords.py'))
spam
eggs
ham
0
>>> os.spawnl(os.P_NOWAIT, r'C:\Python31\python', 'python', 'makewords.py')
1820
>>> spam
eggs
ham

The spawn calls are also much like forking programs in Unix. They don’t
actually copy the calling process (so shared descriptor operations won’t
work), but they can be used to start a program running completely
independent of the calling program, even on Windows. The script in
Example 5-35 makes the similarity to Unix programming patterns more
obvious. It launches a program with a fork/exec combination on Unix-like
platforms (including Cygwin), or an os.spawnv call on Windows.

Example 5-35. PP4E\System\Processes\spawnv.py

"""
start up 10 copies of child.py running in parallel;
use spawnv to launch a program on Windows (like fork+exec);
P_OVERLAY replaces, P_DETACH makes child stdout go nowhere;
or use portable subprocess or multiprocessing options today!
"""

import os, sys

for i in range(10):
    if sys.platform[:3] == 'win':
        pypath = sys.executable
        os.spawnv(os.P_NOWAIT, pypath, ('python', 'child.py', str(i)))
    else:
        pid = os.fork()
        if pid != 0:
            print('Process %d spawned' % pid)
        else:
            os.execlp('python', 'python', 'child.py', str(i))
print('Main process exiting.')

To make sense of these examples, you have to understand the arguments
being passed to the spawn calls. In this script, we call os.spawnv with
a process mode flag, the full directory path to the Python interpreter,
and a tuple of strings representing the shell command line with which to
start a new program. The path to the Python interpreter executable
program running a script is available as sys.executable. In general, the
process mode flag is taken from these predefined values:

os.P_NOWAIT and os.P_NOWAITO
    The spawn functions will return as soon as the new process has been
    created, with the process ID as the return value. Available on Unix
    and Windows.

os.P_WAIT
    The spawn functions will not return until the new process has run to
    completion and will return the exit code of the process if the run
    is successful or “-signal” if a signal kills the process. Available
    on Unix and Windows.

os.P_DETACH and os.P_OVERLAY
    P_DETACH is similar to P_NOWAIT, but the new process is detached
    from the console of the calling process. If P_OVERLAY is used, the
    current program will be replaced (much like os.exec). Available on
    Windows.
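The difference between the two portable modes can be seen in a small
sketch. The function names here are hypothetical, and the child programs
are tiny Python one-liners run with the interpreter's -c switch instead
of a script file, so that the sketch is self-contained. The one-liners
deliberately avoid spaces and quotes, to sidestep argument-quoting
differences between platforms.

```python
import os, sys

def spawn_and_wait(code):
    """Run a Python one-liner in a new process and return its exit code."""
    args = ('python', '-c', code)        # args[0] is the program's own name
    return os.spawnv(os.P_WAIT, sys.executable, args)

def spawn_no_wait(code):
    """Start a Python one-liner and return at once with the new process's
    ID (on Windows the value returned is a process handle)."""
    args = ('python', '-c', code)
    return os.spawnv(os.P_NOWAIT, sys.executable, args)

if __name__ == '__main__':
    print('exit code:', spawn_and_wait('exit(3)'))
    pid = spawn_no_wait('print(42)')
    print('started:', pid)
    os.waitpid(pid, 0)                   # collect the child once it ends
```

In P_WAIT mode the caller gets the child's exit status back; in P_NOWAIT
mode it gets an identifier it can later hand to os.waitpid.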

In fact, there are eight different calls in the spawn family, which all
start a program but vary slightly in their call signatures. In their
names, an “l” means you list arguments individually, “p” means the
executable file is looked up on the system path, and “e” means a
dictionary is passed in to provide the shelled environment of the
spawned program: the os.spawnve call, for example, works the same way as
os.spawnv but accepts an extra fourth dictionary argument to specify a
different shell environment for the spawned program (which, by default,
inherits all of the parent’s settings):

os.spawnl(mode, path, ...)
os.spawnle(mode, path, ..., env)
os.spawnlp(mode, file, ...) # Unix only
os.spawnlpe(mode, file, ..., env) # Unix only
os.spawnv(mode, path, args)
os.spawnve(mode, path, args, env)
os.spawnvp(mode, file, args) # Unix only
os.spawnvpe(mode, file, args, env) # Unix only
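As a rough illustration of the “e” forms, the sketch below passes a
modified copy of the parent's environment to a child through os.spawnve.
The function name, the temporary script, and the DEMO_LEVEL variable are
all hypothetical; the child simply reports the variable's value back as
its exit status.

```python
import os, sys, tempfile

# Child program: exit with the number found in DEMO_LEVEL (0 if unset)
CHILD = "import os, sys\nsys.exit(int(os.environ.get('DEMO_LEVEL', '0')))\n"

def spawn_with_env(extra):
    """Run the child with our environment plus the settings in extra,
    and return the child's exit code."""
    env = dict(os.environ)               # start from the parent's settings
    env.update(extra)                    # then add or override entries
    with tempfile.TemporaryDirectory() as folder:
        script = os.path.join(folder, 'envchild.py')
        with open(script, 'w') as file:
            file.write(CHILD)
        args = ('python', script)
        return os.spawnve(os.P_WAIT, sys.executable, args, env)

if __name__ == '__main__':
    print(spawn_with_env({'DEMO_LEVEL': '7'}))
```

Copying os.environ before updating it keeps settings the child may need
(such as the system path) while leaving the parent's own environment
untouched.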

Because these calls mimic the names and call signatures of the os.exec
variants, see earlier in this chapter for more details on the
differences between these call forms. Unlike the os.exec calls, only
half of the os.spawn forms (those without system path checking, and
hence without a “p” in their names) are currently implemented on
Windows. All the process mode flags are supported on Windows, but detach
and overlay modes are not available on Unix. Because this sort of detail
may be prone to change, to verify which are present, be sure to see the
library manual or run a dir built-in function call on the os module
after an import.
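One way to run that check from code rather than by reading dir output is
sketched here. The helper name available is hypothetical; the call and
flag names are the ones listed in this section.

```python
import os

SPAWN_CALLS = ('spawnl', 'spawnle', 'spawnlp', 'spawnlpe',
               'spawnv', 'spawnve', 'spawnvp', 'spawnvpe')
MODE_FLAGS = ('P_WAIT', 'P_NOWAIT', 'P_NOWAITO', 'P_DETACH', 'P_OVERLAY')

def available(names):
    """Return the subset of names that the os module defines here."""
    return [name for name in names if hasattr(os, name)]

if __name__ == '__main__':
    print('calls:', available(SPAWN_CALLS))
    print('flags:', available(MODE_FLAGS))
```

The lists printed depend on the platform and Python version in use, so
no particular output is shown.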

Here is the script in Example 5-35 at work on Windows, spawning 10
independent copies of the child.py Python program we met earlier in this
chapter:

C:\...\PP4E\System\Processes> type child.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])

C:\...\PP4E\System\Processes> python spawnv.py
Hello from child -583587 0
Hello from child -558199 2
Hello from child -586755 1
Hello from child -562171 3
Main process exiting.
Hello from child -581867 6
Hello from child -588651 5
Hello from child -568247 4
Hello from child -563527 7
Hello from child -543163 9
Hello from child -587083 8

Notice that the copies print their output in random order, and the
parent program exits before all children do; all of these programs are
really running in parallel on Windows. Also observe that the child
program’s output shows up in the console box where spawnv.py was run;
when using P_NOWAIT, standard output comes to the parent’s console, but
it seems to go nowhere when using P_DETACH (which is most likely a
feature when spawning GUI programs).

But having shown you this call, I need to again point out that both the
subprocess and multiprocessing modules offer more portable alternatives
for spawning programs with command lines today. In fact, unless os.spawn
calls provide unique behavior you can’t live without (e.g., control of
shell window pop ups on Windows), the platform-specific code of
Example 5-35 can be replaced altogether with the portable
multiprocessing code in Example 5-33.
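For comparison, the subprocess route might look something like the
following sketch, which is not one of the book's examples. The function
name start_children is hypothetical, and the child program is passed
inline with the interpreter's -c switch as a stand-in for child.py, so
that the sketch runs on its own.

```python
import subprocess, sys

# Inline stand-in for child.py
CHILD = 'import os, sys; print("Hello from child", os.getpid(), sys.argv[1])'

def start_children(count):
    """Start count independent child programs without waiting for them."""
    return [subprocess.Popen([sys.executable, '-c', CHILD, str(i)])
            for i in range(count)]

if __name__ == '__main__':
    children = start_children(10)
    print('All children started.')
    for child in children:           # optional: wait so all output appears
        child.wait()
```

The same code runs on both Windows and Unix-like systems, with no
platform test and no explicit fork or spawn call.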

The os.startfile call on Windows

Although os.spawn calls may be largely superfluous today, there are
other tools that can still make a strong case for themselves. For
instance, the os.system call can be used on Windows to launch a DOS
start command, which opens (i.e., runs) a file independently based on
its Windows filename associations, as though it were clicked.
os.startfile makes this even simpler in recent Python releases, and it
can avoid blocking its caller, unlike some other tools.

Using the DOS start command

To understand why, first you need to know how the DOS start command
works in general. Roughly, a DOS command line of the form start command
works as if command were typed in the Windows Run dialog box available
in the Start button menu. If command is a filename, it is opened exactly
as if its name was double-clicked in the Windows Explorer file selector
GUI.

For instance, the following three DOS commands automatically
start Internet Explorer, my registered image viewer program, and my
sound media player program on the files named in the commands. Windows
simply opens the file with whatever program is associated to handle
filenames of that form. Moreover, all three of these programs run
independently of the DOS console box where the command is
typed:

C:\...\PP4E\System\Media> start lp4e-preface-preview.html
C:\...\PP4E\System\Media> start ora-lp4e.jpg
C:\...\PP4E\System\Media> start sousa.au

Because the start command can run any file and command line, there is no
reason it cannot also be used to start an independently running Python
program:

C:\...\PP4E\System\Processes> start child.py 1

This works because Python is registered to open names ending in .py when
it is installed. The script child.py is launched independently of the
DOS console window even though we didn’t provide the name or path of the
Python interpreter program. Because child.py simply prints a message and
exits, though, the result isn’t exactly satisfying: a new DOS window
pops up to serve as the script’s standard output, and it immediately
goes away when the child exits. To do better, add an input call at the
bottom of the program file to wait for a key press before exiting:

C:\...\PP4E\System\Processes> type child-wait.py
import os, sys
print('Hello from child', os.getpid(), sys.argv[1])
input("Press <Enter>")               # don't flash on Windows

C:\...\PP4E\System\Processes> start child-wait.py 2

Now the child’s DOS window pops up and stays up after the start command
has returned. Pressing the Enter key in the pop-up DOS window makes it
go away.

Using start in Python scripts

Since we know that Python’s os.system and os.popen can be called by a
script to run any command line that can be typed at a DOS shell prompt,
we can also start independently running programs from a Python script by
simply running a DOS start command line. For instance:

C:\...\PP4E\System\Media> python
>>> import os
>>> cmd = 'start lp4e-preface-preview.html'       # start IE browser
>>> os.system(cmd)                                # runs independent
0

The Python os.system call here starts whatever web page browser is
registered on your machine to open .html files (unless it is already
running). The launched program runs completely independent of the Python
session: when running a DOS start command, os.system does not wait for
the spawned program to exit.

The os.startfile call

In fact, start is so useful that recent Python releases also include an
os.startfile call, which is essentially the same as spawning a DOS start
command with os.system and works as though the named file were
double-clicked. The following calls, for instance, have a similar
effect:

>>> os.startfile('lp-code-readme.txt')
>>> os.system('start lp-code-readme.txt')

Both pop up the text file in Notepad on my Windows computer. Unlike the
second of these calls, though, os.startfile provides no option to wait
for the application to close (the DOS start command’s /WAIT option does)
and no way to retrieve the application’s exit status (returned from
os.system).

On recent versions of Windows, the following has a similar effect, too,
because the registry is used at the command line (though this form
pauses until the file’s viewer is closed, like using start /WAIT):

>>> os.system('lp-code-readme.txt')               # 'start' is optional today

This is a convenient way to open arbitrary document and media files, but
keep in mind that the os.startfile call works only on Windows, because
it uses the Windows registry to know how to open a file. In fact, there
are even more obscure and nonportable ways to launch programs, including
Windows-specific options in the PyWin32 package, which we’ll finesse
here. If you want to be more platform neutral, consider using one of the
other many program launcher tools we’ve seen, such as os.popen or
os.spawnv. Or better yet, write a module to hide the details, as the
next and final section demonstrates.
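As a rough idea of what hiding such details behind one function can look
like, here is a minimal sketch. It is not the launcher module the next
section presents: the function names are hypothetical, and the use of
the open command on Mac OS X and xdg-open on Linux desktops is an
assumption about what is installed on those systems.

```python
import os, subprocess, sys

def open_command(filename, platform=sys.platform):
    """Return the command list that would open filename on the given
    platform, or None where os.startfile should be used (Windows)."""
    if platform.startswith('win'):
        return None
    if platform == 'darwin':
        return ['open', filename]        # assumed Mac OS X file opener
    return ['xdg-open', filename]        # assumed present on Linux desktops

def open_file(filename):
    """Open filename with its associated program, without waiting."""
    command = open_command(filename)
    if command is None:
        os.startfile(filename)           # defined on Windows only
    else:
        subprocess.Popen(command)
```

Keeping the platform decision in open_command makes it possible to check
the choice for any platform name without launching anything.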
