Programming Python

Authors: Mark Lutz

FTP get and put Utilities

When I present the ftplib interfaces in Python classes, students often
ask why programmers need to supply the RETR string in the retrieval
method. It’s a good question: the RETR string is the name of the
download command in the FTP protocol, but ftplib is supposed to
encapsulate that protocol. As we’ll see in a moment, we have to supply
an arguably odd STOR string for uploads as well. It’s boilerplate code
that you accept on faith once you see it, but that still leaves the
question unanswered. You could propose a patch to ftplib, but that’s
not really a good answer for beginning Python students, and it may
break existing code (the interface is as it is for a reason).

Perhaps a better answer is that Python makes it easy to extend the
standard library modules with higher-level interfaces of our own: with
just a few lines of reusable code, we can make the FTP interface look
any way we want in Python. For instance, we could, once and for all,
write utility modules that wrap the ftplib interfaces to hide the RETR
string. If we place these utility modules in a directory on PYTHONPATH,
they become just as accessible as ftplib itself, automatically reusable
in any Python script we write in the future. Besides removing the RETR
string requirement, a wrapper module could also make assumptions that
simplify FTP operations into single function calls.

For instance, given a module that encapsulates and simplifies ftplib,
our Python fetch-and-play script could be further reduced to the script
shown in Example 13-3: essentially just two function calls plus a
password prompt, but with a net effect exactly like Example 13-1 when
run.

Example 13-3. PP4E\Internet\Ftp\getone-modular.py

#!/usr/local/bin/python
"""
A Python script to download and play a media file by FTP.
Uses getfile.py, a utility module which encapsulates FTP step.
"""

import getfile
from getpass import getpass
filename = 'monkeys.jpg'

# fetch with utility
getfile.getfile(file=filename,
                site='ftp.rmi.net',
                dir ='.',
                user=('lutz', getpass('Pswd?')),
                refetch=True)

# rest is the same
if input('Open file?') in ['Y', 'y']:
    from PP4E.System.Media.playfile import playfile
    playfile(filename)

Besides having a much smaller line count, the meat of this script
has been split off into a file for reuse elsewhere. If you ever need to
download a file again, simply import an existing function instead of
copying code with cut-and-paste editing. Changes in download operations
would need to be made in only one file, not everywhere we’ve copied
boilerplate code; getfile.getfile could even be changed to use urllib
rather than ftplib without affecting any of its clients. It’s good
engineering.
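
To make the urllib remark concrete, here is a minimal sketch of what
such a swap might look like. It is not code from this book's examples
package: the ftp_url helper is a hypothetical name, and the sketch
assumes the server accepts ftp:// URLs whose path is relative to the
login directory. Only the call signature is taken from the getfile
function shown in Example 13-4.

```python
import shutil
import posixpath
from os.path import exists
from urllib.parse import quote
from urllib.request import urlopen

def ftp_url(file, site, dir='.', user=()):
    """Build an ftp:// URL (hypothetical helper); path is relative to the login directory."""
    login = ''
    if user:                                   # (name, pswd) tuple, else anonymous
        name, pswd = user
        login = '%s:%s@' % (quote(name, safe=''), quote(pswd, safe=''))
    path = posixpath.normpath(posixpath.join('/', dir, file))
    return 'ftp://%s%s%s' % (login, site, quote(path))

def getfile(file, site, dir, user=(), *, verbose=True, refetch=False):
    """Same call signature as the ftplib version, but fetched through urllib."""
    if exists(file) and not refetch:
        if verbose: print(file, 'already fetched')
    else:
        if verbose: print('Downloading', file)
        with urlopen(ftp_url(file, site, dir, user)) as remote:
            with open(file, 'wb') as local:    # binary mode, byte for byte
                shutil.copyfileobj(remote, local)
        if verbose: print('Download done.')
```

Because the function name and arguments are unchanged, a client such as
Example 13-3 would not need to be edited to use a version like this.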

Download utility

So just how would we go about writing such an FTP interface wrapper
(he asks, rhetorically)? Given the ftplib library module, wrapping
downloads of a particular file in a particular directory is
straightforward. Connected FTP objects support two download methods:

retrbinary

This method downloads the requested file in binary mode, sending its
bytes in chunks to a supplied function, without line-feed mapping.
Typically, the supplied function is a write method of an open local
file object, such that the bytes are placed in the local file on the
client.

retrlines

This method downloads the requested file in ASCII text mode, sending
each line of text to a supplied function with all end-of-line
characters stripped. Typically, the supplied function adds a \n newline
(mapped appropriately for the client machine), and writes the line to a
local file.
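
The difference between the two callbacks can be sketched as follows.
This is only an illustration: fetch_binary and fetch_text are
hypothetical helper names, both expect an FTP object that is already
connected and logged in, and the utf-8 default encoding is an
assumption.

```python
from ftplib import FTP     # needed only for the usage shown in the comments below

def fetch_binary(remote, name, localpath, blocksize=1024):
    """retrbinary passes raw bytes chunks straight to the file's write method."""
    with open(localpath, 'wb') as local:
        remote.retrbinary('RETR ' + name, local.write, blocksize)

def fetch_text(remote, name, localpath, encoding='utf-8'):
    """retrlines passes str lines without line ends, so the callback adds a newline."""
    with open(localpath, 'w', encoding=encoding) as local:
        remote.retrlines('RETR ' + name, lambda line: local.write(line + '\n'))

# Possible usage, given a reachable server (site name is a placeholder):
#   remote = FTP('ftp.example.com'); remote.login()
#   fetch_binary(remote, 'monkeys.jpg', 'monkeys.jpg')
#   fetch_text(remote, 'README', 'README')
#   remote.quit()
```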

We will meet the retrlines method in a later example; the getfile
utility module in Example 13-4 always transfers in binary mode with
retrbinary. That is, files are downloaded exactly as they were on the
server, byte for byte, with the server’s line-feed conventions in text
files (you may need to convert line feeds after downloads if they look
odd in your text editor; see your editor or system shell commands for
pointers, or write a Python script that opens and writes the text as
needed).
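
A line-feed conversion script of the kind just mentioned might look
like the following sketch. The function names are hypothetical, and the
code should be run on text files only, because rewriting these bytes in
an image or audio file would corrupt it.

```python
def convert_line_ends(data, newline=b'\n'):
    """Return data with every CRLF, lone CR, or lone LF replaced by newline."""
    unified = data.replace(b'\r\n', b'\n').replace(b'\r', b'\n')
    return unified.replace(b'\n', newline)

def convert_file(path, newline=b'\n'):
    """Rewrite a downloaded text file in place with the requested line ends."""
    with open(path, 'rb') as file:             # binary mode: no automatic mapping
        data = file.read()
    with open(path, 'wb') as file:
        file.write(convert_line_ends(data, newline))
```

For example, convert_file('notes.txt', b'\r\n') would give a file
fetched from a Unix server Windows-style line ends.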

Example 13-4. PP4E\Internet\Ftp\getfile.py

#!/usr/local/bin/python
"""
Fetch an arbitrary file by FTP. Anonymous FTP unless you pass a
user=(name, pswd) tuple. Self-test FTPs a test file and site.
"""

from ftplib import FTP              # socket-based FTP tools
from os.path import exists          # file existence test

def getfile(file, site, dir, user=(), *, verbose=True, refetch=False):
    """
    fetch a file by ftp from a site/directory
    anonymous or real login, binary transfer
    """
    if exists(file) and not refetch:
        if verbose: print(file, 'already fetched')
    else:
        if verbose: print('Downloading', file)
        local = open(file, 'wb')                # local file of same name
        try:
            remote = FTP(site)                  # connect to FTP site
            remote.login(*user)                 # anonymous=() or (name, pswd)
            remote.cwd(dir)
            remote.retrbinary('RETR ' + file, local.write, 1024)
            remote.quit()
        finally:
            local.close()                       # close file no matter what
        if verbose: print('Download done.')     # caller handles exceptions

if __name__ == '__main__':
    from getpass import getpass
    file = 'monkeys.jpg'
    dir  = '.'
    site = 'ftp.rmi.net'
    user = ('lutz', getpass('Pswd?'))
    getfile(file, site, dir, user)

This module is mostly just a repackaging of the FTP code we used
to fetch the image file earlier, to make it simpler and reusable.
Because it is a callable function, the exported getfile.getfile here
tries to be as robust and generally useful as possible, but even a
function this small implies some design decisions. Here are a few usage
notes:

FTP mode

The getfile function in this script runs in anonymous FTP mode by
default, but a two-item tuple containing a username and password string
may be passed to the user argument in order to log in to the remote
server in nonanonymous mode. To use anonymous FTP, either don’t pass
the user argument or pass it an empty tuple, (). The FTP object login
method allows two optional arguments to denote a username and password,
and the function(*args) call syntax in Example 13-4 sends it whatever
argument tuple you pass to user as individual arguments.
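
The effect of the function(*args) call syntax can be seen in isolation
with a stand-in for the login method. The login function below is a
hypothetical substitute that only reports what it received; it is not
the ftplib implementation.

```python
def login(user='', passwd=''):
    """Stand-in for FTP.login (hypothetical): returns the values it was given."""
    return (user or 'anonymous', passwd)

def connect_as(user=()):
    """Unpack the user tuple into zero or two positional arguments."""
    return login(*user)

anonymous = connect_as()                       # same as calling login()
named = connect_as(('lutz', 'secret'))         # same as login('lutz', 'secret')
print(anonymous, named)
```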

Processing modes

If passed, the last two arguments (verbose, refetch) allow us to turn
off status messages printed to the stdout stream (perhaps undesirable
in a GUI context) and to force downloads to happen even if the file
already exists locally (the download overwrites the existing local
file).

These two arguments are coded as Python 3.X keyword-only arguments with
defaults, so if used they must be passed by name, not position. The
user argument instead can be passed either way, if it is passed at all.
Keyword-only arguments here prevent passed verbose or refetch values
from being incorrectly matched against the user argument if the user
value is actually omitted in a call.
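
A short experiment shows the keyword-only rule at work. The function
header is the one from Example 13-4, but the body here is a placeholder
that just echoes its arguments instead of contacting a server.

```python
def getfile(file, site, dir, user=(), *, verbose=True, refetch=False):
    """Header as in Example 13-4; placeholder body for demonstration only."""
    return dict(user=user, verbose=verbose, refetch=refetch)

# By name: works, and user keeps its default even though it is omitted
ok = getfile('monkeys.jpg', 'ftp.rmi.net', '.', refetch=True)

# By position: rejected, because nothing after * can be passed positionally
try:
    getfile('monkeys.jpg', 'ftp.rmi.net', '.', (), False, True)
except TypeError as exc:
    error = str(exc)

print(ok)
print(error)
```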

Exception protocol

The caller is expected to handle exceptions; this function wraps
downloads in a try/finally statement to guarantee that the local output
file is closed, but it lets exceptions propagate. If used in a GUI or
run from a thread, for instance, exceptions may require special
handling unknown in this file.
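
As one possible way for a caller to meet this expectation, the sketch
below wraps any download function and turns failures into a status
value. The try_download name is hypothetical; ftplib.all_errors is the
standard library's tuple of exceptions that FTP operations may raise.

```python
import ftplib

def try_download(download, *args, **kwargs):
    """Run a download callable; return (True, None) or (False, error text)."""
    try:
        download(*args, **kwargs)
    except ftplib.all_errors as exc:           # FTP errors plus OSError and EOFError
        return False, '%s: %s' % (type(exc).__name__, exc)
    return True, None

# Possible usage with the utility module of Example 13-4:
#   import getfile
#   ok, why = try_download(getfile.getfile, 'monkeys.jpg', 'ftp.rmi.net', '.')
#   if not ok: print('Download failed:', why)
```

A GUI might display the error text in a pop-up instead of printing it,
which is exactly the kind of decision the utility leaves to its caller.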

Self-test

If run standalone, this file downloads an image file again
from my website as a self-test (configure for your server and
file as desired), but the function will normally be passed FTP
filenames, site names, and directory names as well.

File mode

As in earlier examples, this script is careful to open the local output
file in wb binary mode to suppress end-line mapping and conform to
Python 3.X’s Unicode string model. As we learned in Chapter 4, it’s not
impossible that true binary datafiles may have bytes whose value is
equal to a \n line-feed character; opening in w text mode instead would
make these bytes automatically expand to a \r\n two-byte sequence when
written locally on Windows. This is only an issue when run on Windows;
mode w doesn’t change end-lines elsewhere.

As we also learned in Chapter 4, though, binary mode is required to
suppress the automatic Unicode translations performed for text in
Python 3.X. Without binary mode, Python would attempt to encode fetched
data when written per a default or passed Unicode encoding scheme,
which might fail for some types of fetched text and would normally fail
for truly binary data such as images and audio.

Because retrbinary writes bytes strings in 3.X, we really cannot open
the output file in text mode anyhow, or write will raise exceptions.
Recall that in 3.X text-mode files require str strings, and binary mode
files expect bytes. Since retrbinary writes bytes and retrlines writes
str, they implicitly require binary and text-mode output files,
respectively. This constraint is irrespective of end-line or Unicode
issues, but it effectively accomplishes those goals as well.

As we’ll see in later examples, text-mode retrievals have additional
encoding requirements; in fact, ftplib will turn out to be a good
example of the impacts of Python 3.X’s Unicode string model on
real-world code. By always using binary mode in the script here, we
sidestep the issue altogether.
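
The bytes-versus-str constraint is easy to verify without any FTP
server at all. In this small experiment, the chunk value is made-up
sample data standing in for what a retrbinary callback would receive.

```python
import os, tempfile

chunk = b'\x89PNG\r\n\x1a\n'                   # sample bytes, as retrbinary would pass
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')

with open(path, 'w') as textfile:              # text mode expects str
    try:
        textfile.write(chunk)
    except TypeError as exc:
        text_error = str(exc)

with open(path, 'wb') as binfile:              # binary mode accepts bytes as is
    binfile.write(chunk)

with open(path, 'rb') as binfile:
    roundtrip = binfile.read()

print(text_error)
print(roundtrip == chunk)
```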

Directory model

This function currently uses the same filename to identify both the
remote file and the local file where the download should be stored. As
such, it should be run in the directory where you want the file to show
up; use os.chdir to move to directories if needed. (We could instead
assume filename is the local file’s name, and strip the local directory
with os.path.split to get the remote name, or accept two distinct
filename arguments: local and remote.)
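
The first alternative mentioned in the parenthetical might be sketched
as follows. The split_names helper is a hypothetical name, and nothing
like it is used by the utility as coded.

```python
import os.path

def split_names(localpath):
    """Return (local path, remote name): the remote name drops the local directory."""
    remotename = os.path.split(localpath)[1]
    return localpath, remotename

local, remote = split_names(os.path.join('downloads', 'monkeys.jpg'))
print(local, remote)

# A download function could then open(local, 'wb') for output but
# send 'RETR ' + remote to the server.
```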

Also notice that, despite its name, this module is very different from
the getfile.py script we studied at the end of the sockets material in
the preceding chapter. The socket-based getfile implemented custom
client and server-side logic to download a server file to a client
machine over raw sockets.

The new getfile here is a client-side tool only. Instead of raw
sockets, it uses the standard FTP protocol to request a file from a
server; all socket-level details are hidden in the simpler ftplib
module’s implementation of the FTP client protocol. Furthermore, the
server here is a perpetually running program on the server machine,
which listens for and responds to FTP requests on a socket, on the
dedicated FTP port (number 21). The net functional effect is that this
script requires an FTP server to be running on the machine where the
desired file lives, but such a server is much more likely to be
available.

Upload utility

While we’re at it, let’s write a script to upload a single file
by FTP to a remote machine. The upload interfaces in the FTP module
are symmetric with the download interfaces. Given a connected FTP
object, its:

  • storbinary method can be used to upload bytes from an open local
    file object

  • storlines method can be used to upload text in ASCII mode from an
    open local file object

Unlike the download interfaces, both of these methods are passed a file
object as a whole, not a file object method (or other function). We
will meet the storlines method in a later example. The utility module
in Example 13-5 uses storbinary such that the file whose name is passed
in is always uploaded verbatim: in binary mode, without Unicode
encodings or line-feed translations for the target machine’s
conventions. If this script uploads a text file, it will arrive exactly
as stored on the machine it came from, with client line-feed markers
and existing Unicode encoding.

Example 13-5. PP4E\Internet\Ftp\putfile.py

#!/usr/local/bin/python
"""
Store an arbitrary file by FTP in binary mode. Uses anonymous
ftp unless you pass in a user=(name, pswd) tuple of arguments.
"""

import ftplib                       # socket-based FTP tools

def putfile(file, site, dir, user=(), *, verbose=True):
    """
    store a file by ftp to a site/directory
    anonymous or real login, binary transfer
    """
    if verbose: print('Uploading', file)
    local = open(file, 'rb')                # local file of same name
    remote = ftplib.FTP(site)               # connect to FTP site
    remote.login(*user)                     # anonymous or real login
    remote.cwd(dir)
    remote.storbinary('STOR ' + file, local, 1024)
    remote.quit()
    local.close()
    if verbose: print('Upload done.')

if __name__ == '__main__':
    site = 'ftp.rmi.net'
    dir  = '.'
    import sys, getpass
    pswd = getpass.getpass(site + ' pswd?')                # filename on cmdline
    putfile(sys.argv[1], site, dir, user=('lutz', pswd))   # nonanonymous login

Notice that for portability, the local file is opened in rb binary
input mode this time to suppress automatic line-feed character
conversions. If this is binary information, we don’t want any bytes
that happen to have the value of the \r carriage-return character to
mysteriously go away during the transfer when run on a Windows client.
We also want to suppress Unicode encodings for nontext files, and we
want reads to produce the bytes strings expected by the storbinary
upload operation (more on input file modes later).
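
Unlike the download utility, the upload function closes its local file
only when the transfer succeeds. A variant that closes both the file
and the connection on errors too might be coded as in the following
sketch. The putfile_safe name and its connect argument are
hypothetical: connect exists only so that a stand-in class can replace
ftplib.FTP when experimenting without a server.

```python
import ftplib

def putfile_safe(file, site, dir, user=(), *, verbose=True, connect=ftplib.FTP):
    """
    store a file by ftp in binary mode, like putfile, but release the
    local file and the connection even if the transfer fails
    """
    if verbose: print('Uploading', file)
    with open(file, 'rb') as local, connect(site) as remote:
        remote.login(*user)                    # anonymous or real login
        remote.cwd(dir)
        remote.storbinary('STOR ' + file, local, 1024)
    if verbose: print('Upload done.')          # caller still handles exceptions
```

The with statement relies on FTP objects supporting the context manager
protocol, which sends the quit command on exit in recent Python 3.X
releases.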

This script uploads a file you name on the command line as a self-test,
but you will normally pass in real remote filename, site name, and
directory name strings. Also like the download utility, you may pass a
(username, password) tuple to the user argument to trigger nonanonymous
FTP mode (anonymous FTP is the default).
