Programming Python (132 page)

Authors: Mark Lutz

Running PyMailGUI

Of course, to script PyMailGUI on your own, you’ll need to be able to
run it. PyMailGUI requires only a computer with some sort of Internet
connectivity (a PC with a broadband or dial-up account will do) and an
installed Python with the tkinter extension enabled. The Windows port of
Python has this capability, so Windows PC users should be able to run
this program immediately by clicking its icon.

Two notes on running the system: first, you’ll want to change the
file mailconfig.py in the program’s source directory to reflect your
account’s parameters, if you wish to send or receive mail from a live
server; more on this as we interact with the system ahead.

Second, you can still experiment with the system without a live
Internet connection. For a quick look at message view windows, use the
main window’s Open buttons to open saved-mail files included in the
program’s SavedMail subdirectory. The PyDemos launcher script at the
top of the book’s examples directory, for example, forces PyMailGUI to
open saved-mail files by passing filenames on the command line.
Although you’ll probably want to connect to your email servers
eventually, viewing saved mails offline is enough to sample the
system’s flavor and does not require any configuration file changes.

Presentation Strategy

PyMailGUI is easily the largest program in this book, but it doesn’t
introduce many library interfaces that we haven’t already seen in this
book. For instance:

  • The PyMailGUI interface is built with Python’s tkinter, using
    the familiar listboxes, buttons, and text widgets we met
    earlier.

  • Python’s email package is applied to pull out headers, text, and
    attachments of messages, and to compose the same.

  • Python’s POP and SMTP library modules are used to fetch, send,
    and delete mail over sockets.

  • Python threads, if installed in your Python interpreter, are
    put to work to avoid blocking during potentially overlapping,
    long-running mail operations.

We’re also going to reuse the PyEdit TextEditor object we wrote in
Chapter 11 to view and compose messages and to pop up raw text,
attachments, and source; the mailtools package’s tools we wrote in
Chapter 13 to load, send, and delete mail with a server; and the
mailconfig module strategy introduced in Chapter 13 to support
end-user settings. PyMailGUI is largely an exercise in combining
existing tools.

On the other hand, because this program is so long, we won’t
exhaustively document all of its code. Instead, we’ll begin with a quick
look at how PyMailGUI has evolved, and then move on to describing how it
works today from an end user’s perspective—a brief demo of its windows
in action. After that, we’ll list the system’s new source code modules
without many additional comments, for further study.

Like most of the longer case studies in this book, this section
assumes that you already know enough Python to make sense of the code on
your own. If you’ve been reading this book linearly, you should also
know enough about tkinter, threads, and mail interfaces to understand
the library tools applied here. If you get stuck, you may wish to brush
up on the presentation of these topics earlier in the book.

[54] And remember: you would have to multiply these line counts
by a factor of four or more to get the equivalent in a language
like C or C++. If you’ve done much programming, you probably
recognize that the fact that we can implement a fairly
full-featured mail processing program in roughly 5,000 total lines
of program code speaks volumes about the power of the Python
language and its libraries. For comparison, the original 1.0
version of this program from the second edition of this book was
just 745 total lines in 3 new modules, but it also was very
limited—it did not support PyMailGUI 2.X’s attachments, thread
overlap, local mail files, and so on, and did not have the
Internationalization support or other features of this edition’s
PyMailGUI 3.X.

[55] In fact, my ISP’s webmail send system went down the very day I
had to submit the third edition of this book to my publisher! No
worries—I fired up PyMailGUI and used it to send the book as
attachment files through a different server. In a sense, this book
submitted itself.

Major PyMailGUI Changes

Like the PyEdit text editor of Chapter 11, PyMailGUI serves as a good
example of software evolution in action. Because its revisions help
document this system’s functionality, and because this example is as
much about software engineering as about Python itself, let’s take a
quick look at its recent changes.

New in Version 2.1 and 2.0 (Third Edition)

The 2.1 version of PyMailGUI presented in the third edition of the
book in early 2006 is still largely present and current in this fourth
edition in 2010. Version 2.1 added a handful of enhancements to version
2.0, and version 2.0 was a complete rewrite of the 1.0 version of the
second edition with a radically expanded feature set.

In fact, the second edition’s version 1.0 of this program written
in early 2000 was only some 685 total program lines long (515 lines for
the GUI main script and 170 lines in an email utilities module), not
counting related examples reused, and just 60 lines in its help text
module. Version 1.0 was really something of a prototype (if not toy),
written mostly to serve as a short book example.

Although it did not yet support Internationalized mail content or
other 3.0 extensions, in the third edition, PyMailGUI 2.1 became a much
more realistic and feature-rich program that could be used for
day-to-day email processing. It grew by nearly a factor of three to be
1,800 new program source lines (plus 1,700 program lines in related
modules reused, and 500 additional lines of help text). By comparison,
version 3.0 by itself grew only by some 30% to be 2,400 new program
source lines as described earlier (plus 2,500 lines in related modules,
and 1,700 lines of help text). Statistically minded readers: consult
the file linecounts-prior-version.xls in PyMailGUI’s media subdirectory
for a line-count breakdown for version 2.1 by file.

In version 2.1, among PyMailGUI’s new weapons were (and still are)
these:

  • MIME multipart mails with attachments may be both viewed and
    composed.

  • Mail transfers are no longer blocking, and may overlap in
    time.

  • Mail may be saved and processed offline from a local
    file.

  • Message parts may now be opened automatically within the
    GUI.

  • Multiple messages may be selected for processing in list
    windows.

  • Initial downloads fetch mail headers only; full mails are
    fetched on request.

  • View window headers and list window columns are
    configurable.

  • Deletions are performed immediately, not delayed until program
    exit.

  • Most server transfers report their progress in the GUI.

  • Long lines are intelligently wrapped in viewed and quoted
    text.

  • Fonts and colors in list and view windows may be configured by
    the user.

  • Authenticating SMTP mail-send servers that require login are
    supported.

  • Sent messages are saved in a local file, which may be opened
    in the GUI.

  • View windows intelligently pick a main text part to be
    displayed.

  • Already fetched mail headers and full mails are cached for
    speed.

  • Date strings and addresses in composed mails are formatted
    properly.

  • View windows now have quick-access buttons for
    attachments/parts (2.1).

  • Inbox out-of-sync errors are detected on deletes, and on index
    and mail loads (2.1).

  • Save-mail file loads and deletes are threaded, to avoid pauses
    for large files (2.1).

The last three items on this list were added in version 2.1; the
rest were part of the 2.0 rewrite. Some of these changes were made
simple by growth in standard library tools (e.g., support for
attachments is straightforward with the new email package), but most
represented changes in PyMailGUI itself. There were also a few genuine
fixes: addresses were parsed more accurately, and date and time formats
in sent mails became standards conforming, because these tasks used new
tools in the email package.

New in Version 3.0 (Fourth Edition)

PyMailGUI version 3.0, presented in this fourth edition of this
book, inherits all of 2.1’s upgrades described in the prior section and
adds many of its own. Changes are perhaps less dramatic in version 3.0,
though some address important usability issues, and they seem
collectively sufficient to justify assigning this version a new major
release number. Here’s a summary of what’s new this time around:

Python 3.X port

The code was updated to run under Python 3.X only; Python
2.X is no longer supported without code changes. Although some of
the task of porting to Python 3.X requires only minor coding
changes, other idiomatic implications are more far reaching.
Python 3.X’s new Unicode focus, for example, motivated much of the
Internationalization support in this version of PyMailGUI
(discussed ahead).

Layout improvements

View window forms are laid out with gridding instead of
packed column frames, for better appearance and platform
neutrality of email headers (see Chapter 9 for more details on form
layout). In addition, list window toolbars are now arranged with
expanding separators for clarity; this effectively groups buttons
by their roles and scope. List windows are also larger when
initially opened to show more.

Text editor fix for Tk change

Both the embedded text editor and some text editor instances
popped up on demand are now forcibly updated before new text is
inserted, for accurate initial positioning at line 1. See PyEdit
in Chapter 11 for more on this requirement; it stems from a recent
change (bug?) in either Tk or tkinter.

Text editor upgrades inherited

Because the PyEdit program is reused in multiple roles here,
this version of PyMailGUI also acquires all its latest fixes by
proxy. Most prominently, these include a new Grep external files
search dialog and support for displaying, opening, and saving
Unicode text. See Chapter 11 for details.

Workaround for Python 3.1 bug on traceback prints

In the obscure-but-all-too-typical category: the common
function in SharedNames.py that prints traceback details had to be
changed to work correctly under Python 3.X. The traceback module’s
print_tb function can no longer print a stack trace to sys.stdout if
the calling program is spawned from another on Windows; it still
can as before if the caller was run normally from a shell prompt.
Since this function is called from the main thread on worker
thread exceptions, if it is allowed to fail, any printed error kills
the GUI entirely when it is spawned from the gadget or demo
launchers.

To work around this, the function now catches exceptions
when print_tb is called and in response runs it again with a real
file instead of sys.stdout. This appears to be a Python
3.X regression, as the same code worked correctly in both contexts
in Python 2.5 and 2.6. Unlike some similar issues, it has nothing
to do with printing Unicode, as stack traces are all ASCII text.
Even more baffling, directly printing to stdout in the same
function works fine. Hey, if it were easy, they wouldn’t call it
“work.”
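
The catch-and-retry idea can be sketched roughly as follows. This is
an illustration only, not the code in SharedNames.py: the function
name print_traceback_safely and the fallback filename are hypothetical.

```python
import sys
import traceback

def print_traceback_safely(exc_info, fallback_path="traceback-log.txt"):
    """Print a stack trace to stdout; if that fails, retry with a real file.

    Hypothetical helper: the name and fallback_path are assumptions made
    for this sketch, not names taken from PyMailGUI.
    """
    exc_tb = exc_info[2]
    try:
        traceback.print_tb(exc_tb, file=sys.stdout)
    except Exception:
        # stdout was unusable (e.g., program spawned without a console),
        # so write the same stack trace to an ordinary file instead
        with open(fallback_path, "w") as log:
            traceback.print_tb(exc_tb, file=log)
```

A caller would pass in the result of sys.exc_info() from inside an
except block; the GUI then survives even when the first print fails.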

Bcc addresses added to envelope but header omitted

Minor change: addresses entered in the user-selectable Bcc
header line of edit windows are included in the recipients list
(the “envelope”), but the Bcc header line itself is no longer
included in the message text sent. Otherwise, Bcc recipients might
be seen by some email readers and clients (including PyMailGUI),
which defeats most of this header’s purpose.
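
As a minimal sketch of the envelope-versus-header distinction (not
PyMailGUI’s own code), the helper below builds a message with no Bcc
header while still returning the Bcc addresses in the recipients list.
The function name and addresses are hypothetical, and the sketch uses
the standard library’s EmailMessage class, which is newer than the
Python 3.1 email package this chapter targets.

```python
from email.message import EmailMessage

def build_message_and_envelope(sender, to, cc, bcc, subject, body):
    """Return (message, envelope recipients); Bcc appears only in the latter."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(to)
    if cc:
        msg["Cc"] = ", ".join(cc)
    msg["Subject"] = subject
    msg.set_content(body)
    # Deliberately no msg["Bcc"]: recipients must not see these addresses.
    envelope = list(to) + list(cc) + list(bcc)
    return msg, envelope

# The envelope list is what would be passed as to_addrs in a call like
# smtplib.SMTP.sendmail(from_addr, to_addrs, msg.as_string()).
```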

Avoiding parallel fetches of the same mail

PyMailGUI loads only mail headers initially, and fetches a
mail’s full text later when needed for viewing and other
operations, allowing multiple fetches to overlap in time (they are
run in parallel threads). Though unlikely, it was not impossible
for a user to trigger a new fetch for a mail that was currently
being fetched, by selecting the mail again during its download
(clicking its list entry twice quickly sufficed to kick this off).
Although the message cache updates performed in the parallel fetch
threads appeared to be thread safe, this behavior seemed odd and
wasted time.

To do better, this version now keeps track of all fetches in
progress in the main thread, to avoid this overlap potential
entirely—a message fetch in progress disables all new fetch
requests that it is a part of, until its fetch completes. Multiple
overlapping fetches are still allowed, as long as their targets do
not intersect. A set is used to detect nondisjoint fetch requests.
Mails already fetched and cached are not subject to this check and
can always be selected irrespective of any fetches in
progress.
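
The set-based check might look something like the following sketch.
The class and method names are hypothetical, and the real program
also skips this check for mails already in its cache; here the caller
is assumed to pass only uncached message numbers.

```python
class FetchTracker:
    """Track message numbers being fetched (main-thread use only)."""

    def __init__(self):
        self.in_progress = set()

    def try_start(self, msgnums):
        """Register a fetch unless it overlaps one already running."""
        wanted = set(msgnums)
        if not wanted.isdisjoint(self.in_progress):
            return False              # nondisjoint: refuse this request
        self.in_progress |= wanted
        return True

    def finish(self, msgnums):
        """Call when a fetch thread exits, successfully or not."""
        self.in_progress -= set(msgnums)
```

Because only the main thread updates the set in this design, no lock
is needed around it.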

Multiple recipients separated in GUI by commas, not semicolons

In the prior edition, “;” was used as the recipient separator
character, and addresses were naively split on “;” on a send. This
attempted to avoid conflicts with “,” commonly used in email
names. Replies dropped the name part if it contained a “;” when
extracting a To address, but it was not impossible that clashes
could still arise if a “;” appeared both as the separator and in
a manually typed address’s name.

To improve, this edition uses “,” as the recipient
separator, and fully parses email address lists with the
email package’s getaddresses and parseaddr tools, instead of
splitting naively. Because these tools fully parse the list’s
content, “,” characters embedded in email address name parts are not
mistakenly taken as address separators, and so do not clash. Servers
and clients generally expect “,” separators, too, so this works
naturally.

With this fix, commas can appear both as address separators
and embedded in address name components. For replies, this
is handled automatically: the To field is prefilled with the From
in the original message. For sends, the split happens
automatically in email tools for To, Cc, and Bcc header fields
(the latter two are ignored if they contain just the initial “?”
when sent).
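
To see why full parsing matters, here is a small sketch using the
standard email.utils tools named above. The helper split_recipients
and the sample addresses are made up for illustration.

```python
from email.utils import getaddresses, parseaddr

def split_recipients(field):
    """Return the bare addresses in a comma-separated header value."""
    return [addr for (_name, addr) in getaddresses([field]) if addr]

field = '"Smith, Bob" <bob@example.com>, sue@example.com'

naive = field.split(",")            # wrongly splits inside the quoted name
parsed = split_recipients(field)    # honors the quoted "Smith, Bob"
name, addr = parseaddr('"Smith, Bob" <bob@example.com>')
```

The naive split yields three fragments for two recipients, while the
parsed result keeps the comma inside the quoted name intact.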

HTML help display

Help can now be displayed in text form in a GUI window, in
HTML form in a locally running web browser, or both. User settings
in the mailconfig module select which form or forms to display. The
HTML version is new; it uses a simple translation of the help text,
with added links to sections and external sites, and opens a browser
with Python’s webbrowser module, discussed earlier in this book. The
text help display is now redundant, but it is retained because the
HTML display currently lacks its ability to open source file
viewers.

Thread callback queue speedup

The global thread queue dispatches GUI update callbacks much
faster now: up to 100 times per second, instead of the prior 10.
This is due both to checking more frequently (20 timer events per
second versus 10) and to dispatching more callbacks per timer
event (5 versus the prior 1). Depending on the interleaving of
queue puts and gets, this speeds up initial loads for larger
mailboxes by as much as an order of magnitude (factor of 10), at
some potential minor cost in CPU utilization. On my Windows 7
laptop, though, PyMailGUI still shows 0% CPU utilization in Task
Manager when idle.

I bumped up the queue’s speed to support an email account
having 4,800 inbox messages (actually, even more by the time I got
around to taking screenshots for this chapter). Without the
speedup, initial header loads for this account took 8 minutes to
work through the 4,800 progress callbacks (4800 ÷ 10 ÷ 60), even
though most reflected messages skipped immediately by the new mail
fetch limits (see the next item). With the speedup, the initial
load takes just 48 seconds—perhaps not ideal still, but this
initial headers load is normally performed only once per session,
and this policy strikes a balance between CPU resources and
responsiveness. This email account is an arguably pathological
case, of course, but most initial loads benefit from the faster
speed.

See Chapter 10’s threadtools for most of this change’s
code, as well as additional background details. We could
alternatively loop through all queued events on each timer event,
but this may block the GUI indefinitely if updates are queued
quickly.
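
The bounded-dispatch policy described in this item can be sketched
without a GUI as follows. This is not the threadtools code from
Chapter 10; the names are hypothetical, and in a tkinter program the
dispatch function would be rescheduled with a widget’s after method
every 50 milliseconds (1000 ÷ 20).

```python
import queue

TIMER_EVENTS_PER_SEC = 20      # how often the queue is checked
CALLBACKS_PER_EVENT = 5        # upper bound on callbacks run per check

def dispatch_pending(callback_queue, per_event=CALLBACKS_PER_EVENT):
    """Run at most per_event queued (callback, args) pairs; return the count."""
    ran = 0
    for _ in range(per_event):
        try:
            callback, args = callback_queue.get_nowait()
        except queue.Empty:
            break                  # nothing left: give control back to the GUI
        callback(*args)
        ran += 1
    return ran

def seconds_to_drain(num_callbacks, per_second):
    """Best-case time to work through a backlog at a given dispatch rate."""
    return num_callbacks / per_second

max_rate = TIMER_EVENTS_PER_SEC * CALLBACKS_PER_EVENT   # 100 per second
```

Capping the work per timer event is what keeps the GUI responsive: a
producer thread that queues updates quickly can never hold the event
loop for more than a handful of callbacks at a time.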

Mail fetch limits

Since 2.1, PyMailGUI loads only mail headers initially, not
full mail text, and only loads newly arrived headers thereafter.
Depending on your Internet and server speeds, though, this may
still be impractical for very large inboxes (as mentioned, one of
mine currently has some 4,800 emails). To support such cases, a
new mailconfig setting can be used to limit the number of headers
(or full mails if TOP is unsupported) fetched on loads.

Given this setting N, PyMailGUI fetches at most N of the
most recently arrived mails. Older mails outside this set are not
fetched from the server, but are displayed as empty/dummy emails
which are mostly inoperative (though they can generally still be
fetched on demand).
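
As an illustrative sketch of this policy (not the mailtools code
itself), the function below decides which message numbers to fetch
for a limit N. It assumes POP-style numbering from 1 to the message
count in which higher numbers are more recent arrivals; the function
and parameter names are hypothetical.

```python
def plan_header_fetch(msg_count, fetch_limit=None):
    """Return (skipped, fetched) ranges of message numbers 1..msg_count.

    Assumption for this sketch: higher message numbers arrived later,
    so the newest fetch_limit mails are the last ones in the range.
    """
    if fetch_limit is None or fetch_limit >= msg_count:
        first = 1                               # no limit applies
    else:
        first = msg_count - fetch_limit + 1     # oldest mail still fetched
    skipped = range(1, first)                   # shown as dummy entries
    fetched = range(first, msg_count + 1)
    return skipped, fetched
```

For example, with 10 mails and a limit of 3, messages 1 through 7
would be skipped and 8 through 10 fetched.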

This feature is inherited from mailtools code in Chapter 13; see the
mailconfig module ahead for the user setting associated with it.
Note that even with this fix, because the threadtools queue system
used here dispatches GUI events such as progress updates only up
to 100 times per second, a 4,800-mail inbox still takes 48 seconds
to complete an initial header load. The queue should either run
faster still, or I should delete an email once in a while!

HTML main text extraction (prototype)

PyMailGUI is still somewhat plain-text biased, despite the
emergence of HTML emails in recent years. When the main (or
only) text part of a mail is HTML, it is displayed in a popped-up
web browser. In the prior version, though, its HTML text was still
displayed in a PyEdit text editor component and was still quoted
for the main text of replies and forwards.

Because most people are not HTML parsers, this edition’s
version attempts to do better by extracting plain text from the
part’s HTML with a simple HTML parsing step. The extracted plain
text is then displayed in the mail view window and used as
original text in replies and forwards.

This HTML parser is at best a prototype and is largely
included to provide a first step that you can tailor for your
tastes and needs, but any result it produces is better than
showing raw HTML. If this fails to render the plain text well,
users can still fall back on viewing in the web browser and
cutting and pasting from there into replies and forwards. See also
the note about open source alternatives by this parser’s source
code later in this chapter; this is an already explored problem
domain.
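
To give a feel for what such a text-extraction step involves, here is
a rough sketch built on the standard html.parser module. It is not
the parser shipped with PyMailGUI; the class name, the tag sets, and
the whitespace policy are all assumptions made for this example.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, dropping tags and script/style content."""
    SKIP = {"script", "style"}
    BREAKS = {"p", "br", "div", "li", "tr", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)   # turns &amp; into &
        self.parts = []
        self.skipping = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1
        elif tag in self.BREAKS:
            self.parts.append("\n")               # block tags start a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        if not self.skipping:
            self.parts.append(data)

def html_to_text(html):
    """Return a plain-text approximation of an HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = ("".join(parser.parts)).splitlines()
    cleaned = (" ".join(line.split()) for line in lines)  # squeeze whitespace
    return "\n".join(line for line in cleaned if line)
```

Even a crude pass like this tends to be more readable in a reply
than raw markup, which is the point the text makes above.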

Reply copies all original recipients by default

In this version, replies are really reply-to-all by
default: they automatically prefill the Cc header in the reply’s
composition window with all the original recipients of the
message. To do so, replies extract all addresses among both the
original To and Cc headers, and remove duplicates as well as the
new sender’s address by using set operations. The net effect is to
copy all other recipients on the reply. This is in addition to
replying to the sender by initializing To with the original
sender’s address.

This feature is intended to reflect common usage: email
circulated among groups. Since it might not always be desirable,
though, it can be disabled in mailconfig so that replies initialize
just To headers to reply to the original sender only. If enabled,
users may need to delete the Cc prefill if not wanted; if
disabled, users may need to insert Cc addresses manually instead.
Both cases seem equally likely. Moreover, it’s not impossible that
the original recipients include mail list names, aliases, or
spurious addresses that will be either incorrect or irrelevant
when the reply is sent. Like the Bcc prefill described in the next
item, the reply’s Cc initialization can be edited prior to sends
if needed, and disabled entirely if preferred. Also see the
suggested enhancements for this feature at the end of this
chapter; allowing this to be enabled or disabled in the GUI per
message might be a better approach.
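
The set operations behind the reply-to-all prefill might be sketched
as follows. This is an approximation for illustration, not the
program’s code: the function name, the reply_all flag (standing in
for the mailconfig setting), and the sorted output order are all
assumptions.

```python
from email.utils import getaddresses

def reply_headers(orig_from, orig_to, orig_cc, my_address, reply_all=True):
    """Return (To, Cc) prefill strings for a reply window."""
    to = orig_from                     # always reply to the original sender
    if not reply_all:
        return to, ""
    fields = [f for f in (orig_to, orig_cc) if f]       # skip empty headers
    everyone = {addr for (_name, addr) in getaddresses(fields) if addr}
    others = everyone - {my_address}   # a set drops duplicates; remove self
    return to, ", ".join(sorted(others))
```

Sorting is used here only to make the result predictable; a real
client might prefer to preserve the original header order.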

Other upgrades: Bcc prefills, “Re” and “Fwd” case, list size,
duplicate recipients

In addition, there have been smaller enhancements
throughout. Among them: Bcc headers in edit windows are now
prefilled with the sender’s address as a convenience (a common
role for this header); Reply and Forward now ignore case when
determining if adding a “Re:” or “Fwd:” to the subject would be
redundant; mail list window width and height may now be configured
in mailconfig; duplicates are removed from the recipient address
list in mailtools on sends to avoid sending anyone multiple copies
of the same mail (e.g., if an address appears in both To and Cc);
and other minor improvements which I won’t cover here. Look for
“3.0” and “4E” in program comments here and in the underlying
mailtools package of Chapter 13 to see other specific code changes.
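
Two of these smaller changes are easy to picture in code. The
following is a hedged sketch with hypothetical function names, not
an excerpt from PyMailGUI or mailtools.

```python
def add_prefix(subject, prefix="Re:"):
    """Add a reply/forward prefix unless one is present in any letter case."""
    if subject.lower().startswith(prefix.lower()):
        return subject                 # "RE: x" and "re: x" are left alone
    return prefix + " " + subject

def unique_recipients(addresses):
    """Drop repeated addresses while keeping first-seen order."""
    seen = set()
    result = []
    for addr in addresses:
        if addr not in seen:
            seen.add(addr)
            result.append(addr)
    return result
```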

Unicode (Internationalization) support

I’ve saved the most significant PyMailGUI 3.0 upgrade for last: it
now supports Unicode encoding of fetched, saved, and sent mails, to
the extent allowed by the Python 3.1 email package. Both text parts
of messages and message headers are decoded when displayed and
encoded when sent. Since this is too large a change to explain in
this format, the next section elaborates.
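
As a small taste of what header decoding involves, the sketch below
uses the standard email.header tools to encode a non-ASCII header for
sending and decode it again for display. The helper name is
hypothetical, and this is a simplified illustration rather than the
approach the next section develops in full.

```python
from email.header import Header, decode_header

def decode_header_text(raw):
    """Decode an Internationalized header value to a display string."""
    parts = []
    for data, charset in decode_header(raw):
        if isinstance(data, bytes):
            # fall back to ASCII, replacing undecodable bytes, if no charset
            data = data.decode(charset or "ascii", errors="replace")
        parts.append(data)
    return "".join(parts)

encoded = Header("Café", "utf-8").encode()    # MIME-encoded form for sending
decoded = decode_header_text(encoded)         # back to text for display
```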
