Archive for the 'Python' category

 | March 19, 2012 9:36 pm

XML Parsing with Python, Part 4

In Part 2 of this tutorial, we looked at how libxml (through the use of the lxml Python bindings) can be used to parse XML documents. We covered how to load XML files from disk, parse them, look for specific tags and attributes, and generate custom datatypes from the results.

These are the operations lxml is most often used for, and they are probably what most developers will need it for. They only scratch the surface of what it is capable of, however.

In Part 2, we talked about three common XML operations: parsing, validation, and transformation. Of these three, parsing is the most important. We touched upon validation and transformation, but didn’t look at how they work.

In this video, we’ll take a look at the last two operations. We’ll cover validation using Document Type Definitions (DTDs) and XML Schema, and transformation using XSLT style sheets.

Note: The example files used in the video can be downloaded from here.


 | 1:59 pm

Parsing XML with Python, Part 2

As we talked about in Part 1 of this series, XML is one of the most versatile structural languages, which has led it to be used in nearly every sort of application imaginable. It serves as the basis of file formats, as a way to store financial records, and as a means of communication between web servers. For this reason, programmers and designers should be familiar with not only how to write it, but also how to parse, validate, and transform it.

One of the most powerful tools for working with XML is libxml, an open source parser. Given its versatility, libxml is used in many open source projects, and even commercial products. It can be found at the base of the Gnome desktop and in banking software.

In this video, Part 2 of a series on working with XML from Python, we’ll look at how you can use libxml from Python through the lxml bindings. We’ll show how to load and parse an XML document, iterate through the elements and attributes, test for specific tags, and translate the document structure into a custom object.
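The steps listed above can be sketched roughly as follows. This is not the code from the video: it uses the standard library's xml.etree.ElementTree, whose basic API lxml.etree largely mirrors, and the sample document, the Book class, and the load_books function are all made up for illustration.

```python
import xml.etree.ElementTree as etree

# Hypothetical sample document (not one of the tutorial's example files).
SAMPLE = """<library>
  <book id="1" year="2008"><title>First Title</title></book>
  <book id="2" year="2011"><title>Second Title</title></book>
  <magazine id="3"><title>Not a Book</title></magazine>
</library>"""


class Book(object):
    """Custom object built from a <book> element."""

    def __init__(self, book_id, year, title):
        self.book_id = book_id
        self.year = year
        self.title = title


def load_books(xml_text):
    """Parse the document and turn each <book> element into a Book."""
    root = etree.fromstring(xml_text)
    books = []
    for element in root:            # iterate over the child elements
        if element.tag != 'book':   # test for a specific tag
            continue
        books.append(Book(
            book_id=element.get('id'),       # attributes are read with get()
            year=int(element.get('year')),
            title=element.findtext('title')))
    return books


books = load_books(SAMPLE)
for book in books:
    print('%s (%d)' % (book.title, book.year))
```

To read from a file instead of a string, etree.parse(path).getroot() returns the same kind of root element.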

Note: The example files used in the video can be downloaded from here.


 | March 16, 2012 3:01 pm

Parsing XML with Python (Part 1, Installation)

XML is one of the most versatile structural languages around. It is used extensively for file formats, storing miscellaneous data, and communicating between different programs. You can find it on the web, on the desktop, and places you might not expect, such as in interactive books.

Given its importance, most programming languages come with a parser that can be used to analyze and decode XML. Examples include MSXML, Microsoft’s XML library for Windows, and libxml, a popular open source library.

In this series of videos, I’ll show you how to get up and running with libxml, through the lxml Python bindings. We’ll look at how to install the programs (Python, the Python package manager, setuptools, and lxml itself), read and parse an XML document from a file, inspect tags and attributes, validate XML structure, and perform XML transforms.

This first video will focus on the installation of Python, setuptools, and lxml.


 | March 8, 2012 12:15 am

For many writers, the act of writing (or placing one word after another) is synonymous with the tool that they use to do it. There’s a reason why writers feel so strongly about their Moleskine notebooks, fountain pens, and computer software. I’m no different from any other writer. I have my preferred tools, and I love them dearly. They help me focus on my ideas and craft prose that I can be proud of.

When writing on a computer, the tool of choice for many writers (dare I say most?) is Microsoft Word. It’s everywhere and everyone has used it. It comes preinstalled on most computers and is a de facto standard for exchanging written material with others.

Unfortunately, Word is not part of my preferred toolset. I prefer to write using LyX and an add-on I’ve written for it called LyX-Outline. But while I love my writing program, it makes it difficult to collaborate with other writers who use Word, as LyX doesn’t have a straightforward way to directly import Word files.

This isn’t a new problem and I’ve written about it before. I’ve even proposed a solution. But while that solution was a good fit for me, it isn’t something that I would recommend to others.

For starters, it required a great deal of software to be installed. You needed a program to convert Microsoft Word documents to Open Office documents. You then had to use a second utility to convert the result to HTML or LaTeX. After that, you used a third utility to clean it up and import the LaTeX code into LyX. Three distinct steps, with a lot of places where things could go wrong.

Over the past few months, I’ve found that I need a better way, a tool that can directly import a Word document and cleanly translate its content. So, I decided to create one.

MSDoc2LyX


 | September 20, 2011 8:46 pm

Note: Whenever I come across an interesting or potentially useful piece of source code, I like to make a copy and file it away. I post some of them here in the hope that they will be useful to others as well.

In my recent coding adventures, I needed to create a function to monitor the uptime of one of my Linux servers. A very simple thing (you would think), and yet, I wasn’t able to find a function in the standard library which would give me this piece of information. So … I decided to write my own.

The script below makes use of the “uptime” command (common in many Unix operating systems) to see how long the machine has been turned on. It works on both Linux and OS X computers. If I have time, I will add code for Windows as well.

import subprocess
import sys

FORMAT_SEC = 1
FORMAT_MIN = 2
FORMAT_HOUR = 3
FORMAT_DAY = 4

def uptime(format = FORMAT_SEC):
  """Determine how long the system has been running. Options:
  FORMAT_SEC returns uptime in seconds, FORMAT_MIN returns minutes,
  FORMAT_HOUR returns hours, FORMAT_DAY returns uptime in days."""

  def checkuptime():
    """OS specific ways of checking and returning
    the system uptime."""

    # For Linux and Mac, use "uptime" command and parse results
    if sys.platform.startswith('linux') or sys.platform == 'darwin':
      raw = subprocess.Popen(['uptime'], stdout=subprocess.PIPE,
        universal_newlines=True).communicate()[0].replace(',','')
      fields = raw.split() # fields[0] is the clock time, fields[1] is "up"

      # The "N days" field is missing during the first day of uptime
      if fields[3] in ('day', 'days'):
        up_days = int(fields[2])
        fields = fields[4:]
      else:
        up_days = 0
        fields = fields[2:]

      # What follows is either "N min" or "H:MM"
      if fields[1] in ('min', 'mins'):
        up_hours = 0
        up_min = int(fields[0])
      else:
        up_hours, up_min = map(int,fields[0].split(':'))

      # Calculate the total seconds that the system has been working
      up_sec = up_days*24*60*60 + up_hours*60*60 + up_min*60
      return up_sec

    raise NotImplementedError('uptime() only supports Linux and OS X')

  total_sec = checkuptime()

  # Get the days, hours, etc. (whole units, using integer division)
  if format == FORMAT_SEC: # Return seconds
    return total_sec
  if format == FORMAT_MIN: # Return minutes
    minutes = (total_sec)//(60)
    return minutes

  if format == FORMAT_HOUR: # Return hours
    hours = (total_sec)//(60*60)
    return hours

  if format == FORMAT_DAY: # Return days
    days = (total_sec)//(60*60*24)
    return days

  return -1 # If invalid option specified, return -1.
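
Because the output of the “uptime” command changes shape depending on how long the machine has been up, it can help to keep the parsing separate from the call to the command, so that it can be tried against sample lines. The following is only a sketch of that idea, not part of the script above: parse_uptime is a name I made up, the sample lines are typical rather than exhaustive, and formats such as “2 hrs” are not handled.

```python
import re


def parse_uptime(raw):
    """Return the uptime in seconds from one line of `uptime` output."""
    # Keep only the text between "up" and the user count.
    match = re.search(r'\bup\s+(.*?),\s+\d+\s+users?', raw)
    if match is None:
        raise ValueError('unrecognized uptime output: %r' % raw)

    seconds = 0
    for part in match.group(1).split(','):
        part = part.strip()
        days = re.match(r'(\d+)\s+days?$', part)
        mins = re.match(r'(\d+)\s+mins?$', part)
        clock = re.match(r'(\d+):(\d+)$', part)
        if days:
            seconds += int(days.group(1)) * 24 * 60 * 60
        elif mins:
            seconds += int(mins.group(1)) * 60
        elif clock:
            seconds += int(clock.group(1)) * 60 * 60 + int(clock.group(2)) * 60
    return seconds


# Sample lines written by hand to resemble Linux and OS X output.
LONG_RUNNING = ' 10:15:01 up 5 days,  3:42,  2 users,  load average: 0.00, 0.01, 0.05'
JUST_BOOTED = '10:15  up 12 mins, 1 user, load averages: 1.00 1.00 1.00'

print(parse_uptime(LONG_RUNNING))
print(parse_uptime(JUST_BOOTED))
```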

 | 8:30 pm

Note: Whenever I come across an interesting or potentially useful piece of source code, I like to make a copy and file it away. I post some of them here in the hope that they will be useful to others as well.

For the past couple of days, I have been working on scripts to automate the administration of some of my computers. Because they’ve been running big jobs that often take hours to complete, I’ve been leaving them unattended. But since I need to know as soon as the job is completed, it’s nice to receive some sort of alert. For that reason, I wrote a simple script that will send an email when the job is finished. It uses the smtplib module from the standard library (plus datetime for the timestamp). Here’s the code:

import datetime
import smtplib

# Email Settings
HOST = "smtp.gmail.com" # SMTP Server to Send Message From
PORT = 587 # Port you wish to communicate on
USER = "username@gmail.com" # Email Username
PASSWORD = "PASSWORD" # Email Password

FROM = "admin@oak-tree.us"
TO = "recipient@domain.com"

# Placeholder values: the full script sets these elsewhere
AMAZON_EC2 = "my-ec2-server" # Name of the server sending the alert
CURRENT_TIME = datetime.datetime.now() # Time the notification is built

SUBJECT = "Server Notification from " + AMAZON_EC2
TEXT = ("Server " + AMAZON_EC2 + " is shutting down.\n" + 'Timestamp: ' +
    str(CURRENT_TIME))

def sendmessage():
  """Send email notification to the specified email address."""
  server = smtplib.SMTP(HOST, PORT)
  server.ehlo()
  server.starttls() # Use Encryption
  server.ehlo() # Identify again over the encrypted connection
  server.login(USER, PASSWORD)
  BODY = "\r\n".join((
    "From: %s" % FROM,
    "To: %s" % TO,
    "Subject: %s" % SUBJECT ,
    "",
    TEXT
    ))
  server.sendmail(FROM, TO, BODY)
  server.quit()

In general, it’s pretty straightforward. You specify the host and port you want to connect to, then you construct your message. Finally, you send it. In this example, I’ve added a few other things, like a line to log on to the server and encryption. The actual connection and sending of the email take only two lines of code. The rest is setting up the message.

Pretty cool, huh?
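
For what it’s worth, the “setting up the message” part does not have to be done by joining header strings by hand. Here is a rough sketch, assuming Python 3 and its email.message module, that builds the same three headers and body as an object. The build_message function and the “my-server” name are placeholders of mine, not part of the script above.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, text):
    """Assemble the headers and body into a single message object."""
    message = EmailMessage()
    message['From'] = sender
    message['To'] = recipient
    message['Subject'] = subject
    message.set_content(text)
    return message


message = build_message(
    "admin@oak-tree.us",
    "recipient@domain.com",
    "Server Notification from my-server",   # "my-server" is a placeholder
    "Server my-server is shutting down.")

# Show the message as it would go over the wire.
print(message.as_string())
```

An object built this way can be handed to smtplib’s send_message() method in place of the sendmail() call.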

 | May 27, 2010 1:28 pm

Note: After nearly a year of compatibility problems, Nokia and Apple have released versions of Qt and Snow Leopard (respectively) that are able to run PyQt.  This article explains how to install PyQt using the precompiled binaries available from the Nokia website and the PyQt bindings from Riverbank Computing.  Installation instructions for Leopard can be found here.

Installing PyQt on Mac OS X Snow Leopard is a four-step process.  First, download and install the GCC compilers and associated utilities available in Apple’s XCode development environment.  Second, install the Qt framework from Nokia.  Third, download and compile the SIP binding generator from Riverbank Computing.  Finally, compile and install the PyQt bindings themselves.

XCode and Qt

The easiest way to get the most current version of the GCC compilers and associated tools is to download the XCode package from Apple at the Apple Developer Connection (ADC).  ADC requires that you have an Apple ID and account, which are both free.  After you log in, download the XCode developer tools package.  Given its size (nearly 1 GB), this may take a while.

While waiting for the XCode tools to download, you can also begin downloading the Qt binary installer from the Trolltech downloads page.  As of this writing (May 26, 2010), the only version of Qt that runs well on Snow Leopard is version 4.7.  For that reason, I strongly suggest downloading the Qt 4.7.0 beta.  (Do not download the Qt 4.7.0 Carbon version, as Apple is dropping support for Carbon in the near future.  It is also believed that many of the previous compatibility issues were due to the deprecation of the Carbon frameworks.)

Once XCode has finished downloading, run the package installer.  It will ask you to select the destination drive and what components of the software you would like to install.

Select the disc where you want to install the XCode developer tools

Choose the components you would like to install

This will automatically install and configure the GNU tool chain.  Once it has finished, you can access the tools by going to the “/Developer/” folder on the root of the drive.

After XCode has finished, run the Qt installer.

The Qt 4.7 Installer

Select the installation destination

The Qt installer package will automatically place the tools and other things that you will need in the same Developer folder where the XCode tools and frameworks are located.  This includes the Qt developer tools, documentation, examples, and header files.  If you only wish to run Qt programs, it is not necessary to install the documentation and examples, and you can deselect them by clicking on the “Customize” button of the “Installation Type” panel.

Qt4.7 Install - Customize

SIP and PyQt

SIP is a program that makes it easy to create Python bindings for C++ libraries.  You can find the source code at Riverbank’s SIP download page.  Since you will be using PyQt 4.7, you should probably download the latest developer snapshot of SIP (version 4.10 or greater).  Be sure to get the Linux/Unix source code, rather than the Windows source.  You will also need the latest source code snapshot for PyQt 4.7, which is also available from Riverbank.

After you have finished downloading the source files, move them to a folder in your Users directory.  I have a special directory entitled “Applications” where I keep the source code for programs that I have manually compiled.  Note: The rest of the steps will be done from the Mac OS X terminal.

After you have moved the source code for both SIP and PyQt to this new directory, extract it by using the tar utility with the x and f options (tar -xf):

cd ~/Applications
tar -xf sip-snapshot-4.10-529660fb77a9.tar.gz
tar -xf PyQt-mac-gpl-snapshot-4.7.4-049020fd7d72b.tar.gz

After you have expanded the files, it might be a good idea to rename the directories to something more manageable (like sip-4.10 and PyQt4.7):

mv sip-snapshot-4.10-529660fb77a9 sip-4.10
mv PyQt-mac-gpl-snapshot-4.7.4-049020fd7d72b PyQt4.7

First, we need to compile and install SIP.  The default configuration will move the compiled files to a directory where Snow Leopard can’t read them.  So, we need to manually specify the destination directory using the -d flag:

cd sip-4.10
python configure.py -d /Library/Python/2.6/site-packages

After the configuration is finished, run make and sudo make install:

make
sudo make install

Once SIP has finished installing, we need to repeat the process for PyQt.  From the sip-4.10 directory, change to PyQt:

cd ..
cd PyQt4.7

Next, run the configuration script specifying the path to the installation directory for the python bindings:

python configure.py -d /Library/Python/2.6/site-packages/

Then compile and install:

make
sudo make install

Since Qt is a rather large framework, it may require between 15 and 30 minutes to fully compile.
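
Once the install has finished, a quick way to confirm that Python can actually find the new bindings is to try importing them.  The snippet below is just a sketch of that check: can_import and report are helper names I made up, and it assumes “sip” and “PyQt4.QtCore” are the module names the two builds install.

```python
import importlib


def can_import(module_name):
    """Return True if the named module can be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def report(module_names):
    """Map each module name to whether it could be imported."""
    return dict((name, can_import(name)) for name in module_names)


# Assumed module names for the SIP and PyQt builds described above.
status = report(['sip', 'PyQt4.QtCore'])
for name in sorted(status):
    print('%-14s %s' % (name, 'ok' if status[name] else 'MISSING'))
```

If either line says MISSING, the most likely culprit is the destination directory passed to configure.py.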

 | May 12, 2009 5:16 pm

One of a computer programmer’s many fantasies (or at least one of my fantasies) is a cross-platform codebase that is easy to maintain.  I want to write an application once and then be able to deploy it on any platform in existence.  It should integrate into the destination environment in a native manner, but still be similar enough to the other platforms that issues can be addressed in a way that doesn’t hinder overall progress.

Part of this dream can be fulfilled by using Python and the standard library.  It is also possible to do much of it by using IronPython and .Net (through the open source Mono framework).  However, in each case, you are forced to make a tradeoff.  Python and the standard library are wonderful, if you are willing to accept a speed hit.  IronPython and Mono also work well, as long as you avoid using any of the Windows specific libraries.  Unfortunately, many of the neatest .Net features (like Windows Presentation Foundation) are only available on Windows.

Okay, there is always GTK+ (the toolkit used for the Gnome desktop) or wxWidgets, but GTK+ is written in C and while much of the miscellaneous nastiness has been abstracted away, it is still one of the hardest toolkits to work with.  Oh, and the documentation sucks.  (When any documentation can be found at all.)  Thus, we arrive at the final contender: Qt.  Qt has had a long and controversial history involving license issues, passion, and miscellaneous geek pride.  In fact, it’s due to Qt and some rather divisive decisions made by its developers that we even have alternative toolkits at all.  But, despite that history, Trolltech’s recent decision to release the framework under the permissive LGPL license (rather than the noxious GPL) changes all that.  In many ways, Qt might just become the cross-platform toolkit of choice.

Of course, it’s a given that Qt is tremendously powerful with rich functionality.  Applications that are written in Qt are fully cross-platform and behave the same way wherever they are deployed.  It uses native widgets whenever possible and has a sophisticated theming system that allows it to follow platform conventions without significantly modifying the underlying code.  More importantly, Qt is backed by a passionate developer community and a well funded corporation (Nokia).  It is true, you get what you pay for.  Qt is well designed and the documentation is simply fantastic.  And unlike GTK+ or wxWidgets, the entire toolkit is written in proper object oriented C++.

I first started experimenting with the Qt framework about two months ago, after deciding to tackle a major add-on to one of my favorite writing programs, LyX.  As I have said elsewhere, I am not the world’s most gifted computer programmer (far from it actually), but Qt actually makes it pretty easy to get yourself up and running in a reasonable amount of time.  If I were forced to do most of my GUI programming in Qt and C++, that would be an acceptable solution.  However, that is completely unnecessary.  You see, Qt has one of the most tightly integrated Python bindings available, through the use of PyQt from Riverbank Computing.  Qt plus Python is, quite simply, programming perfection.  So while I did some mild experimenting with the Qt frameworks in C++, I have spent much of the past few months doing the heavy work of learning the framework via Python.

In the process, I have made a great many mistakes and a great many more discoveries.  And since the final step in any discovery is to share your findings, I will be doing that through this series of articles.  Of more interest to most, however, I will also be introducing the fruits of my labors: a somewhat functional prototype of my outliner add-on for LyX.


 | 5:12 pm

Note: The installation instructions in this article apply to Mac OS X 10.5 (Leopard) and Qt 4.5.3.  While it is possible to get Qt 4.5.3 to run on Snow Leopard, it requires compiling it from source.  Instructions for doing so can be found in the comments below.  It is important to note, however, that there are some minor issues with the 4.5 framework on Snow Leopard (including multithreaded memory leaks).

You can find instructions for how to install PyQt on Snow Leopard using the Qt 4.7 frameworks at http://blog.oak-tree.us/index.php/2010/05/27/pyqt-snow-leopard.

To install PyQt on Mac OS X, it is necessary to install and configure the GCC compilers and the Qt framework in addition to SIP and PyQt.  The easiest way to get the most current version of the GCC compilers and tools is by downloading the XCode tools package from Apple at the Apple Developer Connection (ADC).  ADC requires that you have an Apple ID and account, which are both free.  After you log in, download the XCode developer tools package, which is a bit hefty (nearly 1 GB in size) and may take a while to download depending on your internet speed.

XCode and Qt

While waiting for the XCode tools to download, you can also begin downloading the Qt binary installer from the Trolltech downloads site.  As with Windows, I would advise that you select the complete SDK (which also includes pre-compiled binaries) rather than the source release.  If you only download the Qt framework, you will need to compile the source yourself, which is time intensive and not worth the additional headache.

Once XCode has finished downloading, run the package installer.  It will ask you to select the destination drive and what components of the software you would like to install.

Select the disc where you want to install the XCode developer tools

Choose the components you would like to install

This will automatically install and configure the GNU tool chain.  Once it has finished, you can access the tools by going to the “/Developer/” folder on the root of the drive.

After XCode has finished, run the Qt installer.

The Qt SDK Installer

Select the installation destination

The Qt installer package will automatically place the tools and other things that you will need in the same Developer folder where the XCode tools and frameworks are located.  You will need to make a note of the path for qmake.  The default installation site for qmake is “/usr/bin/qmake-4.5”.

SIP and PyQt

SIP is a program that makes it easy to create Python bindings for C++ libraries.  You can find the source code at Riverbank’s SIP download page.  Since we will be using the 4.5 version of PyQt, you will need to download the latest developer snapshot of SIP (version 4.8 or greater).  Be sure to get the Linux/Unix source code, rather than the Windows source.  You will also need the latest source code snapshot for PyQt 4.5, which is also available from Riverbank.

After you have finished downloading the source files, move them to a folder in your Users directory.  I have a special directory entitled “Applications” where I keep the source code for programs that I have manually compiled.  Note: The rest of the steps will be done from the Mac OS X terminal.

After you have moved the source code for both SIP and PyQt to this new directory, extract it by using the tar utility with the x and f options (tar -xf):

cd ~/Applications
tar -xf sip-4.8-snapshot-20090430.tar
tar -xf PyQt-mac-gpl-4.5-snapshot-20090507.tar

After you have expanded the files, it might be a good idea to rename the directories to something more manageable (like sip-4.8 and PyQt4.5):

mv sip-4.8-snapshot-20090430 sip-4.8
mv PyQt-mac-gpl-4.5-snapshot-20090507 PyQt4.5

First, we need to compile and install SIP.  The default configuration will move the compiled files to a directory where Leopard can’t read them.  So, we need to manually specify the destination directory using the -d flag:

cd sip-4.8
python configure.py -d /Library/Python/2.5/site-packages

After the configuration is finished, run make and sudo make install:

make
sudo make install

Once SIP has finished installing, we need to repeat the process for PyQt.  From the sip-4.8 directory, change to PyQt:

cd ..
cd PyQt4.5

Next, run the configuration script specifying the path to your version of qmake and the installation directory for the python bindings:

python configure.py -q /usr/bin/qmake-4.5 -d /Library/Python/2.5/site-packages/

Then compile and install:

make
sudo make install

Since Qt is a rather large framework, it may require between 15 and 30 minutes to fully compile.
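
If you find yourself repeating these steps on more than one machine, the configure, make, and install sequence could be scripted.  The following is only a sketch: build_commands and run_build are names I made up, and the flags are simply the ones used in the commands above.

```python
import subprocess


def build_commands(site_packages, qmake=None):
    """Return the configure, make, and install commands as argument lists."""
    configure = ['python', 'configure.py']
    if qmake is not None:
        configure += ['-q', qmake]   # only the PyQt step above passes -q
    configure += ['-d', site_packages]
    return [configure, ['make'], ['sudo', 'make', 'install']]


def run_build(source_dir, site_packages, qmake=None):
    """Run each command inside source_dir, stopping at the first failure."""
    for command in build_commands(site_packages, qmake):
        subprocess.check_call(command, cwd=source_dir)


SITE_PACKAGES = '/Library/Python/2.5/site-packages'

# Print the PyQt commands without running anything.
for command in build_commands(SITE_PACKAGES, qmake='/usr/bin/qmake-4.5'):
    print(' '.join(command))
```

Calling run_build('sip-4.8', SITE_PACKAGES) and then run_build with the PyQt source directory and the qmake path would carry out the same steps as typing them by hand.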

 | 2:00 pm

On Windows, we first need to install the pre-compiled Qt binary package from the Trolltech downloads page.  Select the LGPL/Free tab and choose the “Qt SDK: Complete Development Environment” for Windows.  It is a fairly hefty package (167 megabytes) and may take a while to transfer.  The installer includes the core Qt libraries, several important C++ development tools, and an integrated development environment (IDE) called Qt Creator.  Luckily, the same installer will work on Windows XP, Windows Vista and Windows 7.

Qt and MinGW

When you run the installer, it will ask for the installation destination.  To avoid future problems, this folder should not include any spaces. The default is “C:\Qt\2009.02”, but because I like to keep the root directory of my drive as clean as possible, I changed this to “C:\Windows\Qt”.  To begin the installation, press “Install.”

Change the installation directory to C:\Windows\Qt

The installer will also download and configure the GNU compiler toolset for Windows (also known as MinGW).  By default, MinGW will be installed in a subdirectory under the main Qt directory.  In this example, it was installed to “C:\WINDOWS\Qt\mingw”.  Note this file path; it will be important in the steps below.

Python

After you finish with the Qt framework, the next step is to download and install Python.  You can get the Python interpreter and other associated files from the Python release download page.  While the newest version is Python 3.0.1, I would recommend that you download and install Python 2.5.4 instead.  When trying to set up SIP with Python 2.6.2 and Python 3.0.1, I experienced a number of errors.  These were resolved by using the x86 version of Python 2.5.

When you first run the installer, it will ask whether you want to install Python for all users or for just the current user.  Make sure that “install for all users” is selected.  As with Qt, you should install Python to a directory without spaces.  The default install location will be “C:\Python25”, though I typically change it to “C:\Windows\Python25” so that it matches the installation directory for Qt.

Make sure that the "Install for all users" option is selected

Change the installation directory to "C:\Windows\Python25" so that it is similar to the installation directory of Qt

After the Python installer has finished running, you will need to modify one of the Windows environment variables.  On Windows XP, the easiest way to access the environment variables is to right-click on “My Computer” and select “Properties” from the drop-down menu.  When the “System Properties” window launches, click on the “Advanced” tab and then press the “Environment Variables” button.  (If you are using Windows Vista or the Windows 7 beta, the easiest way to access the Environment Variables is to go to Control Panel and search for “Path” in the search bar.  All other steps remain the same.)

You can open the "Environment Variables" by right clicking on "My Computer" and selecting "Properties."  Then choose the advanced tab and click on the "Environment Variables" button

From the “System Variables” list, select the Path Variable and then press “Edit”.

Select Path from the "System Variables" list and select "Edit"

In the variable value box, type in the path to the Python folder you just installed, the path to the “bin” folder of MinGW (which should be under the MinGW folder from above), and last, the path to the folder containing a Qt program called qmake (which is located in the qt\bin folder of the Qt installation).  You should separate these values from the other paths (and from each other) with a semicolon.  If you have followed all of the instructions in this tutorial, you will add:

C:\Windows\Qt\mingw\bin;C:\Windows\Python25;C:\Windows\Qt\qt\bin

And your path variable should now look something like:

%SystemRoot%\System32; C:\Windows\Qt\mingw\bin; C:\Windows\Python25; C:\Windows\Qt\qt\bin

Press “Ok” to close the dialog box, and then press “Ok” again to close first the “Environment Variables” window and then the “System Properties” pane.  Now that you have changed the Path variable, you need to restart your computer so that the changes can take effect.
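
After the restart, it is worth confirming that the three programs the rest of the build relies on can actually be found through the Path variable.  The snippet below is a rough sketch of such a check.  Note that it assumes a much newer Python than the 2.5 release installed above (shutil.which arrived in Python 3.3), and find_tools is a helper name I made up.

```python
import shutil


def find_tools(names):
    """Map each program name to its full path, or None if it is not on the Path."""
    return dict((name, shutil.which(name)) for name in names)


# Programs used in the remaining steps of this tutorial.
tools = find_tools(['python', 'mingw32-make', 'qmake'])
for name in sorted(tools):
    print('%-14s %s' % (name, tools[name] or 'NOT FOUND'))
```

Any program reported as NOT FOUND points to a typo in the corresponding Path entry.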

SIP and PyQt

Prior to compiling PyQt, you will first need to compile and install SIP, which can be found at Riverbank Software.  Though the stable version of the software is currently 4.7.9, you will need to download the latest Windows source code snapshot of version 4.8 (Qt 4.5 doesn’t work with earlier versions of SIP).  (Update: Alternatively, you can also find a copy of the 4.8 snapshot here.)  You will also need the most recent source code snapshot of PyQt 4.5, which is available from Riverbank as well.  (Update: Multiple people seem to have had problems installing the 4.5 snapshot from the Riverbank website.  As a matter of convenience, you can find the April 30, 2009 version here.  This is the same version that was used in the writing of this article.)

After you have finished downloading the source files, extract them to the Qt installation directory (C:\Windows\Qt in this tutorial).  This might also be a good time to rename the directories to something shorter than the default snapshot names.  I have changed mine to PyQt4 and sip-4.8.

The rest of the steps will be run from the command line prompt (cmd.exe) and must be run with administrator privileges to work properly.  Start by going to the start menu and then choosing the run command; then type “cmd.exe” and press enter.  (If you are on Windows Vista, you can open the command prompt by typing “cmd.exe” in the search dialog of the start menu.  To run with administrator privileges, right click on the top program choice and select “Run as administrator” from the context menu.)

At the command line, first, go to the Qt installation directory by typing:

cd C:\Windows\Qt

Then go to the SIP directory:

cd sip-4.8

Prior to compiling, we need to create a proper configuration file:

python configure.py -p win32-g++

After the configuration is done, compile the new file by typing:

mingw32-make

This will take some time.  After the files have finished compiling, you can install them by typing:

mingw32-make install

The process is repeated for PyQt.  Change to the PyQt4 source directory by typing:

cd ..
cd PyQt4

Then configure the make files, compile and install by typing:

python configure.py -p win32-g++
mingw32-make
mingw32-make install

Since Qt is a rather large framework, it may take between 15 and 30 minutes to fully compile.

Update: If you only want to run PyQt programs, the installation process can be greatly simplified by using the automated PyQt installer for Windows.  The installer will automatically install a copy of the Qt framework and the PyQt bindings.  You will need to install Python separately, however.