By Ingeniweb. A Django site.
December 16, 2008
» Pycon 2009 talks


I have two accepted talks at Pycon, which is great. I would also like to say that the Pycon review system is awesome: you can see what the reviewers said and understand why your talk was accepted or declined.

I was a bit frustrated that my Atomisator talk was declined, but I think it makes sense: this is a new tool, and besides my user group and a few other people, it is not really used yet.

One reviewer said that it had to be picked, and another one answered:

I agree that PyCon should not restrict itself to well-known projects, but it should definitely restrict itself to projects that are (a) in production use, (b) under active development, and (c) likely to still be so in a year. There are so many projects meeting these criteria that for me, the bar is very high indeed to spend a talk slot on one that does not.

OK, fair enough: I will present this talk at Pycon 2010 and they won’t have any argument to decline it ;)

The talks that made it:

  • How AlterWay releases web applications using zc.buildout
  • On the importance of PyPI in delivering and building Python softwares - mirroring, fail-over and third-party package indexes

I will go into greater detail later on.

      

December 15, 2008
» Python Isolated Environment (PIE)


Here’s a proposal I will send to python-dev. What do you think?

(Disclaimer: this proposal is heavily inspired by the work people have done in various tools; it does not reinvent anything)

The problem

Python developers distribute and deploy their packages with myriads of dependencies. Some of them are not yet available as official OS Python packages. Sometimes a package even conflicts with the official version of a package installed in a given OS.

In any case, the development cycle of most Python applications is shorter than the release cycle of Linux distributions, so it is impossible for application Foo to wait until Bar 5.6 is officially available in Debian 4.x.

Therefore, developers need a way to provide or describe the specific list of dependencies their application needs to work.

And this list of dependencies might conflict with the existing list of packages installed in Python. In other words, even if this is not a wanted behavior from an OS packager's point of view, an application might need to provide its own execution context.

Right now, when Python is loaded, it uses the site module to browse the site-packages directory and populate the path with the packages it finds there. .pth files are also parsed to provide extra paths.

Python 2.6 introduced a per-user site-packages directory: an extra directory that is added to the path like the central one.

But both will append new paths to the environment without any exclusion rule or version checking.
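
The append-only behavior can be observed with the standard site.addsitedir function. The sketch below builds a throwaway directory containing a .pth file (the names demo.pth and extra are made up for the example) and lists what ends up on sys.path.

```python
import os
import site
import sys
import tempfile


def demo_addsitedir():
    """Return the sys.path entries added by site.addsitedir()."""
    sitedir = tempfile.mkdtemp()
    extra = os.path.join(sitedir, "extra")
    os.mkdir(extra)

    # A .pth file lists one path per line, relative to the
    # directory that holds the .pth file.
    with open(os.path.join(sitedir, "demo.pth"), "w") as pth:
        pth.write("extra\n")

    before = list(sys.path)
    site.addsitedir(sitedir)
    # Whatever was found is appended: no exclusion, no version check.
    return [entry for entry in sys.path if entry not in before]


added = demo_addsitedir()
print(added)
```

Both the directory itself and the path named in the .pth file are appended, and nothing already on the path is removed or compared.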

The workarounds

A few workarounds exist to be able to express what packages (and version) an application needs to run, or to set up an isolated environment for it:

  • setuptools provides the install_requires mechanism, where you can define dependencies directly inside the package, as new metadata. It also provides a way to install two different versions of one package and lets you pick, in code or when the program starts, which one you want to activate.
  • virtualenv will let you create an isolated Python environment, where you can define your own site-packages. This allows you to make sure you are not conflicting with an incompatible version of a given package.
  • zc.buildout relies on setuptools and provides an isolated environment that is similar in some respects to virtualenv.
  • pip provides a way to describe requirements in a file, which can be used to define bundles, which are very similar to what zc.buildout provides.

But they all aim at the same goal: define a specific execution context for a specific application, and declare dependencies without regard to other applications or to the OS environment.

This proposal describes a solution that can be added to Python to provide that feature.

The solution

An isolated environment file that describes dependencies is added. This file can be tweaked by the application packager, or later by the OS packager if something goes wrong.

The isolated environment file

A new file called a Python Isolated Environment file (PIE file) can be provided by any application to define the list of dependencies and their versions.

It is a simple text file that provides:

  • a list of paths, separated by ‘:’, on line 1
  • then one package per line, starting at line 2. Each package can be prefixed by a `!` to exclude it

For example:

/var/myapp/myenv
lxml
sqlite
sqlalchemy
!sqlobject

This list of packages might or might not be installed in Python.

Versions can also be provided in this file:

/var/myapp/myenv:/var/myapp/myenv2
lxml >= 0.9
sqlite > 1.8
sqlalchemy == 0.7
!sqlobject == 0.6

The file is saved with the .pie extension.
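
To make the format concrete, here is a minimal sketch of how such a file could be parsed. Nothing like this exists in Python: parse_pie and Requirement are hypothetical names, and the set of comparison operators accepted is an assumption based on the examples above.

```python
import re
from collections import namedtuple

# Hypothetical record for one package line of a PIE file.
Requirement = namedtuple("Requirement", "name operator version excluded")

# Optional '!', a package name, then an optional "<operator> <version>".
_LINE = re.compile(
    r"^(!?)\s*([A-Za-z0-9_.\-]+)\s*(?:(==|>=|<=|>|<)\s*(\S+))?$"
)


def parse_pie(text):
    """Return (paths, requirements) for the content of a PIE file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty PIE file")

    # Line 1: search paths separated by ':'.
    paths = lines[0].split(":")

    # Line 2 and following: one package per line.
    requirements = []
    for line in lines[1:]:
        match = _LINE.match(line)
        if match is None:
            raise ValueError("invalid package line: %r" % line)
        bang, name, operator, version = match.groups()
        requirements.append(
            Requirement(name, operator, version, bang == "!")
        )
    return paths, requirements


example = """/var/myapp/myenv:/var/myapp/myenv2
lxml >= 0.9
sqlite > 1.8
sqlalchemy == 0.7
!sqlobject == 0.6
"""
paths, requirements = parse_pie(example)
```

A package listed without a version simply gets None for both operator and version.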

Loading an isolated environment file

A new function called load_isolated_environment is added to site.py; it lets you load a PIE file.

Loading a PIE file means:

  • for each package defined, starting at line 2, load_isolated_environment checks whether the package exists in the environment with the requested version. The version is given by the package.__version__ value, or by the PKG-INFO one when available. If the package exists but no version is available, version 0.0 is used.
  • for packages without the ! prefix:
    • if the package is not found in the environment, each path provided on line 1 of the file is scanned for it, using the site-packages method.
    • if the package is found there, it is added to the path.
    • if the package is still not found, a PackageMissing error is raised.
  • for packages starting with the ! prefix:
    • if the package is found, it is removed from the path
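
The steps above can be sketched as follows. This is only an illustration under several assumptions: apply_environment and find_package are hypothetical helpers, version checks are left out, a package is "found" when a directory or .py file with its name exists, and the function returns a new list instead of modifying sys.path.

```python
import os
import tempfile


class PackageMissing(Exception):
    """Raised when a required package cannot be found anywhere."""


def find_package(name, directories):
    """Return the first directory that contains the package, or None."""
    for directory in directories:
        if (os.path.isdir(os.path.join(directory, name))
                or os.path.isfile(os.path.join(directory, name + ".py"))):
            return directory
    return None


def apply_environment(requirements, search_dirs, path):
    """Return a new path honouring (name, excluded) requirements."""
    new_path = list(path)
    for name, excluded in requirements:
        location = find_package(name, new_path)
        if excluded:
            # '!' prefix: drop every path entry providing the package.
            # Note this also hides its neighbours in that directory.
            while location is not None:
                new_path.remove(location)
                location = find_package(name, new_path)
            continue
        if location is None:
            # Not in the environment: look in the paths from line 1.
            location = find_package(name, search_dirs)
            if location is None:
                raise PackageMissing(name)
            new_path.append(location)
    return new_path


# Throwaway layout: lxml lives in the isolated environment,
# sqlobject in the "system" directory.
base = tempfile.mkdtemp()
env_dir = os.path.join(base, "myenv")
system_dir = os.path.join(base, "system")
os.makedirs(os.path.join(env_dir, "lxml"))
os.makedirs(os.path.join(system_dir, "sqlobject"))

new_path = apply_environment(
    [("lxml", False), ("sqlobject", True)], [env_dir], [system_dir]
)
```

Removing a package by dropping the directory that holds it is the simplest reading of "removed from the path"; a real implementation would have to decide what happens to the other packages in that directory.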

This function can be called by code like this:

>>> from site import load_isolated_environment
>>> load_isolated_environment('/path/to/context.pie')

From there, sys.path meets the requirements and the code that is executed after this call will benefit from this context.
Another context can be loaded in the same process:

>>> load_isolated_environment('/path/to/another_context.pie')

Limitations:

  • if the new context breaks other programs in the process, it’s up to the application packager to fix the context file.
  • it’s not the job of load_isolated_environment to resolve dependency issues: if the foo package needs the bar package and bar is not listed, it won’t complain.
  • it is not the job of load_isolated_environment to get missing dependencies.

Using an isolated environment file

Typically, an isolated environment file is used in high-level Python scripts, for example in any launcher script an application provides:

# this script runs zope
from site import load_isolated_environment
load_isolated_environment('zope-3.4.pie')

import zope

if __name__ == '__main__':
    zope.run()
      

December 12, 2008
» Pycon 2009 proposals


The proposal acceptance date is in a few days.

Here are the four proposals I have made:

  • The state of packaging in Python. This discussion summarizes the current options when it comes to distributing your packages. It also explains the pitfalls and the gap between the Python developers and the OS vendors and packagers. I think this talk will not be picked because the topic is wide and vague. So I proposed to transform it into a panel where lead developers from various frameworks could explain their usage of distutils and what is missing to make them happy. No feedback yet on this.
  • Atomisator, the agile data processing framework. This tool is starting to be useful, and I think it can be useful to others. Check http://atomisator.ziade.org for a quick overview.
  • How AlterWay releases web applications using zc.buildout. That is the same talk I gave at the Plone conf, but I present it in a way that makes people understand that zc.buildout is not tied to Zope and Plone and can be used with any other application. As a matter of fact, it has become a standard here, and we use it for Pylons, etc.
  • On the importance of PyPI in delivering and building Python softwares - mirroring, fail-over and third-party package indexes. That’s a long title. It presents my work on PyPI.

Last, I will go to the Python Language Summit the day before Pycon. I volunteered to be a “champion” on distutils matters.

      

November 27, 2008
» Expert Python Programming Book : typo sprint tonight !


I love Packt. As soon as I told them that some people liked the book but complained about the typos, they proposed to go ahead and launch a new print cycle.

Basically it means that the next buyers will have a typo-free book. At least for all the typos that were reported on my Trac here, or at Packt’s.

I am currently processing all the typos reported at Packt so I have a full list on my wiki, and I will send them the final list tomorrow.

Then, they will re-print it.

So if you already own the book, and you see a typo that is not listed, please let me know.

      

November 26, 2008
» Python package distribution - my current work


I found a bit of time to work on distribution matters. Here’s a status update on what I am doing there.

There are two topics I am focusing on right now.

  • clean up and enhance Python’s distutils package
  • implement the mirroring infrastructure at PyPI

distutils work

Nathan Van Gheem proposed a cool patch in collective.dist (this package is a port of the new features I have added in distutils, so they are available in 2.4 and 2.5).

Nathan's patch makes it possible to avoid storing the password in the .pypirc file: when it is missing, the user is prompted for it instead. This is something that was in my pile for a long time.
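
The idea can be sketched like this. It is not the code of the patch: read_credentials is a hypothetical helper, the section name pypi is an assumption, and the prompt is passed in as a parameter so the example can be tried without a terminal.

```python
import configparser
import getpass


def read_credentials(pypirc_text, section="pypi", prompt=getpass.getpass):
    """Return (username, password), prompting if no password is stored."""
    config = configparser.RawConfigParser()
    config.read_string(pypirc_text)

    username = config.get(section, "username")
    if config.has_option(section, "password"):
        # Old behavior: the password sits in clear text in the file.
        password = config.get(section, "password")
    else:
        # New behavior: nothing stored, so ask the user.
        password = prompt("Password for %s: " % username)
    return username, password


pypirc_without_password = """
[distutils]
index-servers = pypi

[pypi]
username: tarek
"""

# A canned answer stands in for the interactive prompt here.
credentials = read_credentials(
    pypirc_without_password, prompt=lambda message: "typed-at-prompt"
)
```

With the default prompt argument, getpass.getpass reads the password from the terminal without echoing it.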

I have added a few things to Nathan’s patch, and a test, and proposed it to Python. I am now waiting for its integration in 2.7 trunk: http://bugs.python.org/issue4394. If it’s accepted, I will backport it to collective.dist.

There are some other tickets I am waiting to see accepted:

I am not sure when those will be integrated. The average time for the integration of tickets in distutils in Python is between 6 months and 8 months. hihihi. :D

PyPI mirroring

The job I am doing in PyPI will be done in three phases:

  • Phase 1: implement the mirroring infrastructure in PyPI
  • Phase 2: promote it, and propose patches for the mirroring tools out there so they use the protocol
  • Phase 3: promote and propose patches for pip so it can use the mirrors efficiently (fail-over and nearest mirror infrastructure).

Phase 1: so far, so good.

With some insights from Richard Jones and Martin von Löwis, I am currently implementing the mirroring infrastructure for PyPI we have defined during the D.C. sprint (I still owe a blog entry about this sprint). The code lives in a branch on the python svn folder dedicated to PyPI.

The idea of the mirroring infrastructure is to be able to get a list of official mirrors for PyPI that can be used as alternative sources. (It is described here: http://wiki.python.org/moin/PEP_374). Ideally, the client application would interact with the nearest mirror automatically, and switch to another one if it goes down.

So, a list of mirrors will be made available at /mirrors, and from there the client applications will be able to use an alternative location for every package. The hardest part concerns the stats: we want to display in PyPI the download counts for each package by summing the downloads from every mirror.

So every mirror will have to provide its “local stats” in a form that PyPI can visit. That’s the biggest part of the work I am doing. The code will build the stats for PyPI by parsing its Apache log file. And hopefully, this code will be reusable by the mirrors themselves so they can build their stats the same way.
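
As a rough illustration of the stats part, the sketch below counts successful downloads per project from Apache access log lines and sums the counts of several servers. It is not the code from the PyPI branch: count_downloads and merge_stats are hypothetical names, and the URL layout /packages/<type>/<letter>/<project>/<filename> is an assumption.

```python
import re
from collections import Counter

# Request and status fields of an Apache access log line.
_REQUEST = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3})')


def count_downloads(log_lines, prefix="/packages/"):
    """Count successful package downloads per project."""
    counts = Counter()
    for line in log_lines:
        match = _REQUEST.search(line)
        if match is None:
            continue
        url, status = match.groups()
        if status != "200" or not url.startswith(prefix):
            continue
        parts = url.split("/")
        # Assumed layout: /packages/<type>/<letter>/<project>/<filename>
        if len(parts) == 6:
            counts[parts[4]] += 1
    return counts


def merge_stats(*stats):
    """Sum the per-project counts published by several servers."""
    total = Counter()
    for counts in stats:
        total.update(counts)
    return total


log = [
    '10.0.0.1 - - [26/Nov/2008:10:00:00 +0000] '
    '"GET /packages/source/l/lxml/lxml-2.1.tar.gz HTTP/1.1" 200 1234',
    '10.0.0.2 - - [26/Nov/2008:10:00:05 +0000] '
    '"GET /packages/source/l/lxml/lxml-2.1.tar.gz HTTP/1.1" 200 1234',
    '10.0.0.3 - - [26/Nov/2008:10:00:09 +0000] '
    '"GET /packages/source/s/sqlalchemy/SQLAlchemy-0.5.tar.gz HTTP/1.1" 404 0',
    '10.0.0.3 - - [26/Nov/2008:10:01:00 +0000] "GET /pypi HTTP/1.1" 200 512',
]
local_stats = count_downloads(log)
total_stats = merge_stats(local_stats, {"lxml": 3, "zope": 1})
```

Failed requests and pages that are not package files are ignored, so only real downloads are counted.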

Of course, this infrastructure could be used for any PyPI-compatible server, even if it is not a mirror of PyPI (like a private PyPI server).

Phase 2 will consist of promoting the infrastructure to the mirroring software out there. Maybe Pycon will be a good place for that.

Phase 3 is the most interesting one: make sure the client applications use the mirrors! I think Ian Bicking’s pip project could be the right place for these innovations.
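
The fail-over part of that client behavior could look like the sketch below. It is an assumption about how a client might do it, not pip code: fetch_with_failover and http_fetch are hypothetical names, the mirror list is expected in order of preference, and picking the nearest mirror is left out.

```python
import urllib.request


def http_fetch(url, timeout=10):
    """Default fetcher: download the raw bytes behind a URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(path, mirrors, fetch=http_fetch):
    """Return (mirror, content) from the first mirror that answers."""
    failures = []
    for mirror in mirrors:
        url = mirror.rstrip("/") + path
        try:
            return mirror, fetch(url)
        except OSError as error:
            # Network errors (urllib's URLError included) are OSError
            # subclasses: remember the failure and try the next mirror.
            failures.append("%s (%s)" % (mirror, error))
    raise OSError("all mirrors failed: " + ", ".join(failures))
```

Passing the fetcher as a parameter keeps the fail-over logic separate from the network code, so it can be exercised with a fake fetcher.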

Next topics in the pile:

  • index-merging: describe in a PEP-like document the index-merging feature that would allow clients to merge several indexes whose contents differ. For example: PyPI + a private PyPI server. I wrote a first draft of such a patch for setuptools in the past (http://bugs.python.org/setuptools/issue32), but lately I have lost all hope of seeing this project move forward.
  • Brainstorming: try to understand the Python Packaging Paradox. That is: how come the community, which is composed of many brilliant people, is unable to move forward in packaging matters.
  • Distribute the return :D
      

November 22, 2008
» How to be disappointed with the “printed” in “printed book”


I feel really bad about this comment on my book: How To Be Dissappointed in Something You Recommend.

Just a quick word about the try/return/finally code pattern, since I had some feedback about it. I would like to mention that this code pattern is perfectly valid:

def function():
    try:
        return something
    finally:
        do_something()

I should have explained it better, because this pattern is not widely used, so you might think that do_something() is called after the function has returned, which is not the case: the finally block runs once the return value has been computed, but before the function hands it back to its caller.
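
A small runnable version makes the order of execution visible. The events list is only there to record what happens and when.

```python
events = []


def function():
    try:
        events.append("computing return value")
        return "result"
    finally:
        # Runs before the caller receives the value.
        events.append("finally block")


value = function()
events.append("caller got %r" % value)
print(events)
```

The finally entry is recorded between the return statement and the moment the caller receives the value.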

For the typos now:

The first thing I did wrong: when I started the book, I wanted to run unit tests on the book itself, as I did for my previous book, to avoid those mistakes. That said, the previous one was written in LaTeX, which is quite simple to interact with, and this one is in OpenOffice, because that is how the editor works. I would have had to write a script to extract the Python code from the OOo file in order to unit test it. I didn’t. I simply ran out of time, as usual when you have deadlines on books.

The second thing I did wrong: I should have told the editor to wait a bit, I didn’t.

But Packt does print on demand, so I know that the Errata page I am maintaining here: http://atomisator.ziade.org/wiki/Errata is being processed by the editor, and that the typos will be removed from the book at some point, without having to wait for a second edition.

I’ll update this blog entry as soon as I know the status on this.

I am really sorry Calvin, and all the people that are suffering from these typos.

      

November 9, 2008
» How to receive email alerts when someone talks about something - 6 steps tutorial using Atomisator


I like Google Alert, the idea of receiving a mail every