By Ingeniweb. A Django site.
July 19, 2009
» The strange world of packaging – forking setuptools


Once again, just like a year ago, people have had enough of the fact that the setuptools project has not been maintained for 9 months.

Phillip Eby explained that he doesn’t have time to maintain it unless someone pays him to. In the meantime, though, he hasn’t blessed anyone else to do it. Well, he has blessed some people (Ian Bicking and Jim Fulton), but unfortunately they are not willing to take it on because they have a lot of other projects going on. Other people who could maintain it, including me, fall into his “unqualified people” category :)

So we are all locked in a strange situation where tons of patches are sitting in the setuptools tracker, ready to be committed, but are not making it in. Several non-public forks have started to appear, of course.

So, once again, I decided with some other people to create a fork, called “Distribute”. It’s a real fork, located here: http://bitbucket.org/tarek/distribute.

By real, I mean that this fork was not created to force Phillip to do a release, like we did last year for the 0.6c9 release, but to free us from this strange situation where we all depend on his will and (lack of) time.

The plan is to release a first version next week that corresponds to the setuptools 0.6 branch, with some patches applied.

Next, we are planning to start a 0.7 version in which the code will be split into several distributions:

  • a distribution for pkg_resources
  • a distribution for the setuptools package itself
  • a distribution for easy_install

A little bit of bikeshedding is going on to pick a name for that fork, and we ended up running a poll. (vote!)

Now, right after I announced this plan on Distutils-SIG, Phillip reacted by announcing a similar plan, i.e. splitting the setuptools project into several distributions. But since he previously said that he didn’t have the time to do it, I doubt it will work out unless he opens its development to a wider range of developers and maintainers.

That’s the strange world of packaging…

March 26, 2009
» Packaging Survey first results


Around 570 people answered the survey, which is a great number I didn’t expect. Thanks again to Massimo for his help on this.

I have a lot of work ahead to read all the answers to the open questions, and all the comments that go with the “other” answers, but I wanted to publish the results of the closed questions before the summit.

I don’t want to comment on the results yet. I will once I have studied all the answers, so it’ll be a little while ;)

Who are you ?

Professional developer using Python exclusively.
283
Professional developer using Python unable to use Python “at work”.
34
Professional developer using Python sometimes.
196
Hobbyist using Python.
116

Where are you located ?

USA
212
Western Europe
268
Eastern Europe
42
Asia
18
Africa
9
Other
70

If you are a web programmer, what is the framework you use the most ?

Pylons
55
TG 2
14
TG 1
15
Django
184
Zope (including Plone)
137
Other
207

How do you organize your application code most of the time ?

I put everything in one package
171
I create several packages and use a tool like zc.buildout or Paver to distribute the whole application
137
I create several packages and use a main package or script to launch the application
198
I use my own mechanism for aggregating packages into a single install.
67

For libraries you don’t distribute publicly, do you create a setup.py script ?

Yes
321
No
249

What is the main tool or combination of tools you are using to package and distribute your Python application ?

None
80
setuptools
150
distutils
127
zc.buildout and distutils
10
zc.buildout and setuptools
107
Paver and setuptools
9
Paver and Distutils
3
Other
64

How do you install a package that does not provide a standalone installer (but provides a standard setup.py script) most of the time ?

I use easy_install
241
I download it and manually run the python setup.py install command
139
I use pip
34
I move files around and create symlinks manually.
7
I use the packaging tool provided in my system (apt, yum, etc)
81
Other
33

How do you remove a package ?

manually, by removing the directory and fixing the .pth files
275
I use one virtualenv per application, so the main python is never polluted, and only remove entire environments.
154
using the packaging tool (apt, yum, etc)
178
I don’t know / I fail at uninstallation
79
I change PYTHONPATH to include a directory of the packages used by my application, then remove just that directory
31
Other
10

How do you manage using more than one version of a library on a system ?

I don’t use multiple versions of a library
217
I use virtualenv
203
I use Setuptools’ multi-version features
46
I build fresh Python interpreter from source for each project
16
I use zc.buildout
109
I set sys.path in my scripts
48
I set PYTHONPATH to select particular libraries
49
Other
23

Do you work with setuptools’ namespace packages ?

Yes
178
No
344

Has PyPI become mandatory in your everyday work (if you use zc.buildout for example) ?

Yes
228
No
294

If you previously answered Yes, did you set up an alternative solution (mirror, cache..) in case PyPI is down ?

Yes
77
N/A
277
No
166

Do you register your packages on PyPI ?

Yes
239
No
281

Do you upload your package on PyPI ?

Yes
205
No
314

If you previously answered No, how do you distribute your packages ?

On my own website, using simple links
139
On my own website, using a PyPI-like server
50
On a forge, like sourceforge
N/A
251
Other
56

March 9, 2009
» Take the Python Packaging Survey


The Python Language Summit is coming up. To prepare for this event, I have put a survey online that you can take to tell us a bit more about yourself and how you package your Python applications.

Thanks to all the people who helped build the survey, and special thanks to Massimo Di Pierro, who created the application that runs the survey and helped me set it up. It runs under web2py.

November 26, 2008
» Python package distribution - my current work


I found a bit of time to work on distribution matters. Here’s a status update on what I am doing there.

There are two topics I am focusing on right now.

  • clean up and enhance Python’s distutils package
  • implement the mirroring infrastructure at PyPI

distutils work

Nathan Van Gheem proposed a cool patch for collective.dist (this package is a port of the new features I have added to distutils, so they are available in 2.4 and 2.5).

His patch makes it possible to avoid storing the password in the .pypirc file; when no password is stored, the user is prompted for it instead. This is something that had been in my pile for a long time.
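To make the idea concrete, here is a minimal sketch of that behavior. It is not the actual patch: the function name read_credentials and its injectable prompt parameter are hypothetical, and the only thing taken from distutils is the classic [server-login] layout of the .pypirc file.

```python
import configparser
import getpass


def read_credentials(pypirc_text, prompt=getpass.getpass):
    """Return (username, password), prompting when no password is stored.

    Hypothetical helper, not the distutils implementation.
    """
    # RawConfigParser: no "%" interpolation, so any password is read verbatim
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    username = parser.get("server-login", "username")
    # A missing password is not an error: it is what triggers the prompt
    password = parser.get("server-login", "password", fallback=None)
    if not password:
        password = prompt("Password for %s: " % username)
    return username, password
```

Passing the prompt as a parameter is only a convenience for testing; by default the password is asked for on the terminal without being echoed.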

I have added a few things to Nathan’s patch, and a test, and proposed it to Python. I am now waiting for its integration in 2.7 trunk: http://bugs.python.org/issue4394. If it’s accepted, I will backport it to collective.dist.

There are some other tickets I am waiting to be accepted:

I am not sure when those will be integrated. The average time for the integration of tickets in distutils in Python is between 6 months and 8 months. hihihi. :D

PyPI mirroring

The job I am doing in PyPI will be done in three phases:

  • Phase 1: implement the mirroring infrastructure in PyPI
  • Phase 2: promote it, and propose patches for the mirroring tools out there so they use the protocol
  • Phase 3: promote and propose patches for pip so it can use the mirrors efficiently (fail-over and nearest mirror infrastructure).

Phase 1: so far, so good.

With some insights from Richard Jones and Martin von Löwis, I am currently implementing the mirroring infrastructure for PyPI that we defined during the D.C. sprint (I still owe a blog entry about this sprint). The code lives in a branch in the Python svn folder dedicated to PyPI.

The idea of the mirroring infrastructure is to be able to get a list of official mirrors for PyPI that can be used as alternative sources. (It is described here: http://wiki.python.org/moin/PEP_374). A great behavior would be for the client application to interact with the nearest mirror automatically, and to switch to another one if it goes down.
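The fail-over part of that behavior could look like the sketch below. Everything in it is an assumption made for illustration: the function name fetch_with_failover is hypothetical, the list of mirrors is supposed to be already known and sorted (for instance nearest first), and picking the nearest mirror is not covered.

```python
import urllib.request


def fetch_with_failover(path, mirrors, opener=urllib.request.urlopen, timeout=5):
    """Try each mirror in order; return (mirror, body) from the first that answers."""
    last_error = None
    for mirror in mirrors:
        url = mirror.rstrip("/") + "/" + path.lstrip("/")
        try:
            with opener(url, timeout=timeout) as response:
                return mirror, response.read()
        except OSError as exc:
            # urllib.error.URLError and socket timeouts are OSError subclasses:
            # this mirror is down or unreachable, so move on to the next one
            last_error = exc
    raise RuntimeError("no mirror could serve %s (last error: %s)" % (path, last_error))
```

The opener parameter is there so the function can be exercised without network access; a real client would probably also remember which mirror failed, to avoid retrying it on every request.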

So, a list of mirrors will be made available at /mirrors, and from there the client applications will be able to use an alternative location for every package. The hardest part concerns the stats: we want PyPI to display the download counts for each package by summing the downloads from every mirror.

So every mirror will have to provide its “local stats”, which PyPI can then visit. That’s the biggest part of the work I am doing. The code will build the stats for PyPI by parsing its Apache log file. And hopefully, this code will be reusable by the mirrors themselves, so they can build their stats the same way.
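As a rough illustration of the stats work (not the code living in the branch), the sketch below counts successful downloads from Apache access log lines and then sums the local stats of several mirrors. The function names, the /packages/ URL prefix and the choice of counting per file name are all assumptions.

```python
import re
from collections import Counter

# Request part of an Apache common/combined log line, e.g.
# "GET /packages/source/f/foo/foo-1.0.tar.gz HTTP/1.1" 200 1234
REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')


def local_stats(log_lines, prefix="/packages/"):
    """Count successful (status 200) downloads per file name under `prefix`."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match is None or match.group("status") != "200":
            continue
        path = match.group("path")
        if path.startswith(prefix):
            # Keep only the file name, e.g. foo-1.0.tar.gz
            counts[path.rsplit("/", 1)[-1]] += 1
    return counts


def merge_stats(per_mirror_stats):
    """Sum the local stats published by every mirror into global counts."""
    total = Counter()
    for stats in per_mirror_stats:
        total.update(stats)  # Counter.update adds counts instead of replacing
    return total
```

Since the same local_stats function works on any Apache log, a mirror could run it on its own log and publish the result, which is the kind of reuse hoped for above.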

Of course, this infrastructure could be used for any PyPI-compatible server, even if it is not a mirror of PyPI (like a private PyPI server).

Phase 2 will consist of promoting the infrastructure to the mirroring software out there. Maybe PyCon will be a good place for that.

Phase 3 is the most interesting one : make sure the client applications use the mirrors ! I think Ian Bicking’s pip project could be the right place for these innovations.

Next topics in the pile:

  • index-merging: describe in a PEP-like document the index-merging feature that would allow clients to merge several indexes whose content differs. For example: PyPI + a private PyPI server. I wrote a first draft of such a patch for setuptools in the past (http://bugs.python.org/setuptools/issue32), but lately I have lost all hope of seeing this project move forward.
  • Brainstorming: try to understand the Python Packaging Paradox. That is: how come the community, which is composed of many brilliant people, is unable to move forward in packaging matters?
  • Distribute the return :D
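For the index-merging item above, here is one very rough sketch of what merging could mean. The data model (a mapping from project name to a list of release files) and the rule that earlier indexes take precedence are assumptions made for illustration; they are not taken from the setuptools patch.

```python
def merge_indexes(*indexes):
    """Merge {project: [release files]} mappings from several indexes.

    Indexes are processed in the order given, so when the same file
    appears in two indexes, the entry from the earlier index is kept.
    """
    merged = {}
    for index in indexes:
        for project, files in index.items():
            known = merged.setdefault(project, [])
            for filename in files:
                if filename not in known:
                    known.append(filename)
    return merged
```

With PyPI passed first and a private server second, a project that exists on both would end up with the union of its release files, and a project that only exists privately would simply be added.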
      

December 26, 2007
» PloneSoftwareCenter Christmas mini-sprint


I did a mini-sprint on PSC for Christmas, since everyone around me was sitting watching Christmas movies on TV and trying to digest.

Here’s a wrapup for comment, and for upcoming work.

Current branch (pypi)

I’ve merged Sidnei’s work with the current trunk into a new branch, since his work was done 2 years ago. I have made a few changes to his initial implementation:

  • the PyPI API is now coded in a browser view instead of a persistent object, since it has no properties to keep at all;
  • when a release is uploaded, a new release object is created for the given version if it doesn’t exist, instead of raising an error and asking the user to create it manually inside the PSC;
  • the doctest was simplified and uses sample tarballs and eggs.

I need to finish up a few things and to add some features such as:

  • automatic project creation. When a package is uploaded and no project corresponds to it, a new project is created using the egg name and the provided metadata. This will make the PSC act like the CheeseShop (an option will be added in PSC to activate/deactivate this feature, to prevent the automatic creation of projects when it is not wanted);
  • trove web service. The TROVE.txt file created by Sidnei needs to be replaced by a call to the categories (see next section);
  • multiple uploads. Make sure everything works fine when several files are uploaded for one release;
  • more tests. I need to write more tests with various clients and platforms to make sure it works well (by recording setuptools/distutils calls and creating tests from them).

This work should be done this week if everyone is OK with what I have proposed.

About the Trove classification

The Cheeseshop provides a Trove classification (see http://www.python.org/dev/peps/pep-0301) which evolves. For instance the “Django framework” category was added last week IIRC.

Obviously, Plone eggs should follow this classification but when they are uploaded in a PloneSoftwareCenter they might find specific categories defined locally (these categories might be specific to the project). I think we should let people freely define their classifiers in setup.py and let each server take the ones they have in their list.

The problem with the Cheeseshop implementation is that it fails silently when an item in the ‘classifiers’ list doesn’t exist on the server side. The package metadata seems to be lost after that (this looks like a bug to me, but I haven’t dug into the PyPI code yet). I need to ask on distutils-sig about this and see if we can come up with a Cheeseshop that picks the categories it knows out of the classifier list and leaves the others alone. This would allow PSC to deal with extra categories.
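The behavior I would like from the server side can be sketched in a few lines. This is only an illustration of the idea, not Cheeseshop or PSC code, and the function name split_classifiers is hypothetical.

```python
def split_classifiers(submitted, known):
    """Separate the classifiers a server recognises from the ones it does not.

    Returns (accepted, ignored); both lists keep the submission order.
    """
    known = set(known)
    accepted = [c for c in submitted if c in known]
    ignored = [c for c in submitted if c not in known]
    return accepted, ignored
```

A server would store the accepted list and simply skip the ignored one, instead of failing on the first classifier it does not know.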

Then the PSC will have to implement the trove web services and serve its categories, so the “list-classifiers” option of the register command works.

Until then I guess we can leave the classification settings manual.

About the .pypirc file

This file, which is used by the register command, works for just one server and does not allow several sets of login/password. This is not a problem when your login on the Plone site that hosts your PSC is the same as the one on PyPI. Otherwise it won’t work.

Furthermore, this command uses a hardcoded ‘pypi’ realm, as you can see in distutils/command/register.py:

auth.add_password('pypi', host, username, password)

The real solution here is to make distutils evolve so that the .pypirc file can hold a login/password pair for each server, using the host URL and the realm (the realm can be queried automatically too). Until then, we have to make the PyPI API return a ‘pypi’ realm on a 401 error (and this was done in Sidnei’s work).

To avoid maintaining several .pypirc files and forcing the realm on 401 errors, though, we should create a new register command that works with an enhanced version of the file and allows adding passwords for several hosts. IMHO, the distutils package code is PyPI-centric but was primarily intended to work with any release server, so it has to evolve on that point.
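Here is a sketch of what reading such an enhanced file could look like. The file layout (a [distutils] section listing the servers, then one section per server with its own repository, realm, username and password) is an assumption made for illustration, as are the function name read_servers and the example.org URL.

```python
import configparser

# Hypothetical multi-server .pypirc, one section per server
SAMPLE = """\
[distutils]
index-servers =
    pypi
    plone

[pypi]
username: tarek
password: secret

[plone]
repository: http://example.org/psc
realm: Zope
username: tarek_on_plone
password: other-secret
"""

# Assumed defaults when a server section does not override them
DEFAULT_REPOSITORY = "http://pypi.python.org/pypi"
DEFAULT_REALM = "pypi"


def read_servers(pypirc_text):
    """Return {server name: settings dict} for every listed server."""
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    servers = {}
    for name in parser.get("distutils", "index-servers").split():
        servers[name] = {
            "repository": parser.get(name, "repository", fallback=DEFAULT_REPOSITORY),
            "realm": parser.get(name, "realm", fallback=DEFAULT_REALM),
            "username": parser.get(name, "username"),
            "password": parser.get(name, "password", fallback=None),
        }
    return servers
```

A register command built on this could then call auth.add_password with the realm, repository, username and password of the selected server, instead of the hardcoded ‘pypi’ realm.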

About sdist and bdist_egg

In my latest entry, we discussed having a new command to upload the package to the PSC:

 $ python setup.py plone_upload

The idea is to be able to upload the release to plone.org and to the Cheeseshop in one step. People reacted to this because in my example I used bdist_egg instead of sdist for the packaging. I think it’s a false debate, because it’s up to the developer to decide how to release the work: as a tarball that is compiled on the target system, and/or as an egg.

So we can just define an enhanced upload command that replaces the original one, to automate the upload on several servers, and let the developer manually call sdist and/or bdist_egg.

Servers could be picked up by the user at the prompt.

Schedule & tasks

These are my plans for PSC this week and next week:

  • finish my current work on the branch, so the basic implementation works;
  • provide an enhanced version of the register and upload commands, for uploads to multiple servers. This will be done in a new package, since it’s more of a distutils enhancement;
  • implement the trove webservice using PSC categories.