By Ingeniweb. A Django site.
February 6, 2010
» Plone and multilingual sites

Usually we build multilingual Plone sites with LinguaPlone.

This solution has one big advantage: it is generic and very easy to set up in a Plone site.

But it also has many drawbacks, which stem from the design of the product (translations are independent by design):

  • Each translation is a new Archetypes object. This can be a big problem on sites with a lot of content, because the number of objects in the portal is multiplied by the number of available languages.
  • Translations are linked to the original (called canonical) object through the Plone reference catalog. But when objects are moved, their translations are not moved; when objects are copied and pasted, translations are not pasted; when objects are deleted, translations are not deleted; when content is reordered, translations are not reordered; when objects are published, translations are not published; and so on. For webmasters, maintaining a site that uses LinguaPlone can be a challenge.
  • When a folder is translated with LP, all translated content is moved from the canonical folder to the translated folder of the same language. Translating low-level folders on sites with a deep tree can take a very long time, so don't be surprised if you get errors.
  • If a content item is neutral (it has no language attribute) and sits inside a translated folder, it cannot be seen when browsing a translation of the parent folder.
  • Finally, a lesser problem: the translation edit forms are not pleasant to use. They show a table with two columns: the first holds the « canonical » content in « view » mode, the second holds the translation edit form. On fixed-width sites, the translation form is sometimes ridiculously narrow.

Still, we have used LinguaPlone for many years because it was the only easy way to build multilingual Plone sites.

In the past we used an LP patch called LinguaFace to reduce the number of problems with LP (synchronisation on reorder, copy-paste, delete or move; neutral content visible inside all translated folders; more usable translation edit forms; …). But LF adds a new layer of complexity, and keeping it working with every LP version has become complicated. Here are some examples of how it works:

  • when a content item is copied, all its translations are copied
  • when a content item is pasted or moved, all its translations are pasted or moved to the right place (not so easy)
  • when a content item is published or retracted, its translations follow the same workflow transition
  • when a folder is translated, we show not only the objects inside it but also the objects with the same canonical path (a new catalog index), so that neutral content is visible too
  • the navtree and the breadcrumbs are patched to use the canonical path
  • and so on …

A big nightmare.

Today a new solution exists that stores translations inside each Archetypes field: raptus.multilanguagefields, along with a Plone integration called raptus.multilanguageplone that extends the schema of all Plone content types to make them translatable. raptus.multilanguagefields also provides multilingual catalog indexes, which return the correctly translated data when searching for content or displaying trees, and multilingual criteria for topics.

One LP feature that raptus.multilanguagefields does not provide is the internationalization of URLs. If you really need it, I think adding some traversal rules would not be a big challenge; for me it is not essential.

Another LP feature that cannot be provided when translations are stored in fields is a different workflow or different security settings for each translation. If you need this, use LinguaPlone: it is made for that, since it keeps all translated content independent. But I am curious to know how many users really want this feature. None of my clients ever asked for it, and after some experience with LP they always ended up wanting the exact opposite.

To make your Plone site translatable with raptus.multilanguagefields, you have two choices:

  1. Add raptus.multilanguageplone to your buildout and install it in your Plone site through the add-on products control panel. This makes all Plone content types, and the types derived from them, translatable (every field for which translation makes sense becomes translatable).
  2. Integrate raptus.multilanguagefields yourself inside a product. This is what I prefer, since we may want only some fields or content types to be translatable. For example, I don't need to translate the content of images or files, just their titles and descriptions.

How do you implement the second solution?

Just take a look at the raptus.multilanguageplone code; it's easy:

  1. Write Archetypes extenders that make the fields you want translatable (an example can be found here)
  2. Register your extenders in setuphandlers (a Generic Setup import step), example here
  3. Replace the standard catalog indexes with multilingual indexes in the Generic Setup profile, example here

That's all. You get superb edit forms with translatable fields, plus help from Google to translate content (a pleasant gadget). I say « Bravo » to the Raptus developers.

Important:

  • These products are young, and there is still a lot of work to do before they run without problems (tests are needed …). The latest 0.6 releases have bugs under Plone 3.3; use the svn versions below instead.
  • raptus.multilanguageplone 0.6 has a bug in extenders with primary fields, tested in Plone 3.3 (fixed in the aws_evols branch, not tested with images at this time).
  • raptus.multilanguagefields 0.6 has a bug: doctests are broken in Plone 3.3 (fixed in trunk).
  • At this time these products have no unit tests or functional tests; it is the only criticism I can make. I started the work here, and here.

January 12, 2010
» DateTime against mx.DateTime

I noticed that a Zope application I have to maintain is slow because of the DateTime class.
Profiling the application's tests shows this class at the top of the time spent.
So I wanted to test another implementation, named mx.DateTime. The difference is that mx.DateTime is written in C.
In a terminal, I installed the two eggs with easy_install:

./bin/easy_install DateTime
./bin/easy_install egenix-mx-base

Then I wrote a small script to test the APIs:

import sys
from time import time

from mx import DateTime as mxDateTime   # C implementation (egenix-mx-base)
from DateTime import DateTime           # Zope implementation
from datetime import datetime           # standard library

def create_mxdatetime():
    return mxDateTime.now()

def create_zopedatetime():
    return DateTime()

def create_datetime():
    return datetime.now()

def bf(f, n):
    # Time n successive calls of the factory f.
    t1 = time()
    for _ in xrange(n):
        f()
    t2 = time()
    print "bench for %s is %s" % (str(f), t2 - t1)

# The number of iterations is given on the command line.
n = int(sys.argv[1])
bf(create_mxdatetime, n)
bf(create_zopedatetime, n)
bf(create_datetime, n)

This script just creates dates with the three implementations: Zope, mx and the standard library.

And the results:

bash-3.2$ bin/python bench.py 1000
bench for function create_mxdatetime at 0x7db70 is 0.00120091438293
bench for function create_zopedatetime at 0x4215f0 is 0.84446310997
bench for function create_datetime at 0x4214b0 is 0.00220394134521
bash-3.2$ bin/python bench.py 10000
bench for function create_mxdatetime at 0x7db70 is 0.0117778778076
bench for function create_zopedatetime at 0x4215f0 is 8.81699991226
bench for function create_datetime at 0x4214b0 is 0.041069984436
bash-3.2$ bin/python bench.py 100000
bench for function create_mxdatetime at 0x7db70 is 0.11746096611
bench for function create_zopedatetime at 0x4215f0 is 87.8845770359
bench for function create_datetime at 0x4214b0 is 0.222129106522

No comment. And what about memory?

So

bash-3.2$ bin/easy_install pympler

and now:

>>> from pympler import asizeof
>>> from datetime import datetime
>>> from mx import DateTime as mxDateTime
>>> from DateTime import DateTime
>>> asizeof.asizesof(DateTime() , mxDateTime.now(), datetime.now())
(1760, 56, 32)

Whoa! A Zope DateTime is roughly 750 times slower to create than an mx.DateTime (87.88 s against 0.117 s for 100,000 objects), and it takes about 31 times more memory (1760 bytes against 56).
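
As a quick sanity check, the two ratios can be recomputed from the figures printed above (the 100,000-iteration run and the asizesof output). This is only arithmetic on the reported values, not a new measurement, and the variable names are mine.

```python
# Timings in seconds, copied from the 100000-iteration run above.
zope_time = 87.8845770359
mx_time = 0.11746096611

# Sizes in bytes, copied from the asizesof() output above.
zope_size = 1760
mx_size = 56

# How many times slower and bigger a Zope DateTime is than an mx.DateTime.
speed_ratio = zope_time / mx_time
size_ratio = zope_size / float(mx_size)

print("creation time ratio: %.0f" % speed_ratio)
print("memory size ratio:   %.1f" % size_ratio)
```

With these inputs the time ratio comes out a little under 750 and the size ratio a little over 31.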

I think replacing DateTime could be a good performance win for Zope, no?

With DateTimeNG (Zope DateTime built on mx.DateTime), performance is 10 times better than with DateTime. Memory consumption is the same.

From the mx.DateTime documentation:
Comparing the types to time-module based routines is not really possible,
since the used strategies differ. You can compare them to tuple-based
date/time classes though: DateTime[Delta] are much faster on creation, use
less storage and are faster to convert to the supported other formats than
any equivalent tuple-based implementation written in Python.
Creation of time-module values using time.mktime() is much slower than
doing the same thing with DateTime(). The same holds for the reverse
conversion (using time.localtime()).
The storage size of ticks (floats, which the time module uses) is about 1/3
of the size a DateTime instance uses. This is mainly due to the fact that
DateTime instances cache the broken down values for fast access.
To summarize: DateTime[Delta] are faster, but also use more memory than
traditional time-module based techniques.


December 23, 2009
» Migrating your old sites to Plone 3 or Plone 4


Many of you share this concern, and so do I: migrating your old CPS or Plone 2 sites to a more mature and more durable technology, namely Plone 3 or Plone 4.

This post does not claim to solve these questions with Products.contentmigration, a sophisticated and nonetheless interesting framework. More humbly, it gives you a few handy tricks.

For example, you don't know how to get rid of those wretched Access Rules which, and it's not as if you hadn't been warned, were dangerous. It's easy; you need one small import:

from ZPublisher.BeforeTraverse import unregisterBeforeTraverse

and on each migrated object:

rules = unregisterBeforeTraverse(obj, 'AccessRule')
if rules:
    try:
        del getattr(obj, rules[0].name).icon
    except:
        pass

Let's move on to something stupid but with heavy consequences. When you migrate content from one meta_type to another, most of the time you have to copy and paste the object and reassign all its old attributes. But suppose the object is structurally the same, or nearly so: why bother (imagine how much it can hurt for a folder at the root of a site)? All you need is:

obj.__class__ = new_class

you are smart enough to find new_class and new_portal_type…

and, if the result is an Archetypes object:

obj.updateSchema()

This is an example not to follow, but like all examples of this kind it is very handy (it saves days or weeks of headaches).

There will probably be a follow-up to this post.

December 9, 2009
» How to add a counter without conflict errors in Zope?

I noticed that, under load, a counter in Zope 2 can generate conflict errors.

Why?

Because two threads want to change the value of the same variable. Conflict errors are explained here:
http://wiki.zope.org/zope2/ConflictErrors

But in some cases it is useful to have a global counter that is incremented on certain operations. CacheFu has such counters for caching purposes. Under load, however, these counters generate conflict errors. There is a solution: resolve the conflict by hand.
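
To make "resolve the conflict by hand" concrete, here is a minimal sketch of the hook ZODB calls when two transactions change the same object: `_p_resolveConflict(old_state, saved_state, new_state)`. The class `MergingCounter` is hypothetical and deliberately does not inherit from `persistent.Persistent`, so that it runs without ZODB installed; in a real tool it would. The additive merge shown here is, as far as I know, the strategy used by `BTrees.Length`; it is not necessarily what `Increaser` does.

```python
class MergingCounter(object):
    """Hypothetical counter whose concurrent increments can be merged.

    In a real Zope tool this class would subclass persistent.Persistent;
    it is a plain class here so the sketch runs without ZODB.
    """

    def __init__(self, value=0):
        self.value = value

    def increment(self, amount=1):
        self.value += amount
        return self.value

    # Store only the integer as the object's state, so the three states
    # handed to _p_resolveConflict are plain ints.
    def __getstate__(self):
        return self.value

    def __setstate__(self, state):
        self.value = state

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state:   value both transactions started from
        # saved_state: value already committed by the other transaction
        # new_state:   value this transaction is trying to commit
        # Keep both sets of increments instead of raising a ConflictError.
        return saved_state + new_state - old_state


# Two transactions both start from 10; one adds 1, the other adds 2.
merged = MergingCounter()._p_resolveConflict(10, 11, 12)
print(merged)
```

The merge only makes sense because increments commute: whichever transaction commits first, the final value is the same.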

In the Zope source I noticed a class that implements this use case, so I tried adding a counter based on it. I then tested two implementations of the counter: one that is simply an int, and one that is a Products.Transience.Transience.Increaser.

The init code of the first counter looks like this:

tool.counter = Products.Transience.Transience.Increaser(0)

The second counter looks like this:

tool.counter2 = 0

I tested incrementing both implementations under load with siege:

First test (Increaser)

Transactions:		         200 hits
Availability:		      100.00 %
Elapsed time:		        6.68 secs
Data transferred:	        0.00 MB
Response time:		        0.10 secs
Transaction rate:	       29.94 trans/sec
Throughput:		        0.00 MB/sec
Concurrency:		        2.97
Successful transactions:         200
Failed transactions:	           0
Longest transaction:	        0.44
Shortest transaction:	        0.01

0 conflict errors

Second test (int)

## 20 users

Transactions:		         200 hits
Availability:		      100.00 %
Elapsed time:		       11.35 secs
Data transferred:	        0.06 MB
Response time:		        0.26 secs
Transaction rate:	       17.62 trans/sec
Throughput:		        0.00 MB/sec
Concurrency:		        4.56
Successful transactions:         197
Failed transactions:	           0
Longest transaction:	        6.09
Shortest transaction:	        0.01

49 conflicts (3 unresolved)


November 13, 2009
» Memory Profiler for zope

I have just released a small tool to detect memory leaks in Zope 2, called Products.MemoryProfiler.

It uses heapy (http://guppy-pe.sourceforge.net/#Heapy) internally; it is just an interface to that tool.

It provides an HTTP interface in the Zope control panel to inspect the current memory.

When you start profiling, you take a snapshot of the memory at instant t.
When you click updateSnapshot, the memory profiler tells you which objects were added between the start and the updateSnapshot click. This is useful to detect memory leaks.
Each snapshot is stored (as a string) in MemoryProfiler so it can be consulted later (through a link labelled with its date).
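
The snapshot-and-compare idea can be tried outside Zope. The sketch below uses the standard library's tracemalloc module (Python 3) rather than heapy, a substitution on my part so that it runs without compiling guppy. `leaky_cache` is a made-up stand-in for leaking code.

```python
import tracemalloc

leaky_cache = []  # hypothetical leak: objects appended and never released

tracemalloc.start()
before = tracemalloc.take_snapshot()   # equivalent of "start profiling"

for i in range(10000):
    leaky_cache.append("item number %d" % i)

after = tracemalloc.take_snapshot()    # equivalent of "update snapshot"
tracemalloc.stop()

# Group allocations by source line and list what changed between snapshots.
growth = after.compare_to(before, "lineno")
for stat in growth[:3]:
    print(stat)
```

The entries with the largest positive size difference point at the lines that allocated memory still alive at the second snapshot.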

The "clear db cache" button clears the ZEO caches of all mount points, so you can see how much memory those caches take.

Windows users must compile guppy. There is an egg for Python 2.6 but
none for Python 2.4, and I get a fatal error when compiling guppy with MinGW. I hope we will soon have a binary egg for Python 2.4.

I hope this tool gives us useful information about the memory consumed by Zope.

[Image: memory detail]

August 31, 2009
» How to configure a custom Vary tag for Squid

You want to set up an authenticated cache with Apache, Squid or Varnish, and Plone, and you don't
know how to do it: it is possible with the Vary tag.

The Vary header tells the proxy cache which request headers make a cached object vary.
For example, if you tell the cache that the variant is Cookie, then for the same URL with different cookie values the cached result is different.
The server tells the proxy (in the response) which headers are taken into account by sending Vary: followed by a list of request header names.
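
To picture what the proxy does with Vary, here is a small sketch of a cache that keys its entries on the URL plus the request headers named in the response's Vary header. It is a toy model, not Squid's actual implementation; `TinyVaryCache` and its methods are invented for illustration.

```python
class TinyVaryCache(object):
    """Toy model of a proxy cache that honours the Vary response header."""

    def __init__(self):
        self.vary_by_url = {}   # url -> request header names listed in Vary
        self.entries = {}       # (url, header values) -> cached body

    def _key(self, url, request_headers):
        # request_headers is expected to use lower-case names;
        # a header missing from the request counts as "".
        names = self.vary_by_url.get(url, ())
        values = tuple(request_headers.get(name, "") for name in names)
        return (url, values)

    def store(self, url, request_headers, vary, body):
        names = [n.strip().lower() for n in vary.split(",") if n.strip()]
        self.vary_by_url[url] = tuple(names)
        self.entries[self._key(url, request_headers)] = body

    def lookup(self, url, request_headers):
        return self.entries.get(self._key(url, request_headers))


cache = TinyVaryCache()
cache.store("/news", {"cookie": "lang=fr"}, "Cookie", "page en francais")
cache.store("/news", {"cookie": "lang=en"}, "Cookie", "page in English")

print(cache.lookup("/news", {"cookie": "lang=fr"}))   # the French variant
print(cache.lookup("/news", {"cookie": "lang=de"}))   # None: not cached yet
```

The same URL therefore maps to several cached bodies, one per distinct value of the varying header.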

In CacheFu, you can configure this per rule with varyExpression.

In the global CacheFu configuration you can also set a global Vary header. By default this setting is sent with rule.portal_cache_settings.getVaryHeader()

You can activate or deactivate Vary in the header_set configuration (vary field).

The headers named in Vary must be present in the browser's request (not in the response) to be treated as variants by the proxy cache. So we are limited to the standard headers of the HTTP protocol.

But with cookies and Apache (Apache sits in front of Squid) we can work out a strategy to build a more effective Vary tag.

The second aspect of cache work is purging content when it changes.

PURGE of Vary objects is still very poorly supported in Squid: you can only purge one variant at a time, and the URL needs to be cached again before you can purge another variant. So how do we deal with that as well?

First, how do we build our Vary tag?

The trick is to build a custom Vary tag with Apache.
We can do this with a RewriteRule:

RewriteCond %{HTTP_COOKIE} mycookie="([^"]+) [NC]
RewriteRule ^(.*)$ - [E=mycookie:%1]

So in this example the environment variable mycookie contains the value of the mycookie cookie.

You can add a cookie for the language, a cookie for the group, a cookie for a permission and so on, and then build your custom Vary tag from the values of these specific cookies with mod_headers:

RequestHeader append MyVary %{mycookie}e

The value of mycookie is then treated as a variant (provided the response lists MyVary in its Vary header).

If you want a specific Vary tag for anonymous users, you can test for the presence of
the __ac cookie and send a custom MyVary in that case:

RewriteCond %{HTTP_COOKIE} __ac="([^"]+) [NC]
RewriteRule ^(.*)$ - [E=authenticated:1]

RequestHeader append MyVary %{mycookie}e env=authenticated
RequestHeader append MyVary anonymous env=!authenticated

With that you can vary the cache as you want. Now, how do we handle the big problem of purging?

The trick is to have an image (or an AJAX request, or similar) in the content that is never cached. This image is served by a browser view (in the case of a Zope application) that sets a cookie. The value of this cookie is added to the Vary tag, so the Vary tag changes when the cookie value changes and the content is then refreshed (for all requests).

For example, we can build a cookie from the catalog change counter:

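# pcs is not defined in this snippet; it is presumably the
# portal_cache_settings tool.
catalog_count = pcs.getCatalogCount()
# This response must never be cached.
context.REQUEST.RESPONSE.appendHeader('Pragma', 'no-cache')
context.REQUEST.RESPONSE.appendHeader('Cache-control', 'no-cache')
cookie = context.REQUEST.cookies.get('X_CACHE_CATALOG', 0)

# Refresh the cookie only when the catalog counter has changed.
if cookie != str(catalog_count):
    context.REQUEST.RESPONSE.setCookie('X_CACHE_CATALOG',
                                       catalog_count,
                                       path="/")
return catalog_count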

And in Apache we add:
RewriteCond %{HTTP_COOKIE} X_CACHE_CATALOG=([^"]+) [NC]
RewriteRule ^(.*)$ - [E=X_CACHE_CATALOG:%1]
RequestHeader append MyVary %{mycookie}e:%{X_CACHE_CATALOG}e env=authenticated

When the catalog changes, the Vary value changes too (on the second request) and the cache is updated. You can work out other strategies for purging Vary objects with this technique.

The last point is combining the ETag and Vary headers in the response. With a Vary header, IE does not handle the ETag header correctly and never sends If-None-Match. So remove the Vary header in Apache, and ETag then works well in all browsers:

Header unset Vary


June 9, 2009
» Retrieving UserPrincipalName with NetBiosDomain\NetBiosLogin

Sometimes the hard part of a Python application is integrating SSO, because there is one unknown: which rules are defined to identify the user!

On Windows, Apache mod_sspi or Enfold Proxy gives us an HTTP header (named X_REMOTE_USER) to work with Active Directory. This header looks like this:

Domain\user

If you have a single domain, it is pretty simple: your user id is unique.

But big companies have multiple domain controllers, and the user name is not unique! So how do I retrieve a unique user id from Active Directory and use it in my Windows Python application?

The answer is to get the UserPrincipalName through COM and the NameTranslate interface.

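import win32com.client
d = win32com.client.Dispatch('NameTranslate')
d.Init(3, '')                  # 3 = ADS_NAME_INITTYPE_GC (use the global catalog)
d.Set(3, 'domain\\user')       # 3 = ADS_NAME_TYPE_NT4 (NetBIOS domain\login)
userPrincipalName = d.Get(9)   # 9 = ADS_NAME_TYPE_USER_PRINCIPAL_NAME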

Now, if you use COM with Zope as I do, remember that COM is not thread-safe. So initialize the client when Zope starts and lock your calls to the API:

import win32com.client
import threading
D = win32com.client.Dispatch('NameTranslate')
D.Init(3,'')
COMLOCK = threading.Lock()

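And in a function, use the global D:

def getUserPrincipalName(sso_header):
    # Only one thread at a time may use the shared COM object.
    # The lock is acquired before the try block so that release()
    # is never called on a lock that was not acquired.
    COMLOCK.acquire()
    try:
        D.Set(3, sso_header)
        userPrincipalName = D.Get(9)
    finally:
        COMLOCK.release()
    return userPrincipalName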

Yippee, thanks win32com!


May 19, 2009
» Monitoring a Zope 2 application


We have a simple need for a customer project that runs a Zeo server and a few Zeo clients : being able to check the status of every Zeo client, and monitor what they are doing.

DeadlockDebugger almost provides this feature since it is able to produce a dump of the execution stack for every thread a Zope instance is running.
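
The kind of dump described here can be approximated with the standard library alone. The sketch below is not DeadlockDebugger's actual code; it only illustrates the idea with `sys._current_frames()`, and the helper name `dump_threads` is mine.

```python
import sys
import threading
import traceback


def dump_threads():
    """Return {thread id: formatted stack} for every running thread."""
    dump = {}
    for thread_id, frame in sys._current_frames().items():
        dump[thread_id] = "".join(traceback.format_stack(frame))
    return dump


def worker(stop):
    stop.wait()   # stays blocked here until the event is set


stop = threading.Event()
thread = threading.Thread(target=worker, args=(stop,))
thread.start()

stacks = dump_threads()
for thread_id, stack in stacks.items():
    print("Thread %s:\n%s" % (thread_id, stack))

# Let the worker finish so the script can exit cleanly.
stop.set()
thread.join()
```

A monitoring script can then decide whether a thread is idle or busy by looking at the innermost frames of each stack.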

Based on this tool, I have developed ZopeHealthWatcher, that provides a console script to query a Zope instance, and get back a status for every running thread. It tells you if the thread is idling or if it’s running some code. The script also returns an exit code depending on the number of busy threads, so it can be used in tools like Nagios.

When there are 4 or more busy threads, the script will return the execution stacks for every busy thread and some extra info like the system load and memory info. The returned info will be extendable through plug-ins in the next version, but right now the provided info is enough for our needs.

I have also created an HTML version, so when the dump is requested from another tool than the console script (e.g. a browser), it displays a nice human-readable interface (check the PyPI page for more info and a screenshot).

Note that DeadlockDebugger is hackish since it patches the Zope publisher at startup. But we won’t change this part: we need this tool to run from the oldest to the newest Zope 2 version. And the patch just works fine, so…

The provided version should run out of the box in a buildout-based Plone 3 application, but requires manual installation steps on older Plone or CPS versions.

I didn’t mention these manual steps in the documentation. I think I am the only person in the world interested in running this tool on the dead-but-still-in-production-in-many-places Nuxeo CPS.

By the way: kudos go to Marc-Aurèle Darche, who has been maintaining CPS for years now, making it one of the most bug-free and stable CMS solutions out there. OK, it is probably easier to reach this level of quality since the platform is very stable and only evolves very slowly, thanks to Georges Racinet.

December 16, 2008
» Pycon 2009 talks


I have 2 accepted talks at Pycon, that is great. I would like to say that the Pycon review system is awesome because you can see what the reviewers have said, and understand why your talk was accepted or declined.

I was a bit frustrated that my Atomisator talk was declined, but I think it makes sense : this is a new tool, and beside my user group and a few people, it is not really used yet.

One reviewer said that it had to be picked, and another one answered :

I agree that PyCon should not restrict itself to well-known projects, but it should definitely restrict itself to projects that are (a) in production use, (b) under active development, and (c) likely to still be so in a year. There are so many projects meeting these criteria that for me, the bar is very high indeed to spend a talk slot on one that does not.

Ok, fair enough : I will present this talk at Pycon 2010 and they won’t have any argument to decline it ;)

The talks that made it:

  • How AlterWay releases web applications using zc.buildout
  • On the importance of PyPI in delivering and building Python softwares - mirroring, fail-over and third-party package indexes

I will get into greater details later on.

December 15, 2008
» Python Isolated Environment (PIE)


Here's a proposal I will send to python-dev. What do you think?

(Disclaimer: this proposal is heavily inspired by the work done by people in various tools; it does not reinvent anything)

The problem

Python developers distribute and deploy their packages using myriads of dependencies. Some of them are not yet available as official OS Python packages. Sometimes a package even conflicts with the official version of a package installed in a given OS.

In any case, the development cycle of most Python applications is shorter than the release cycle of Linux distributions, so it is impossible for application Foo to wait until Bar 5.6 is officially available in Debian 4.x.

Therefore, developers need to provide or describe a specific list of dependencies for their application to work.

And this list of dependencies might conflict with the existing list of packages installed in Python. In other words, even if this is not a wanted behavior from an OS packager's point of view, an application might need to provide its own execution context.

Right now, when Python is loaded, it uses the site module to browse the site-packages directory and populate the path with the packages it finds there. .pth files are also parsed to provide extra paths.

Python 2.6 has introduced a per-user site-packages directory, where you can define an extra directory, which is added to the path like the central one.

But both will append new paths to the environment without any rule of exclusion or version checking.

The workarounds

A few workarounds exist to be able to express what packages (and version) an application needs to run, or to set up an isolated environment for it:

  • setuptools provides the install_requires mechanism where you can define dependencies directly inside the package, as a new metadata. It also provides a way to install two different versions of one package and let you pick by code or when the program starts, which one you want to activate.
  • virtualenv will let you create an isolated Python environment, where you can define your own site-packages. This allows you to make sure you are not conflicting with an incompatible version of a given package.
  • zc.buildout relies on setuptools and provides an isolated environment a bit similar in some aspects to virtualenv.
  • pip provides a way to describe requirements in a file, which can be used to define bundles, which are very similar to what zc.buildout provides.

But they all aim at the same goal : define a specific execution context for a specific application, and declare dependencies with no respect to other applications or to the OS environment.

This proposal describes a solution that can be added to Python to provide that feature.

The solution

An isolated environment file that describes dependencies is added. This file can be tweaked by the application packager, or later by the OS packager if something goes wrong.

The isolated environment file

A new file called a Python Isolated Environment file (PIE file) can be provided by any application to define the list of dependencies and their versions.

It is a simple text file that provides:

  • a list of paths, separated by ‘:’, on line 1
  • then one package per line, starting at line 2. Each package can be prefixed by a `!`

For example:

/var/myapp/myenv
lxml
sqlite
sqlalchemy
!sqlobject

This list of packages might or might not be installed in Python.

Versions can be provided as well in this file :

/var/myapp/myenv:/var/myapp/myenv2
lxml >= 0.9
sqlite > 1.8
sqlalchemy == 0.7
!sqlobject == 0.6

The file is saved with the pie extension.
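
To make the format concrete, here is a sketch of how such a file could be parsed. Nothing in it is part of Python or of any reference code for this proposal: `parse_pie` and the tuple layout it returns are assumptions of mine.

```python
import re

# A package name, then an optional comparison operator and version.
_REQUIREMENT = re.compile(r"^(?P<name>[A-Za-z0-9_.\-]+)\s*"
                          r"(?:(?P<op>==|>=|<=|>|<)\s*(?P<version>\S+))?$")


def parse_pie(text):
    """Parse PIE file content into (paths, requirements).

    requirements is a list of (excluded, name, operator, version) tuples;
    operator and version are None when no version is given.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty PIE file")
    paths = lines[0].split(":")          # line 1: ':'-separated paths
    requirements = []
    for line in lines[1:]:               # line 2 onwards: one package each
        excluded = line.startswith("!")
        match = _REQUIREMENT.match(line.lstrip("!").strip())
        if match is None:
            raise ValueError("cannot parse requirement: %r" % line)
        requirements.append((excluded, match.group("name"),
                             match.group("op"), match.group("version")))
    return paths, requirements


example = """/var/myapp/myenv:/var/myapp/myenv2
lxml >= 0.9
sqlite > 1.8
sqlalchemy == 0.7
!sqlobject == 0.6
"""
paths, requirements = parse_pie(example)
print(paths)
for requirement in requirements:
    print(requirement)
```

Keeping the `!` flag separate from the name lets the loader treat inclusion and exclusion with the same version-matching code.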

Loading an isolated environment file

A new function called load_isolated_environment is added in site.py, that lets you load a PIE file.

Loading a PIE file means:

  • for each package defined, starting at line 2, load_isolated_environment checks whether the package exists in the environment with the requested version. The version is given by the package.__version__ value or the PKG-INFO one when available. If the package exists but its version is not available, the version 0.0 is used.
  • for packages without the ! prefix:
    • if the package is not found, it will scan each path provided on line 1 of the file, using the site-packages method, looking for that package.
    • if the package is found, it is added to the path.
    • if the package is still not found, a PackageMissing error is raised.
  • for packages starting with the ! prefix:
    • if the package is found, it is removed from the path

This function can be called by code like this:

>>> from site import load_isolated_environment
>>> load_isolated_environment('/path/to/context.pie')

From there, sys.path meets the requirements and the code that is executed after this call will benefit from this context.
Another context can be loaded in the same process :

>>> load_isolated_environment('/path/to/another_context.pie')

Limitations:

  • if the new context breaks other programs in the process, it’s up to the application packager to fix the context file.
  • it’s not the job of load_isolated_environment to resolve dependencies issues : if the foo package needs the bar package, it won’t complain.
  • it is not the job of load_isolated_environment to get missing dependencies.

Using an isolated environment file

Typically, an isolated environment file can be used in high-level Python scripts, for example any script an application provides as a launcher:

# this script runs zope
from site import load_isolated_environment
load_isolated_environment('zope-3.4.pie')

import zope

if __name__ == '__main__':
    zope.run()

December 12, 2008
» Pycon 2009 proposals


The proposal acceptance date is in a few days.

Here are the four proposals I have made:

  • The state of packaging in Python. This discussion summarizes the current options when it comes to distributing your packages. It also explains the pitfalls and the gap between the Python developers and the OS vendors and packagers. I think this talk will not be picked because the topic is wide and vague. So I proposed to transform it into a panel where lead developers from various frameworks could explain their usage of distutils and what is missing to make them happy. No feedback yet on this.
  • Atomisator, the agile data processing framework. This tool is starting to be useful, and I think it can be useful to others. Check http://atomisator.ziade.org for a quick overview.
  • How AlterWay releases web applications using zc.buildout. That is the same talk I gave at the Plone conf but I present it in a way people understand zc.buildout is not tied to Zope and Plone and can be used with any other application. As a matter of fact, it has become a standard here, and we use it for Pylons, etc..
  • On the importance of PyPI in delivering and building Python softwares - mirroring, fail-over and third-party package indexes. That’s a long title. It presents my work on PyPI.

Last, I will go to the Python Language Summit the day before Pycon. I volunteered to be a “champion” on distutils matters.

      

November 27, 2008
» Expert Python Programming Book : typo sprint tonight !


I love Packt. As soon as I have told them that some people liked the book but complained about the typos, they proposed to go ahead and launch a new print cycle.

Basically it means that the next buyers will have a typo-free book. At least for all the typos that were reported on my Trac here, or at Packt’s.

I am currently processing all the typos reported at Packt so I have a full list on my wiki, and will provide them the final list tomorrow.

Then, they will re-print it.

So if you already own the book, and you see a typo that is not listed, please let me know.

      

November 26, 2008
» Python package distribution - my current work


I found a bit of time to work on distribution matters. Here’s a status of what I am doing there.

There are two topics I am focusing on right now.

  • clean up and enhance Python’s distutils package
  • implement the mirroring infrastructure at PyPI

distutils work

Nathan Van Gheem proposed a cool patch in collective.dist, (this package is a port of the new features I have added in distutils so they are available in 2.4 and 2.5).

Nathan proposed a patch to be able to avoid the storage of the password in the .pypirc file. The prompt is used in that case. This is something that was in my pile for a long time.

I have added a few things to Nathan’s patch, and a test, and proposed it to Python. I am now waiting for its integration in 2.7 trunk: http://bugs.python.org/issue4394. If it’s accepted, I will backport it to collective.dist.

There are some other tickets I am waiting to be accepted:

I am not sure when those will be integrated. The average time for the integration of tickets in distutils in Python is between 6 months and 8 months. hihihi. :D

PyPI mirroring

The job I am doing in PyPI will be in three phases:

  • Phase 1: implement the mirroring infrastructure in PyPI
  • Phase 2: promote it, and propose patches for the mirroring tools out there so they use the protocol
  • Phase 3: promote and propose patches for pip so it can use the mirrors efficiently (fail-over and nearest mirror infrastructure).

Phase 1: so far, so good.

With some insights from Richard Jones and Martin von Löwis, I am currently implementing the mirroring infrastructure for PyPI we have defined during the D.C. sprint (I still owe a blog entry about this sprint). The code lives in a branch on the python svn folder dedicated to PyPI.

The idea of the mirroring infrastructure is to be able to get a list of official mirrors for PyPI that can be used as alternative sources. (It is described here: http://wiki.python.org/moin/PEP_374). Ideally, the client application would interact with the nearest mirror automatically, and switch to another one if it goes down.
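To make the fail-over idea concrete, here is a minimal sketch. It is not the protocol defined at the sprint: the mirror URLs, the "/simple/foo/" path and the fake_fetch helper are all made up for the illustration, and the list is simply assumed to be ordered from nearest to farthest.

```python
def fetch_with_failover(mirrors, path, fetch):
    """Return (mirror, data) from the first mirror that answers."""
    errors = []
    for mirror in mirrors:
        try:
            return mirror, fetch(mirror + path)
        except OSError as exc:  # network errors derive from OSError
            errors.append("%s: %s" % (mirror, exc))
    raise OSError("all mirrors failed: " + "; ".join(errors))


# Hypothetical mirror list, assumed to be ordered from nearest to farthest.
MIRRORS = ["http://mirror-a.example", "http://mirror-b.example"]


def fake_fetch(url):
    """Stand-in for a real HTTP call: the first mirror is down."""
    if url.startswith("http://mirror-a.example"):
        raise ConnectionError("mirror down")
    return "contents of " + url


used, data = fetch_with_failover(MIRRORS, "/simple/foo/", fake_fetch)
print(used)
```

Passing the fetch function as a parameter keeps the sketch testable without a network; a real client would plug in its own HTTP call there.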

So, a list of mirrors will be made available at /mirrors, and from there the client applications will be able to use an alternative location for every package. The hardest part concerns the stats: we want to display in PyPI the download counts for each package by summing the downloads from every mirror.

So every mirror will have to provide its “local stats” that can be visited by PyPI. That’s the biggest part of the work I am doing. The code will build the stats for PyPI by parsing its Apache log file. And hopefully, this code will be reusable by the mirrors themselves so they can build their stats the same way.
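As a rough illustration of the stats idea (not the code that lives in the PyPI branch), the sketch below counts successful downloads per file from Apache-style log lines and then sums the counts of several mirrors. The "/packages/..." URL layout and the sample log lines are assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a download shows up in the log as a GET on /packages/.../<file>.
LOG_LINE = re.compile(
    r'"GET (?P<path>/packages/\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def local_stats(log_lines):
    """Count the successful downloads of each file in one server's log."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "200":
            filename = match.group("path").rsplit("/", 1)[-1]
            counts[filename] += 1
    return counts


def merged_stats(per_mirror_counts):
    """Sum the local stats published by every mirror."""
    total = Counter()
    for counts in per_mirror_counts:
        total.update(counts)
    return total


# Made-up log extracts for two mirrors.
MIRROR_A_LOG = [
    '1.2.3.4 - - [26/Nov/2008:10:00:00 +0100] "GET /packages/source/f/foo/foo-1.0.tar.gz HTTP/1.1" 200 1234',
    '1.2.3.5 - - [26/Nov/2008:10:01:00 +0100] "GET /packages/source/f/foo/foo-1.0.tar.gz HTTP/1.1" 404 0',
]
MIRROR_B_LOG = [
    '5.6.7.8 - - [26/Nov/2008:11:00:00 +0100] "GET /packages/source/f/foo/foo-1.0.tar.gz HTTP/1.0" 200 1234',
    '5.6.7.8 - - [26/Nov/2008:11:02:00 +0100] "GET /packages/source/b/bar/bar-0.2.tar.gz HTTP/1.0" 200 99',
]

totals = merged_stats([local_stats(MIRROR_A_LOG), local_stats(MIRROR_B_LOG)])
print(totals)
```

Failed requests (the 404 above) are ignored, so only real downloads are counted.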

Of course, this infrastructure could be used for any PyPI-compatible server even if it is not a mirror of PyPI (like a private PyPI server).

Phase 2 will consist of promoting the infrastructure to the mirroring tools out there. Maybe PyCon will be a good place for that.

Phase 3 is the most interesting one: make sure the client applications use the mirrors! I think Ian Bicking’s pip project could be the right place for these innovations.

Next topics in the pile:

  • index-merging: describe in a PEP-like document the index-merging feature that would allow clients to merge several indexes whose contents differ. For example: PyPI + a private PyPI server. I wrote a first draft of such a patch for setuptools in the past (http://bugs.python.org/setuptools/issue32), but lately I have lost all hope of seeing this project move forward.
  • Brainstorming: try to understand the Python Packaging Paradox. That is: how come the community, which is composed of many brilliant people, is unable to move forward in packaging matters?
  • Distribute the return :D
      

Novembre 22, 2008
» How to be disappointed with the “printed” in “printed book”


I feel really bad about this comment on my book : How To Be Dissappointed in Something You Recommend.

Just a quick word about the try / return / finally code pattern, since I had some feedback about it. I would like to mention that this code pattern is perfectly valid:

def function():
    try:
        return something
    finally:
        do something

I should have explained it better, because this pattern is not widely used, so you might think that “do something” is called after the function has returned, which is not the case: the finally block runs before the result is handed back to the caller.
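A small runnable version of the pattern shows the order of execution. The calls list is only there to record what happens and is not part of the pattern itself.

```python
calls = []


def function():
    try:
        calls.append("try")
        return "something"
    finally:
        # Runs once the return value has been computed,
        # but before the caller receives it.
        calls.append("finally")


result = function()
calls.append("caller")
print(result, calls)
```

The "finally" entry is recorded before "caller", and the value returned in the try block is still the one the caller gets.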

For the typos now:

The first thing I did wrong: when I started the book, I wanted, as I did for my previous book, to run unit tests on the book itself to avoid those mistakes. That said, the previous one was written in LaTeX, which is quite simple to interact with, and this one is in OpenOffice, because that is how the editor works. I would have had to write a script to extract the Python code from the OpenOffice file in order to unit test it. I didn’t. I simply ran out of time, as usual when you have deadlines on books.

The second thing I did wrong: I should have told the editor to wait a bit, and I didn’t.

But Packt does Print On Demand, so I know that the Errata page I am maintaining here: http://atomisator.ziade.org/wiki/Errata is being processed by the editor, and that the typos will be removed from the book at some point, without having to wait for a second edition.

I’ll update this blog entry as soon as I know the status on this.

I am really sorry Calvin, and all the people that are suffering from these typos.

      

Novembre 9, 2008
» How to receive email alerts when someone talks about something - 6 steps tutorial using Atomisator


I like Google Alerts: the idea of receiving a mail every day that summarizes all the articles related to a given topic is really helpful when you need to focus on a specific subject for a while.

But this is not enough. I also want to receive a mail that points me to any mailing list, planet feed or blog out there that talks about the topic.

You can’t do it with Google Alerts as far as I know.

Let’s take an example:

I want to receive a daily mail that points me to any mail thread or blog entry, that is related to the word “buildout” or to the word “pycon”.

Basically, to do it manually, I need to read Planet Python, Planet Zope, then take a look at the Python, Zope and Plone mailing lists. It takes at least 10 minutes, and more if you want to read all entries to make sure you won’t miss anything.

Since online systems like Nabble provide RSS feeds for mailing lists (can’t find yours? just add it there!), it is easy to read them as if they were regular feeds.

From there, a script that reads all the selected feeds and sends a mail pointing to the entries that match the selected words is simple to write as well, and fills the need.
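For the curious, the core of such a script could look like the sketch below. It only covers the matching and the mail body; the entries are hypothetical dictionaries standing in for whatever a feed reader returns, and the case-insensitive matching is a choice made for the example.

```python
import re


def matching_entries(entries, patterns):
    """Keep the entries whose title or summary matches one of the patterns."""
    # Case-insensitive matching is an assumption of this sketch.
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    return [
        entry for entry in entries
        if any(rx.search(entry["title"] + " " + entry["summary"]) for rx in compiled)
    ]


def mail_body(entries):
    """One line for the title, one indented line for the link."""
    return "\n".join("%s\n  %s" % (e["title"], e["link"]) for e in entries)


# Hypothetical entries, as a feed reader might return them.
ENTRIES = [
    {"title": "Buildout tips", "summary": "zc.buildout recipes", "link": "http://example.org/1"},
    {"title": "Zope news", "summary": "nothing relevant", "link": "http://example.org/2"},
    {"title": "See you there", "summary": "PyCon talks announced", "link": "http://example.org/3"},
]

selected = matching_entries(ENTRIES, ["buildout", "pycon"])
print(mail_body(selected))
```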

But don’t code it: Atomisator will let you do this with a few lines of configuration.

Here’s a step-by-step tutorial.

Step 1 - install easy_install

Step 2 - install Atomisator and SQLite

Step 3 - create an “atomisator.cfg” file

The content of the file has to be:

[atomisator]
store-entries = false

sources =
  rss http://www.nabble.com/Python---python-list-f2962.xml
  rss http://n2.nabble.com/Plone-f293351.xml
  rss http://www.nabble.com/Zope---General-f6715.xml
  rss http://planet.python.org/rss10.xml
  rss http://www.zope.org/Planet/planet_rss10.xml
filters =
  buzzwords words.txt
outputs =
  email email.cfg

This file will look into Planet Python, Planet Zope and various mailing lists (Python, Plone, Zope). Of course you can add or remove feeds in the sources option.
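To see how this file is structured, it can be read with Python’s standard configparser module. This is only a way to inspect the layout (one plugin name followed by its arguments per line); it is not a claim about how Atomisator itself loads the file, and the plugin_lines helper is made up for the example. The configuration is shortened to two sources.

```python
import configparser

CONFIG = """\
[atomisator]
store-entries = false

sources =
  rss http://www.nabble.com/Python---python-list-f2962.xml
  rss http://planet.python.org/rss10.xml
filters =
  buzzwords words.txt
outputs =
  email email.cfg
"""


def plugin_lines(parser, option):
    """Split a multi-line option into [plugin_name, argument, ...] lists."""
    value = parser.get("atomisator", option)
    return [line.split() for line in value.splitlines() if line.strip()]


# interpolation=None so that a "%" in a URL would not be misread.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG)

sources = plugin_lines(parser, "sources")
filters = plugin_lines(parser, "filters")
outputs = plugin_lines(parser, "outputs")
print(sources, filters, outputs)
```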

Step 4 - Create the words.txt file

This file contains regular expressions, one per line, that will be used to match the entries. The file has to be saved next to atomisator.cfg.

For our example:

buildout
pycon

You can put any expression you want in this file, as long as you have one matching expression per line.

Step 5 - add an email.cfg configuration file.

This is where you define the target emails that will receive the alerts (tos option). You can also specify the from email, or the SMTP server location. The file has to be saved next to atomisator.cfg.

In our case it can be:

[email]
tos = tarek@ziade.org
from = tarek@ziade.org
smtp_server = smtp.neuf.fr
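To show what these three options amount to, here is a sketch that reads them and builds a message with the standard library. It does not describe the email plugin’s internals: the build_alert helper, the subject and the body are invented for the example, and treating tos as a whitespace-separated list is an assumption.

```python
import configparser
from email.message import EmailMessage

EMAIL_CFG = """\
[email]
tos = tarek@ziade.org
from = tarek@ziade.org
smtp_server = smtp.neuf.fr
"""


def build_alert(cfg_text, subject, body):
    """Return (smtp_server, message) built from an email.cfg-like text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    section = parser["email"]
    message = EmailMessage()
    message["From"] = section["from"]
    # Assumption: several recipients would be separated by whitespace.
    message["To"] = ", ".join(section["tos"].split())
    message["Subject"] = subject
    message.set_content(body)
    return section["smtp_server"], message


server, message = build_alert(
    EMAIL_CFG, "Atomisator alert", "buildout: http://example.org/entry"
)
print(server)
print(message["To"])
# Sending would then be: smtplib.SMTP(server).send_message(message)
```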

Step 6 - Run it !

The command to be called is atomisator (installed by easy_install) followed by the configuration file:

$ atomisator atomisator.cfg
Reading data.
Launching worker for rss - ('http://www.nabble.com/Python---python-list-f2962.xml',)
Launching worker for rss - ('http://n2.nabble.com/Plone-f293351.xml',)
Launching worker for rss - ('http://www.nabble.com/Zope---General-f6715.xml',)
Launching worker for rss - ('http://planet.python.org/rss10.xml',)
Launching worker for rss - ('http://www.zope.org/Planet/planet_rss10.xml',)
Retrieving from rss - ('http://www.nabble.com/Python---python-list-f2962.xml',)
Retrieving from rss - ('http://www.nabble.com/Zope---General-f6715.xml',)
Retrieving from rss - ('http://n2.nabble.com/Plone-f293351.xml',)
Retrieving from rss - ('http://planet.python.org/rss10.xml',)
Retrieving from rss - ('http://www.zope.org/Planet/planet_rss10.xml',)
.................................................................................................................................................
Writing outputs.
Data ready.

Check your mail. This call can be put in a daily cron job.

Tested under Mac OS X and Linux.

      

Novembre 8, 2008
» Promoting Python and Plone in Africa

It seems that only South Africa had an event listed as part of the World Plone Day set of local events. As an African, I am of course interested in this fact, and I would have expected some Plone presence in other regions. Hopefully, we can fix that for next year’s edition!

I am promoting an effort called Python African Tour, which aims at sending volunteers, within the next couple of years and based on sponsoring, to the different regions of Africa to train beginner developers there in Python and its related technologies. It’s a way to introduce newcomers there to a programming language that helps developers get their job done, as well as to all the community practices that help us improve our daily work. It’s also a way to get new developers to join the Python community.

The first country the tour will visit is Morocco, in December, from the 18th to the 22nd. Among the possible countries to plan in 2009 are Nigeria and South Africa. Obviously our plans will depend on discussions with local contacts we get, and sponsoring possibilities.

For Morocco, Amine Soulaymani, a developer living in Morocco, and Daniel Nouri have volunteered to participate as instructors for the students at Ecole Mohammedia d’Ingenieurs, the school that will host the Python training session.
In addition to the training session during the first 2 days, we plan to have 3 days of community activities: an unconference-style open event with demos and talks related to Python, followed by a sprint that will be hosted in the offices of Nextma, a solution provider doing Python. Talks and sprint activities should cover Plone, with the participation of a PloneGov / CommunesPlone team joining us from Belgium, WSGI / Repoze and OpenERP with contributors from Nextma.

On a side note, I have proposed a talk with Roberto Allende for next year’s Pycon to present our ideas and actions to help spread Python in both South America and Africa.

If you want to contribute in any way, contact us through the project’s mailing list.


Novembre 6, 2008
» Plone Conference 2008 in Washington D.C. - summary


I am back from the Plone Conference in D.C., and the jetlag is gone. It has actually been gone for weeks now, but it’s hard to find the time to blog these days :/

On the talks I have seen and topics I have chatted about

There were a lot of great talks in D.C., and it was hard to decide which one to look at. In any case it was easy to meet the speaker if I had missed the talk, because the Plone Conference, unlike big conferences like OSCON, is a place where everyone hangs around the same spot after a talk is over.

Here’s a list of some topics I have seen or I have talked about with some people.

Deliverance - Ian Bicking

If you look at what Ian has produced in the past 5 years, he is one of the most prolific contributors of tools that become standards in the Python web development community. Think about Python Paste or virtualenv, and many others. Deliverance might be the next big one.

Take a bunch of micro web applications you want to join to build a full web system, for historical reasons or just because you believe a particular feature just won’t fit in Plone but will do great in Pylons.

Now ask a designer to glue everything together under the same look. He (or the guy that integrates his design) will probably hate you: he will have to learn how to integrate in heterogeneous environments. This is easy under some systems that let you plug in a layout and a CSS file in a simple way. This is not easy under Plone, unless you learn how to do it (but this will be improved in the future).

Deliverance is a proxy that lets you skin any application that spits out HTML content, by running some XPath rules on the content and applying some changes to produce a new output. Basically, you have a simple HTML page that just provides the layout you want to have, without any content, and an XML file that explains how to extract some content from the page produced by the third-party application and where to inject it into your empty HTML page. The great thing is that you can call different third-party servers depending on the path you are in, and even call several servers to build one single page. This opens up a lot of possibilities.
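The principle can be sketched in a few lines with the standard library. This is not Deliverance’s rule format or API, just the idea: the theme, the page, the rule list and the apply_theme function are all invented for the illustration, and both documents are assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# An "empty" theme page and a page produced by some third-party application.
THEME = "<html><body><div id='header'>My site</div><div id='main'/></body></html>"
PAGE = "<html><body><div id='nav'>menu</div><div id='content'><p>Hello</p></div></body></html>"

# Each rule: (XPath of the content in the page, XPath of the slot in the theme).
RULES = [(".//div[@id='content']", ".//div[@id='main']")]


def apply_theme(theme_html, page_html, rules):
    """Copy the selected parts of the page into the slots of the theme."""
    theme = ET.fromstring(theme_html)
    page = ET.fromstring(page_html)
    for source_path, slot_path in rules:
        source = page.find(source_path)
        slot = theme.find(slot_path)
        if source is None or slot is None:
            continue  # nothing to move for this rule
        slot.text = source.text
        slot.extend(list(source))
    return ET.tostring(theme, encoding="unicode")


result = apply_theme(THEME, PAGE, RULES)
print(result)
```

Only the content selected by the rule ends up in the output; the application’s own navigation is left behind and the theme’s header is kept.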

The first caveat of this approach is that you have to provide a Single Sign-On feature so that people do not have to log in several times. This can sometimes be a problem with applications that are not open enough to let you do it. But most of the time, it is not a problem: if the users are all located in an LDAP directory, it is easy.

Furthermore, if you use only Python-based applications, you can use a WSGI environment and a middleware like repoze.who to glue together, let’s say, a Plone app and a Pylons app. Products.oopas is the PAS plugin that can be used for that on the Plone side to grab the authentication context and use it.

The second problem I can see is about response headers. One example: if a page is composed of elements that come from several pages, and if the page has a Last-Modified header, I don’t think Deliverance handles this correctly yet, that is, makes sure to present the newest Last-Modified header from all the third-party servers that were called to build that page. But this is more likely to be a detail compared to the single authentication problem.
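What “present the newest Last-Modified header” would mean can be written down in a few lines. This is a sketch of the expected behavior, not something Deliverance provides; the function name and the sample header values are made up.

```python
from email.utils import format_datetime, parsedate_to_datetime


def newest_last_modified(header_values):
    """Return the most recent Last-Modified value, or None if there is none."""
    dates = [parsedate_to_datetime(value) for value in header_values if value]
    if not dates:
        return None
    return format_datetime(max(dates), usegmt=True)


# Headers as they could come back from three third-party servers.
headers = [
    "Thu, 06 Nov 2008 10:00:00 GMT",
    "Sat, 08 Nov 2008 12:30:00 GMT",
    "Wed, 05 Nov 2008 23:59:59 GMT",
]
print(newest_last_modified(headers))
```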

In any case this is a very promising tool !

Content Mirror - Kapil Thangavelu

I didn’t see that talk, but I have talked about this tool with a few people. The idea is to serialize the content of a Plone instance into a relational database (e.g. PostgreSQL) as changes happen, using events.

I need to give it a try and look at it more deeply, to see how the overhead is dealt with, and how the aggregator I have read about works (it collects the mirroring operations to perform in a transaction, and optimizes the calls at the end of the transaction to avoid redundant calls, if I understood correctly). I don’t know yet, for example, if there’s a pool of jobs for the mirroring tasks to avoid a single point of failure. But I am pretty sure this is taken care of. The other point I need to check is whether there’s a round trip, i.e. whether there’s a way to apply a relational database change back into Plone.

But in any case I can already see various use cases for my customers. For instance, having a Plone instance as a back office, with complex workflows for editors and contributors, and a lightweight Pylons application as the front application that concentrates on displaying the content of the relational database as fast as possible, makes a lot of sense in big environments. It just scales better.

So this is an interesting tool as well.

repoze.bfg - Chris McDonough

Chris gave a talk about repoze.bfg, which is a new web framework that takes the good bits from Zope and pushes them into a WSGI world, using the Pylons approach I would say. That is: “here’s the template engine you can use in repoze, but really, use the one you like”.

Frankly, I really see this new effort as one of the most promising ones in the Zope community. Already, repoze.who is a major middleware in WSGI: Zope’s Pluggable Authentication Service outside Zope, usable with any WSGI application. This is a blast!

And people are starting to contribute a lot of interesting middlewares under the repoze namespace.

Now, I haven’t really tried repoze.bfg itself yet, but given the people that are behind it, I am pretty sure this framework will be successful in the future. Having an MVC framework à la Pylons that lets you use Zope packages, with a “this zope package is repoze/wsgi compliant” label on each one of them, is very cool.

collective.indexing - Andreas Zeidler et al.

At the snow sprint, we worked with the Enfold crew, who did great work integrating the Solr/Lucene system so it can be used from Plone. We replaced a few fields like the searchable text and indexed them on the Solr side, just to give it a try. The snow sprint work was really focused on providing a buildout, a few recipes and a benchmark to say: “Hey, Plone community, this is a blast! Let’s do more of it”

Later, Andreas Zeidler and a few other guys continued the work on indexing matters and they delivered collective.indexing, which provides two things:

  • a queue that collects all the indexing to be done, and optimizes the calls to the catalog
  • a bridge to use collective.solr
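The queue idea can be illustrated with a toy version. This is not collective.indexing’s implementation: the IndexingQueue class, its reduction rules and the catalog_call callback are assumptions made to show how redundant catalog calls can be dropped before the end of a transaction.

```python
class IndexingQueue:
    """Collect catalog operations and keep at most one per object."""

    def __init__(self):
        self._pending = {}  # uid -> operation to perform at flush time

    def index(self, uid):
        self._add(uid, "index")

    def reindex(self, uid):
        self._add(uid, "reindex")

    def unindex(self, uid):
        self._add(uid, "unindex")

    def _add(self, uid, operation):
        previous = self._pending.get(uid)
        if previous == "index" and operation == "unindex":
            # Added and removed in the same transaction: nothing to do.
            del self._pending[uid]
        elif previous == "index" and operation == "reindex":
            # Still a new object: the single "index" call is enough.
            pass
        elif previous == "unindex" and operation == "index":
            # Removed then added again: one "reindex" call covers both.
            self._pending[uid] = "reindex"
        else:
            self._pending[uid] = operation

    def flush(self, catalog_call):
        """Send the reduced list of operations, e.g. at transaction commit."""
        for uid, operation in self._pending.items():
            catalog_call(operation, uid)
        self._pending.clear()


queue = IndexingQueue()
queue.index("a")
queue.reindex("a")
queue.reindex("b")
queue.reindex("b")
queue.index("c")
queue.unindex("c")

calls = []
queue.flush(lambda operation, uid: calls.append((operation, uid)))
print(calls)
```

Six requests end up as two catalog calls, which is the kind of saving the queue is after.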

I didn’t follow the latest developments and I didn’t know how far the guys had gone, but I had the chance to hang around with Andreas and Tom Lazar in D.C., so now I know that this package is production ready :D

So in other words: I’ll probably use it as a mandatory package for all the big Plone sites out there.

The queuing part, imho, should go into the catalog itself, because there’s no other way to make sure a third-party product is not calling the catalog during the transaction while another product does the same.

Server-Side Include (SSI)

Tom Lazar worked during the Snow Sprint on lovely.remoteinclude to make Plone portlets accessible via unique URLs. From there, it is possible to push a page that contains a list of URLs rather than the fully computed page to a front server that knows how to read SSI directives and builds the page.
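What the front server does with such a page can be sketched as follows. The include directive syntax is the usual SSI one; the portlet URLs, the FRAGMENTS table and the assemble function are invented for the example and say nothing about how lovely.remoteinclude names its URLs.

```python
import re

# Standard SSI include directive: <!--#include virtual="/some/url" -->
INCLUDE = re.compile(r'<!--#include virtual="([^"]+)"\s*-->')


def assemble(page, fetch):
    """Replace every include directive by the fragment found at its URL."""
    return INCLUDE.sub(lambda match: fetch(match.group(1)), page)


# Hypothetical portlet fragments, keyed by URL.
FRAGMENTS = {
    "/portlets/news": "<ul><li>News</li></ul>",
    "/portlets/login": "<form>Login</form>",
}

PAGE = (
    "<html><body>"
    '<!--#include virtual="/portlets/news" -->'
    "<p>Main content</p>"
    '<!--#include virtual="/portlets/login" -->'
    "</body></html>"
)

html = assemble(PAGE, FRAGMENTS.__getitem__)
print(html)
```

The application only has to produce the small page with the directives; each fragment can then be cached and served independently by the front server.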

This is great for performance, and is a lot like the ESI (Edge Side Include) support we used to have in CPSSkins.

I am wondering if both could be implemented in the same tool in fact.

Tom told me that he will try to continue this work at the performance sprint in Bristol in December, so let’s keep an eye on this!

I have seen many other talks and topics, but these few were the ones I really needed to talk about.

On the conference organization

I have been helping with the organization of Pycon FR in Paris for 2 years now. I know what it means to organize such events: it is a LOT OF WORK.

You know an event is well organized when you don’t feel it is organized.

That was the case in D.C. Bravo Alex, Amy and all the others !

The only problem (wifi) was not the organizers’ fault, and I have never been to any event where it is not chaotic at some point (besides OSCON) so… :)

On the community

I love you all guys. It is an amazing community.

      

Octobre 29, 2008
» PloneConf’08 slides + screencasts : delivering applications with zc.buildout and a distributed model


I have been totally drowned in customer projects since I came back from the Plone Conference. But things are looking better now, so I can take a bit of time to start blogging about the conference. I’ll probably do three blog posts: this one about my tutorial, the next one about the conference itself and, last, an entry about the sprints.

So I gave a tutorial about zc.buildout. The length was a bit challenging, since I had 90 minutes: enough time to explain more things than in a regular talk, but not enough time to get into great detail, as a tutorial should.

The other thing was about the topic: two talks were covering zc.buildout, Clayton Parker’s one and mine. So my goal was to make sure they were not overlapping too much.

I had the chance to meet Clayton before we both gave our talks, since the Six Feet Up crew gave me shelter in the house they rented (nicest guys on the block). Even if we didn’t exchange a lot on the slides themselves, I could figure out what Clayton was going to present. So I… started my slides from scratch two days before my tutorial and worked carefully on their scope :D

Alex Clark came and backed me up during the talk, since we have been working together on plone.org for months now and since the talk presented the new plone.org that is coming up.

I think I did quite well during the talk, because we had a break halfway through, and when we started again, the room stayed full ;)

This is the second time I have recorded all the console work in small screencasts, to avoid live problems, and I think this is the best way to go if you need to do some demos, so I’ll keep on doing it. Plus, it’s nice to provide them to people after the talk.

Anyway, the talk was videotaped so you can judge for yourself:

And every time you see a “get the screencast at http://ziade.org/ploneconf” in there, well, get them :)

In detail:

  1. Distutils demo
  2. setuptools demo
  3. collective.eggproxy demo
  4. PloneSoftwareCenter installation
  5. collective.dist demo
  6. new.plone.org demo
  7. collective.releaser demo
  8. plone 3 buildout demo
  9. Multiple target releasing demo

What’s next ?

  • We need to finish the work with Alex on plone.org. It’s not hard, it just takes time, and we are both quite busy in our jobs :)
  • I need to polish collective.releaser and collective.eggproxy. They are still rough around the edges, and the code sucks a bit. If you are using them and want to help, or have some feedback/issues, please, pretty please, let me know.
      

Octobre 21, 2008
» 3 Colors Theme : a collective phantasy based theme

Just as the Plone base properties style can seem too rich for a standard use case, the phantasy skin edit forms can also be too complex.

When I saw how easy it is to change some skin properties in a blog system like WordPress (…), I thought we needed the same feature in Plone.

Collective Phantasy is made for that.

Building a new skin product with a standard static theme and a customized skin schema is something you can do easily with collective phantasy.

If you want an example, just test the « 3 colors theme » plone product.

In your buildout:

in the eggs section of the instance, add:

    collective.threecolorstheme

in the zcml section, add:

    collective.threecolorstheme
    collective.threecolorstheme-overrides

Relaunch your buildout.
Launch your zope instance.

In plone_control_panel you will get:

Check the « Three Colors Theme » product; this will install the product and its dependencies:

Go back to the home page: of course, the « static » theme has changed (my modest contribution to the Plone themes community)

We want to change the dynamic properties, so we click on the « Contents » tab, and we can see that everything is here: the dynamic root skin for the portal, and the phantasy skins repository (read previous posts …):

We want to build a new skin, so, as we have seen in previous posts, we add a new skin in the phantasy skins repository:

In this form you can see the differences from a classic phantasy skin form (developers/integrators, look at the code: it’s just some easy Archetypes hacking):

- 3 colors only + mutators to change all colors

- an entire fieldset removed

- many fields that are not useful in this theme are hidden

- some skin attributes added in schema

Go back to reality : just change some colors

Now we will import a skin sample provided in « three colors theme product ».

In alternate_skin folder you will find a zip file and a txt file (howtouseit.txt).

Click on « import images and files », choose the alternate_skin.zip file

Read howtouseit.txt and change some colors or properties

Use plone kss power to change images’ names quickly

Press CTRL-F5 to refresh the page. Of course you need some feeling for design to understand what’s happening here, but it’s not so complicated (…)

In the product you will find another example with another howtouseit.txt, here; then just add a folder, choose the right phantasy skin, and you will get:

Loving Plone :-)


Octobre 13, 2008
» Using Collective phantasy with another plone theme product

When collective phantasy is installed, it takes its skin properties from the currently available Plone theme.

You can see in the screenshots below a customization of the qPloneSkinWhiteBlack Plone theme, using phantasy: