By Ingeniweb. A Django site.
August 23, 2010
» Current job: Enter phase 2

I’ve now spent 10 months in my current job, and I owe it to myself to stop today for a quick retrospective before moving on to another phase.

The first period (phase 1) helped me understand the context and what can be done to bring value. Fortunately, a lot can be done, so the challenge is interesting. Also, I was lucky to quickly understand and accept that, in the current context, change could only happen progressively.

It would not be true to say that nothing interesting has happened so far with the projects I am in charge of. But the whole process is too slow (or sometimes chaotic), for unnecessary reasons.

My phase 2 is going to involve more initiatives to pick up the pace on the projects and get things moving. Wish me good luck or my next related post will be about asking if someone has a job opportunity for me ;)

As for other news, I have resumed work on personal projects with my favourite CMS (which is preparing a new release). I wish every success to Plone 4 and the community. If you need a CMS for a critical/complex site project, Plone is undoubtedly in the top 2 or 3. Also, we have a great community that is open to various technologies and development practices besides Plone, such as Zope Toolkit, ZODB, Deliverance, deco.gs, and BFG, to name only a few.


May 23, 2010
» Python African Tour and beyond

While the Python African Tour (a.k.a. PAT) events are now underway, here is a wrap-up of some feedback and current thoughts about contributing through the organization of Python learning activities in Africa.

The New vs. the Old, again

When you talk to most people, you find that the Python language is underrated or new to them. C, Java, and PHP… it’s okay. But Python?

Python could get a decent base of users in the medium to long term, if some influential people in university and entrepreneur circles become aware of its advantages and decide to introduce it into their toolset or work environment.

Now, there is hope: we’ve met people from this new generation in Morocco and Senegal… and we are meeting others in Nigeria in a few weeks. In fact, these are the people who have been the initiators and resource people in the process of organizing PAT events.

PAT Senegal team - (c) DakarLUG

User Groups

The Senegal PAT event was hosted with the help of DakarLUG, a small but very active group of people spreading the Linux and Open Source message there (install parties, demos, and all that).

It seems we were lucky… In my experience, it is hard to find active local geek or learning communities, at least in the francophone world, which I am more connected to. This might be less true today, though, with the local barcamps, LUGs and Google Technology User Groups that have been launched over the last two years.

Community is key to spreading Open Source ideas, practices and solutions, and if we can’t have local communities I am afraid this is going to be even more difficult.

We are helping to improve that situation ourselves: in addition to the Python workshop, we have been organizing a camp day for a wider audience, with presentations of interesting technologies and development practices.

Tools and applications, a.k.a. solutions

Ok, people want solutions… They have problems to solve.

While we do not want people to simply consume solutions, but to be able to extend and hack on them, it helps a lot to quickly introduce them to interesting tools, applications and platforms.

The current focus of PAT for this includes two domains where Python is really strong:

  • Web frameworks (Django, but not limited to it). In the future, we should extend this to the “client” side of things (mobile, JavaScript, HTML5…).
  • Scientific tools (SciPy, NumPy…), for people coming from research departments.

PAT Senegal camp day / Emmanuelle Gouillart presenting scientific tools - (c) DakarLUG

Sustaining and going forward

Last but not least, we need the students who took part in the workshop to become contributors. This means they need to practice after the PAT event, so they need ongoing mentoring and small projects to work on.

One trick is to convince the head of the CS/Research department we meet at the university or school to introduce Python into the curriculum, or to allow students to use Python for their school projects. This gave encouraging results in Dakar.

An idea I have been nurturing is running our own kind of SoC program. This is the next phase, and more details are coming soon.

Contact me if you want to share more thoughts, experiences, or contribute to any aspect of this.


May 3, 2010
» multiprocessing for Python 2.4 on Mac OS X 10.6.3

Hello,

Today I had some problems installing multiprocessing-2.6.2.1 (a dependency of collective.releaser) for Python 2.4 on my Mac.

So if you have the same problem, here are some tips.

First, the multiprocessing build failed like this:

Modules/mmapmodule.c: In function 'new_mmap_object':
Modules/mmapmodule.c:947: warning: implicit declaration of function 'open'
Modules/mmapmodule.c:947: error: 'O_RDWR' undeclared (first use in this function)
Modules/mmapmodule.c:947: error: (Each undeclared identifier is reported only once
Modules/mmapmodule.c:947: error: for each function it appears in.)
error: Setup script exited with error: command '/usr/bin/gcc-4.2' failed with exit status 1

So I downloaded multiprocessing by hand:

$ wget http://pypi.python.org/packages/source/m/multiprocessing/multiprocessing-2.6.2.1.tar.gz

I edited Modules/mmapmodule.c and added this include (at line 81):

#include <fcntl.h>

After that, the command

$ python setup.py install

succeeds.

This bug is similar to this issue: http://bugs.python.org/issue3266

Regards Youenn.


April 29, 2010
» AllowedContentType in Plone2.5

Hello,

I noticed that the allowedContentTypes method is slow when you have a lot of content types in Plone 2.5. I don’t know whether Plone 3 or Plone 4 are affected. I have 120 types, and context.allowedContentTypes takes about 0.32 s to run.

The call path of allowedContentTypes is:

allowedContentTypes -> portal_types.listTypeInfo -> for each content type: portal_types.isConstructionAllowed -> portal_types._queryFactoryMethod -> Products.Five.pythonproducts.patch_ProductDispatcher__bobo_traverse__ -> Products.Five.pythonproducts.product_packages

and each call to Products.Five.pythonproducts.product_packages takes 0.003 s.

With 100 content types, product_packages alone accounts for about 0.3 s (100 × 0.003 s), so this method is the performance bottleneck.

Adding this patch fixes the problem:

# assumption: "forever" is the plone.memoize.forever module
from plone.memoize import forever
from Products.Five import pythonproducts
old_product_packages = pythonproducts.product_packages
pythonproducts.product_packages = forever.memoize(old_product_packages)
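
To make the effect of the patch concrete, here is a minimal sketch of what "memoize forever" means, in plain Python. The names memoize_forever and product_packages below are stand-ins written for this illustration; this is not the code of Products.Five or of the memoize helper used above.

```python
import functools


def memoize_forever(func):
    """Cache results per argument tuple for the life of the process."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


calls = []


def product_packages():
    # Stand-in for an expensive lookup (hypothetical body).
    calls.append(1)
    return ["Products.Example"]


fast_product_packages = memoize_forever(product_packages)
first = fast_product_packages()
second = fast_product_packages()
print(len(calls))  # the expensive body ran only once
```

The price of this kind of cache is that it is never invalidated: anything that changes after the first call is not seen until the process restarts.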

Regards Youenn


April 7, 2010
» New Release of iw.fss

I’ve just released a new version of iw.fss (2.8rc2, for Plone 3) and of FileSystemStorage (2.6.3, the same thing for Plone 2.5).

These releases change the behaviour of the getData method, which retrieves data from the filesystem.
This change matters for big files, because all data is now handled by filestream_iterator (in every case). So the whole file is not loaded into memory when we access the data directly (provided you respect this new API).

From the developer’s point of view, getData works like OFS.Pdata. To get all the data at once (please don’t do that), just call str(field.get(instance)). To get the first block of data, call field.get(instance).data; field.get(instance).next then points to the next block.
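
As an illustration of the block-by-block access described above, here is a small sketch of walking a chain of blocks through .data and .next. The Block class and the helper functions are hypothetical stand-ins written for this example; they are not the iw.fss or OFS.Pdata implementation.

```python
class Block:
    """Hypothetical stand-in for one chunk in a Pdata-style chain."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def build_chain(payload, size):
    """Split payload into linked blocks of at most `size` bytes."""
    head = None
    for start in reversed(range(0, len(payload), size)):
        head = Block(payload[start:start + size], head)
    return head


def iter_blocks(first):
    """Yield each block's data without joining the whole payload."""
    block = first
    while block is not None:
        yield block.data
        block = block.next


chain = build_chain(b"abcdefghij", 4)
pieces = list(iter_blocks(chain))
print(pieces)
```

Consuming the blocks one at a time (for example, writing each one to the response) keeps memory use bounded by the block size rather than by the file size.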

So if you are using iw.fss in your projects, please update it; your Plone will thank you!!

Regards Youenn


March 20, 2010
» Asynchronous tasks with Plone: easy!!

Hello,

Most of the time in Plone, we do all the work during the transaction. But some tasks could be deferred until after it, and better still, run in another ZEO client. collective.indexing does that for catalog indexing.
There is a simple way to do this with Products.CMFSquidTool, which implements all the hard work of asynchronous tasks.

So first import this:

from Products.CMFSquidTool.queue import Queue
from Products.CMFSquidTool.utils import pruneAsync

Then add a new class that sends HTTP requests to the queued URLs after the transaction:


class CallAsynchronous(Queue):
    """
    Sends requests on transaction commit
    """

    def _finish(self):
        # Process any pending url invalidations. This should *never*
        # fail.
        for url in self.urls():
            # you can change this to POST
            pruneAsync(url, purge_type='GET')
        # Empty urls queue for this thread
        self._reset()

    def queue(self, url_view):
        self.append(url_view)

call_utility = CallAsynchronous()

Create this instance once in your code, when Zope starts.
Then, anywhere in your code, you can trigger an asynchronous task like this:

call_utility.queue('http://myzeoclient:8080/myplone/myview')

So at the end of the transaction, the URLs are called. With this method you can delegate heavy operations to another ZEO client. That’s all!! Easy, no?
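
The idea of "collect during the transaction, send only on commit" can be sketched without Zope at all. The AfterCommitQueue class below and its send callable are hypothetical names for this illustration; it is not the CMFSquidTool Queue.

```python
class AfterCommitQueue:
    """Collect URLs during a transaction and call them only on commit."""

    def __init__(self, send):
        self._send = send  # callable taking one URL (hypothetical)
        self._urls = []

    def queue(self, url):
        self._urls.append(url)

    def commit(self):
        # Swap the list first so a failing send cannot replay old URLs.
        pending, self._urls = self._urls, []
        for url in pending:
            self._send(url)

    def abort(self):
        # Nothing is sent for a transaction that did not commit.
        self._urls = []


sent = []
tasks = AfterCommitQueue(sent.append)
tasks.queue("http://myzeoclient:8080/myplone/myview")
tasks.abort()
tasks.queue("http://myzeoclient:8080/myplone/myview")
tasks.commit()
print(sent)
```

Sending only after a successful commit means the other client never works on data that was rolled back.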

Regards Youenn.


March 6, 2010
Gael Pasgrimaud
gawel
Gawel's blurb
» Using restkit proxy in your WSGI app

Here is my use case. A few days ago I wrote an application to mirror my Flickr accounts. The pics are downloaded to the file system and the metadata is stored in CouchDB. Now I use the Flickr interface to upload and tag pics, but I have my own data on my own server. That's always cool.

Well done, but now it would be useful to have a small web app to view the pics. Right? That's what I've done. Since I love jQuery and CouchDB is fully JSON compliant, I don't want to write a complex app with tons of Python code. So the idea is to have a small WSGI app that only serves static JavaScript/HTML/CSS files, and a proxy app that serves JSON by proxying CouchDB.

For now I'm using WSGIProxy. The code is very simple and looks like this:

from wsgiproxy.exactproxy import proxy_exact_request
from webob import Request, exc

class Proxy(object):
    def __init__(self, db=None, **kwargs):
        self.db = db
    def __call__(self, environ, start_response):
        req = Request(environ)
        if req.method == 'GET':
            req.server_name = '127.0.0.1'
            req.server_port = 5984
            req.script_name = ''
            req.path_info = '/%s%s' % (self.db, req.path_info)
            resp = req.get_response(proxy_exact_request)
            resp.content_type = 'text/javascript'
        else:
            resp = exc.HTTPForbidden()
        return resp(environ, start_response)


def make_app(global_conf, **local_conf):
    conf = global_conf.copy()
    conf.update(local_conf)
    return Proxy(**conf)

That's cool and simple, and OK for small files. But if you want to handle large requests, WSGIProxy will raise a MemoryError. This is no longer fun. I've already sent a patch but don't want to bother the Paste team with that.

Then I tried restkit, thinking it could be the best library for writing a proxy app. restkit manages a pool of HTTP connections and handles large file requests in a clean way.
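
To show why streaming avoids the memory problem, here is a plain WSGI sketch that returns the upstream body chunk by chunk instead of reading it all at once. It is only an illustration: make_streaming_app, open_upstream and the chunk size are assumptions made for this example, not restkit's or WSGIProxy's code.

```python
import io

CHUNK_SIZE = 64 * 1024  # arbitrary size chosen for this sketch


def stream_body(fileobj, chunk_size=CHUNK_SIZE):
    """Yield the body piece by piece so it never sits fully in memory."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def make_streaming_app(open_upstream):
    """open_upstream(environ) must return a file-like object (hypothetical)."""
    def app(environ, start_response):
        upstream = open_upstream(environ)
        start_response("200 OK", [("Content-Type", "application/octet-stream")])
        return stream_body(upstream)
    return app


payload = b"x" * 150000
app = make_streaming_app(lambda environ: io.BytesIO(payload))
statuses = []
chunks = list(app({}, lambda status, headers: statuses.append(status)))
print(len(chunks), statuses)
```

A WSGI server iterates over the returned generator and writes each chunk as it arrives, so memory use depends on the chunk size and not on the size of the proxied file.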

The result is a small contribution: a set of WSGI applications included in the wsgi_proxy extension.

Here is a simple Paste config file to show how to use the proxies:

[server:main]
use = egg:Paste#http
port = 4969

[app:main]
use = egg:Paste#urlmap
/couchdb = couchdb
/ = proxy

[app:couchdb]
use = egg:restkit#host_proxy
uri = http://localhost:5984/mydb

[app:proxy]
use = egg:restkit#host_proxy
uri = http://benoitc.github.com/restkit/
max_connections=50
allowed_methods = get head post

No code needed. Cheers.

You can also use the Proxy class to proxy client requests and transform the response on the fly:

from webob import Request
from restkit.ext.wsgi_proxy import Proxy

proxy = Proxy()

def application(environ, start_response):
    req = Request(environ)
    req.environ['SERVER_NAME'] = 'example.com'
    req.environ['SERVER_PORT'] = '80'
    # do stuff
    ...
    resp = req.get_response(proxy)

    # do stuff ...
    ...

    return resp(environ, start_response)

I've also tried to replace WSGIProxy in deliverance with an ugly monkey patch:

from restkit.ext.wsgi_proxy import Proxy
from wsgiproxy import exactproxy
print 'Patching exactproxy with restkit proxy'
exactproxy.proxy_exact_request = Proxy(max_connections=4, allowed_methods=['GET', 'HEAD', 'POST']).__call__

It works perfectly.

March 4, 2010
» Resizing and cropping thumbnails with jquery

Resizing and cropping images with jquery

Sometimes we need images with fixed dimensions (carousels, photo albums…), but not all images have the right dimensions or the right height/width ratio.

This little jQuery script will help you with this use case.

In this example we will fix the dimensions and position of thumbnails placed in blocks using the class « imageContainer ». « imageContainer » has a fixed width and height (in this example 250px/150px). The thumbs will be enlarged or reduced, and each thumb will be moved inside its container to get a centered crop. Of course, the images and their containers must be relatively positioned.

Your HTML template code looks like this:

  <a class="imageContainer" href="the_link">
    <img src="the_image_src" alt="" width="450" height="300" />
  </a>
  <a class="imageContainer" href="another_link">
    <img src="another_image_src" alt="" width="200" height="100" />
  </a>

In your CSS, you must set at least:

.imageContainer,
.imageContainer img {
  position: relative;
}

.imageContainer {
  width: 250px;
  height: 150px;
  overflow: hidden;
  border: 1px solid grey;
}

The JavaScript code is simple:

var canImproveResolution = false;

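thumbResize = function(thumb, min_width, min_height, orientation) {
    // declare locals so they do not leak into the global scope
    var ratio, tratio, moveleft, movetop;
    var twidth = jQuery(thumb).width();
    var theight = jQuery(thumb).height();
    var new_height = theight;
    var new_width = twidth;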
    // strange 1px bug on MSIE
    if (jQuery.browser.msie) {
        min_width = min_width+1;
        min_height = min_height+1;
    }
    //calculate the good size
    if (orientation=='landscape') {
        ratio = min_width/min_height;
        tratio = twidth/theight;
        if (tratio>ratio) {
            new_height = min_height;
            new_width = parseInt(new_height*tratio);
        }
        else {
            new_width = min_width;
            new_height = parseInt(new_width/tratio);
        }
    }
    else {
        ratio = min_height/min_width;
        tratio = theight/twidth;
        if (tratio>ratio) {
            new_width = min_width;
            new_height = parseInt(new_width*tratio);
        }
        else {
            new_height = min_height;
            new_width = parseInt(new_height/tratio);
        }
    }

    // resize thumb
    if (theight!=new_height || twidth!=new_width) {
        jQuery(thumb).height(new_height);
        jQuery(thumb).width(new_width);
    }
    // adjust position (vertical and horizontal centering for css cropping)
    if (new_width > min_width) {
        moveleft = parseInt((new_width-min_width)/2);
        jQuery(thumb).css('left', '-'+moveleft+'px');
    }
    if (new_height > min_height) {
        movetop = parseInt((new_height-min_height)/2);
        jQuery(thumb).css('top', '-'+movetop+'px');
    }
    // improve resolution
    if (canImproveResolution && (new_width > twidth || new_height > theight)) improveResolution(thumb);
}

resizeThumbs = function(thumbs, min_width, min_height) {
    var orientation = (min_height > min_width) ? 'portrait' : 'landscape';
    // hide images before resizing to avoid bad visual effect
    thumbs.each( function() {
            jQuery(this).css('visibility', 'hidden');
        })
    // use window.load because on document.ready images are not always loaded
    jQuery(window).load(function() {
        thumbs.each( function() {
            thumbResize(this, min_width, min_height, orientation );
            jQuery(this).css('visibility', 'visible');
        })
    });
}

// change image src for a better resolution
// this is just an example for Plone, adapt it with
// your own sizes rules

improveResolution = function(img) {
    var img_src = img.src;
    img_src = img_src.replace(/\/image_preview/gi, '/image');
    img_src = img_src.replace(/\/image_mini/gi, '/image_preview');
    img_src = img_src.replace(/\/image_thumb/gi, '/image_mini');
    img.src = img_src;
}

For Plone users, there is a small improvement if you set the variable:

canImproveResolution = true;

you will get a better resolution when images are enlarged (this works only with the standard Plone image thumb sizes…).

Finally, launch the script for the wanted thumbnails:

jQuery(document).ready(function(){
  resizeThumbs(jQuery('.imageContainer img'), 250, 150)});
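
The sizing arithmetic of thumbResize can be checked outside the browser. The Python sketch below folds the landscape and portrait branches into a single ratio comparison (my simplification; cover_fit is a name made up for this example) and ignores the 1px MSIE adjustment. It is run on the two images from the HTML example above, in a 250x150 container.

```python
def cover_fit(width, height, box_width, box_height):
    """Return (new_width, new_height, left, top) for a centered crop."""
    image_ratio = width / height
    box_ratio = box_width / box_height
    if image_ratio > box_ratio:
        # Image is relatively wider than the box: match heights, crop sides.
        new_height = box_height
        new_width = int(new_height * image_ratio)
    else:
        # Image is relatively taller than the box: match widths, crop top/bottom.
        new_width = box_width
        new_height = int(new_width / image_ratio)
    left = -((new_width - box_width) // 2)
    top = -((new_height - box_height) // 2)
    return new_width, new_height, left, top


first = cover_fit(450, 300, 250, 150)
second = cover_fit(200, 100, 250, 150)
print(first)   # (250, 166, 0, -8)
print(second)  # (300, 150, -25, 0)
```

The 450x300 image is reduced and shifted up by 8px; the 200x100 image is enlarged and shifted left by 25px, which is the case where a higher-resolution source is worth loading.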

February 6, 2010
» Plone and multilingual sites

Usually we build multilingual Plone sites with LinguaPlone.

This solution has a big advantage: it’s generic and very easy to set up in a Plone site.

But there are many drawbacks, due to the design of this product (translations are independent by design):

  • Each translation is a new Archetypes object, which can be a big problem on sites with a lot of content: the number of portal objects is multiplied by the number of available languages.
  • Translations use the Plone reference catalog to link to the original (called canonical) object. But when objects are moved, translations are not moved; when objects are copied and pasted, translations are not pasted; when objects are deleted, translations are not deleted; when contents are reordered, translations are not reordered; when objects are published, translations are not published… For webmasters, maintaining a site with LinguaPlone inside can be a challenge.
  • When translating folders with LP, all translated contents are moved from the canonical folder to the translated folder of the same language. Translating low-level folders on sites with deep trees can take a very long time; don’t be surprised if you get errors.
  • If a content item is neutral (with no language attribute) inside a translated folder, it cannot be seen when browsing a translation of the parent folder.
  • Finally, a lesser problem: the translation edit forms are not pleasant to use. They show a table with two columns, the first with the « canonical » content in « view » mode, the second with the translation edit form. On fixed-width sites, the translation form is sometimes ridiculously narrow.

For many years we have nevertheless used LinguaPlone, because it was the only easy way to make multilingual Plone sites.

In the past we used an LP patch called LinguaFace to reduce the number of problems with LP (synchronisation on reorder, copy-paste, delete, or move; neutral contents visible inside all translated folders; more usable translation edit forms…), but LF adds a new layer of complexity, and maintaining it across all LP versions became complicated. Here are some examples of how it works:

  • when a content item is copied, all its translations are copied
  • when a content item is pasted or moved, all its translations are pasted or moved to the right place (not so easy)
  • when a content item is published or retracted, its translations follow the same workflow transition
  • when a folder is translated, we see not only the objects inside it but also objects with the same canonical path (a new catalog index), so that neutral contents are visible too
  • the navtree and the breadcrumbs are patched to use the canonical path
  • and so on…

A big nightmare.

Today a new solution exists that stores translations inside each Archetypes field: raptus.multilanguagefields, together with a Plone integration called raptus.multilanguageplone that extends the schema of all Plone content types to make them translatable. raptus.multilanguagefields also provides multilingual catalog indexes that return the right translated data when searching for contents or displaying trees, and multilingual criteria for topics.
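
To picture what "storing translations inside each field" means, here is a tiny sketch of a field value that keeps one entry per language. Everything in it is hypothetical, written only for this illustration: the MultilingualValue class, and in particular the fallback to a default language, are my assumptions and not the raptus implementation.

```python
class MultilingualValue:
    """Hypothetical container: one field value per language code."""

    def __init__(self, default_language):
        self.default_language = default_language
        self._values = {}

    def set(self, language, value):
        self._values[language] = value

    def get(self, language):
        # Assumed behaviour: fall back to the default language
        # when no translation exists for the requested one.
        if language in self._values:
            return self._values[language]
        return self._values.get(self.default_language)


title = MultilingualValue(default_language="en")
title.set("en", "Welcome")
title.set("fr", "Bienvenue")
print(title.get("fr"), title.get("de"))
```

With this layout there is a single content object, so moving, copying, deleting or publishing it automatically applies to every language.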

One LP feature not provided by raptus.multilanguagefields is the internationalization of URLs. If you really need this feature, I think it would not be a big challenge to add some traversal rules; for me it’s not essential.

Another LP feature that can’t be provided when storing translations in fields is having different workflows or security settings for each translation. If you need this feature, use LinguaPlone; it is made for that, since it makes all translated contents independent. But I’m curious to know how many users really want this feature: none of my clients ever asked for it and, after some LP experience, they always wanted the exact opposite.

To make your Plone site translatable with raptus.multilanguagefields, you have two choices:

  1. Add raptus.multilanguageplone to your buildout and install it in your Plone site using the products control panel, to make all your Plone content types and their derivatives translatable (all fields for which translation makes sense are translatable).
  2. I prefer to integrate raptus.multilanguagefields myself inside a product, since we may want only some fields or content types to be translatable. For example, I don’t need to translate the contents of images or files, just their titles and descriptions.

How to implement the second solution?

Just take a look at the raptus.multilanguageplone code, it’s easy:

  1. Write Archetypes extenders to make the fields you want translatable, example can be found here
  2. Register your extenders in setuphandlers (Generic Setup import step), example here
  3. Replace the standard catalog indexes with multilingual indexes in the Generic Setup profile, example here

That’s all. You will get superb edit forms with translatable fields inside, and you will also get help from Google to translate contents (a pleasant gadget). I say « Bravo » to the raptus developers.

Important:

  • These products are young, and there is still a lot of work to do to make them work without problems (tests are needed…). The latest 0.6 releases have bugs under Plone 3.3; use the svn versions below instead.
  • raptus.multilanguageplone 0.6 has a bug in extenders with primary fields, tested in Plone 3.3 (fixed in branch aws_evols, not tested with images at this time).
  • raptus.multilanguagefields 0.6 has a bug: doctests are broken in Plone 3.3 (fixed in trunk).
  • At this time these products don’t have unit tests or functional tests; it’s the only criticism I can make. I started the work here, and here

February 4, 2010
Gael Pasgrimaud
gawel
Gawel's blurb
» Avoid CSRF in the Ajax world

Most (good) web frameworks have a way to prevent Cross Site Request Forgery, but in the Ajax world it's not so easy.

Here is my solution. First generate a secret key (sha1 of the current date or whatever), store it in the user's session and render a hidden field in the requested web page:

<input type="hidden" id="_req" name="_req" value="your secret key" />

Here is a snippet for Pylons:

def secure_field(with_id=False):
    value = request.environ.get('_req', None)
    if value is None:
        # generate a key if not already done for the request
        value = sha.new('%s-%s' % (datetime.now(), random.random())).hexdigest()
        session['_req'] = value
        request.environ['_req'] = value
        log.info('setting secure key to %r', value)
    if with_id:
        return hidden('_req', id='_req', value=value)
    return hidden('_req', value=value)

Notice that the field can be rendered multiple times with the same key (but only once with an id, because of XHTML). The key is generated once per request. This allows multiple _req fields in the same page, so you can also add it to non-Ajax forms (and secure them too).

Then you need to POST this key with each Ajax request. Here is a wrapper for jQuery's post method:

post: function(url, data, callback, dataType) {
    // wrap $.post to add _req field
    if (!dataType) dataType = 'html';
    if (typeof(data) == typeof('')) {
        // $(form).serialize() return a string
        data += '&_req='+$('#_req').val();
    } else {
        data['_req'] = $('#_req').val();
    }
    $.post(url, data, callback, dataType);
}

The _req key is added to each POST request.

One last thing. You need to check that the key stored in the user's session is also in the POST data. Here are two Pylons decorators to reject illegal requests:

@decorator
def secure_post(func, *args, **kwargs):
    """return html"""
    if request.method == 'POST':
        _req = session.get('_req', None)
        if _req is not None and _req == request.POST.get('_req'):
            del request.POST['_req']
            data = func(*args, **kwargs)
            return data
    if request.environ.get('paste.testing') is True:
        return func(*args, **kwargs)
    return _('Forbidden')

@decorator
def secure_json(func, *args, **kwargs):
    """return json"""
    # default to an error; only a valid POST or a test run replaces it
    data = dict(error=_('Forbidden'))
    if request.environ.get('paste.testing') is True:
        data = func(*args, **kwargs)
    elif request.method == 'POST':
        _req = session.get('_req', None)
        if _req is not None and _req == request.POST.get('_req'):
            del request.POST['_req']
            data = func(*args, **kwargs)
    response.content_type = 'application/json'
    return json.dumps(data)

That's it. Now you can be confident that all Ajax requests come from the user's web page. Cheers.

Of course the key is valid for more than one Ajax request. You may regenerate it for each main HTML page. Maybe this could be improved by changing the key on each request, including Ajax ones, but... It's already secure. Right?
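
For readers outside Pylons, the same three steps (issue a key, check it, regenerate it) can be sketched with the standard library only. This is a sketch under assumptions, not the code above: the session is a plain dict, the helper names are made up, and it swaps in secrets and hmac.compare_digest in place of the sha1-of-date key and the == comparison.

```python
import hmac
import secrets


def issue_token(session):
    """Return the session's _req key, creating it on first use."""
    token = session.get("_req")
    if token is None:
        token = secrets.token_hex(20)  # 40 hex characters
        session["_req"] = token
    return token


def is_valid(session, posted):
    """Compare the posted key with the session key in constant time."""
    expected = session.get("_req")
    if expected is None or posted is None:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), posted.encode("utf-8"))


def rotate_token(session):
    """Drop the current key and issue a fresh one."""
    session.pop("_req", None)
    return issue_token(session)


session = {}
token = issue_token(session)
accepted = is_valid(session, token)
rejected = is_valid(session, "forged")
new_token = rotate_token(session)
stale = is_valid(session, token)
print(accepted, rejected, stale)
```

Rotating on each main page load, as suggested above, simply means calling rotate_token when the page is rendered and putting the new value in the hidden field.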

January 12, 2010
» DateTime against mx.DateTime

I noticed that a Zope application I have to maintain is slow because of the DateTime class.
Profiling the application’s tests shows this class at the top of the time spent.
So I wanted to test another implementation, named mx.DateTime. The difference is that mx.DateTime is written in C.
So, in a terminal, I installed the two eggs via easy_install:

./bin/easy_install DateTime
./bin/easy_install egenix-mx-base

And wrote a little script to test the two APIs:

import sys
from mx import DateTime as mxDateTime
from DateTime import DateTime
from datetime import datetime
from time import time

def create_mxdatetime():
    return mxDateTime.now()

def create_zopedatetime():
    return DateTime()

def create_datetime():
    return datetime.now()

def bf(f, i):
    t1 = time()
    for i in xrange(i):
        f()
    t2 = time()
    print "bench for %s is %s" % (str(f), t2 - t1)

bf(create_mxdatetime, int(sys.argv[1]))
bf(create_zopedatetime, int(sys.argv[1]))
bf(create_datetime, int(sys.argv[1]))

This script just creates dates with the three implementations: Zope, mx and the standard library.
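
For reference, the same kind of measurement can be taken with the standard timeit module. The sketch below is limited to the standard library's datetime, since the other two implementations need third-party packages; the bench helper is a name chosen for this example. The figures quoted in this post come from the script above, not from this sketch.

```python
import timeit
from datetime import datetime


def bench(func, number):
    """Return the total seconds spent calling func `number` times."""
    return timeit.timeit(func, number=number)


elapsed = bench(datetime.now, 1000)
print("bench for datetime.now x1000 is %.6f s" % elapsed)
```

timeit uses a high-resolution clock, which gives more stable numbers than wrapping the loop with time.time().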

And the results:

bash-3.2$ bin/python bench.py 1000
bench for <function create_mxdatetime at 0x7db70> is 0.00120091438293
bench for <function create_zopedatetime at 0x4215f0> is 0.84446310997
bench for <function create_datetime at 0x4214b0> is 0.00220394134521
bash-3.2$ bin/python bench.py 10000
bench for <function create_mxdatetime at 0x7db70> is 0.0117778778076
bench for <function create_zopedatetime at 0x4215f0> is 8.81699991226
bench for <function create_datetime at 0x4214b0> is 0.041069984436
bash-3.2$ bin/python bench.py 100000
bench for <function create_mxdatetime at 0x7db70> is 0.11746096611
bench for <function create_zopedatetime at 0x4215f0> is 87.8845770359
bench for <function create_datetime at 0x4214b0> is 0.222129106522

No comment… And what about memory?

So

bash-3.2$ bin/easy_install pympler

and now:

>>> from pympler import asizeof
>>> from datetime import datetime
>>> from mx import DateTime as mxDateTime
>>> from DateTime import DateTime
>>> asizeof.asizesof(DateTime() , mxDateTime.now(), datetime.now())
(1760, 56, 32)

Whoa!! Creating a Zope DateTime is about 750 times slower than creating an mx.DateTime (87.88 s against 0.117 s for 100,000 instances), and it takes about 31 times more memory (1760 against 56 bytes).
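
The ratios can be recomputed directly from the figures listed above (the 100,000-iteration run and the asizesof result). The dictionary names are just labels chosen for this sketch.

```python
# Seconds for 100000 creations, copied from the benchmark output.
timings = {"mx": 0.11746096611, "zope": 87.8845770359, "stdlib": 0.222129106522}
# Bytes per instance, copied from the asizesof result.
sizes = {"zope": 1760, "mx": 56, "stdlib": 32}

speed_ratio = timings["zope"] / timings["mx"]
size_ratio = sizes["zope"] / sizes["mx"]
print("creation: %.0fx slower, memory: %.1fx larger" % (speed_ratio, size_ratio))
# creation: 748x slower, memory: 31.4x larger
```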

I think replacing DateTime could be a good performance improvement for Zope, no?

With DateTimeNG (Zope DateTime built on mx.DateTime), performance is 10 times better than with DateTime. Memory consumption is the same.

From the mx.DateTime documentation:
Comparing the types to time-module based routines is not really possible,
since the used strategies differ. You can compare them to tuple-based
date/time classes though: DateTime[Delta] are much faster on creation, use
less storage and are faster to convert to the supported other formats than
any equivalent tuple-based implementation written in Python.
Creation of time-module values using time.mktime() is much slower than
doing the same thing with DateTime(). The same holds for the reverse
conversion (using time.localtime()).
The storage size of ticks (floats, which the time module uses) is about 1/3
of the size a DateTime instance uses. This is mainly due to the fact that
DateTime instances cache the broken down values for fast access.
To summarize: DateTime[Delta] are faster, but also use more memory than
traditional time-module based techniques.


December 23, 2009
» Migrating your old sites to Plone 3 or Plone 4


Many of you face this concern, and so do I: migrating your old CPS or Plone 2 sites to a more mature and more durable technology, namely Plone 3 or Plone 4.

This post does not claim to solve these questions with the sophisticated, and nonetheless interesting, framework called Products.contentmigration; more humbly, it gives you a few handy tricks.

For example, you don’t know how to get rid of those cursed Access Rules which, and you can’t say you weren’t warned, were dangerous. It’s easy; you need one small import:

from ZPublisher.BeforeTraverse import unregisterBeforeTraverse

and on each migrated object:

rules = unregisterBeforeTraverse(obj, 'AccessRule')
if rules:
    try:
        del getattr(obj, rules[0].name).icon
    except Exception:
        pass

Now for something silly but with heavy consequences. When you migrate content from one meta_type to another, most of the time you have to copy and paste the object and reassign all its old attributes. But suppose the object is structurally the same, or nearly so: why bother (imagine how much it can hurt for a folder at the root of a site)? All it takes is:

obj.__class__ = new_class

you are smart enough to find new_class and new_portal_type…

and, when the result is an Archetypes object:

obj.updateSchema()

This is an example not to follow, but like all examples of its kind it is very handy (it saves days or weeks of headaches).
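
Here is what the class swap does, shown on two plain Python classes. OldDocument and NewDocument are hypothetical names for this illustration; real content classes carry much more state, so take this as a sketch of the mechanism only.

```python
class OldDocument:
    def __init__(self, title):
        self.title = title


class NewDocument:
    def __init__(self, title):
        self.title = title

    def summary(self):
        return "Document: %s" % self.title


obj = OldDocument("Annual report")
# The instance keeps its attributes; only its behaviour changes.
obj.__class__ = NewDocument
print(obj.summary())
```

No attribute is copied and the object keeps its identity, which is exactly why the trick is fast, and also why it only works when both classes expect the same attributes.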

There will probably be a follow-up to this post.

December 9, 2009
» How to add a counter without conflict errors in Zope?

I noticed that under load, a counter in Zope 2 can generate conflict errors.

Why?

Because two threads want to change the value of the same variable. Conflict errors are explained here:
http://wiki.zope.org/zope2/ConflictErrors

But in some cases it’s useful to have a global counter that is incremented on certain operations. Cache Fu has such counters for caching purposes. However, under load these counters generate conflict errors. There is a solution: resolve the conflict by hand.

In the Zope source I noticed a class that implements this use case, so I tried building a counter with it. I tested two implementations of the counter: one which was simply an int, and one which was a Products.Transience.Transience.Increaser.
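
ZODB lets a persistent class settle write conflicts itself through a _p_resolveConflict(old_state, saved_state, new_state) method. The sketch below shows, on plain integers, two strategies such a method could use for a counter. Both functions are written for this illustration and are not copied from Products.Transience.

```python
def resolve_by_max(old, committed, pending):
    """Keep the highest value seen; two concurrent +1 collapse into one."""
    return max(old, committed, pending)


def resolve_by_delta(old, committed, pending):
    """Re-apply both changes relative to the common ancestor."""
    return old + (committed - old) + (pending - old)


# Two threads both read 10 and both try to store 11.
old, committed, pending = 10, 11, 11
print(resolve_by_max(old, committed, pending))    # 11
print(resolve_by_delta(old, committed, pending))  # 12
```

The first strategy is enough when the counter only has to change (for example as a cache key); the second is needed when every increment must be counted.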

The init code of the first counter looks like that :

tool.counter = Products.Transience.Transience.Increaser(0)

The second counter looks like that:

tool.counter2 = 0

I tested incrementing both implementations under siege:

First test (Increaser)

Transactions:		         200 hits
Availability:		      100.00 %
Elapsed time:		        6.68 secs
Data transferred:	        0.00 MB
Response time:		        0.10 secs
Transaction rate:	       29.94 trans/sec
Throughput:		        0.00 MB/sec
Concurrency:		        2.97
Successful transactions:         200
Failed transactions:	           0
Longest transaction:	        0.44
Shortest transaction:	        0.01

0 conflict errors

Second test (int)

## 20 users

Transactions:		         200 hits
Availability:		      100.00 %
Elapsed time:		       11.35 secs
Data transferred:	        0.06 MB
Response time:		        0.26 secs
Transaction rate:	       17.62 trans/sec
Throughput:		        0.00 MB/sec
Concurrency:		        4.56
Successful transactions:         197
Failed transactions:	           0
Longest transaction:	        6.09
Shortest transaction:	        0.01

49 conflicts (3 unresolved)
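Resolving the conflict "by hand" means giving the object a _p_resolveConflict method, which ZODB calls with three states instead of raising a ConflictError. The sketch below is not the Increaser source: MergingCounter is a hypothetical class, and merging the two increments is an assumption about what a hit counter wants. In a real Zope setup it would subclass persistent.Persistent; it is a plain class here only so that it runs without ZODB.

```python
class MergingCounter(object):
    """Hypothetical counter whose concurrent increments can be merged."""

    def __init__(self, value=0):
        self.value = value

    def increment(self, delta=1):
        self.value += delta

    # The whole state of the object is the integer itself.
    def __getstate__(self):
        return self.value

    def __setstate__(self, value):
        self.value = value

    def _p_resolveConflict(self, old, saved, new):
        # old:   state both transactions started from
        # saved: state already committed by the other transaction
        # new:   state this transaction is trying to commit
        # Keep both increments instead of failing.
        return saved + (new - old)


# Two transactions start from 10; one commits 11, the other wants to commit 12.
merged = MergingCounter()._p_resolveConflict(10, 11, 12)
```

Here merged is 13, so neither increment is lost.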


Novembre 21, 2009
» Jumped in a new job and contemplating new challenges

I have recently left Ingeniweb for a job in an international organization involved in the promotion of cultural diversity and sustainable development, to mention only two of its fields of activity.

I am in charge of website projects (in the context of the Communication department). There is a lot to do, the first thing being understanding the context, the way things work here, and the people’s expectations.

As for the tools and technologies side of things, there is not currently anything Python-based here. PHP (SPIP) rules ! Well, I guess it’s perceived as easy and more importantly, skills exist everywhere (me looking in the direction of “web agencies”).
So, one of the interesting things will be to introduce Python, where it makes sense, in the not-too-distant (hopefully) future. Anyway, I am confident that it will happen one day. After all, it’s not always bad to be ahead of your time. You just have to wait, monitor, and talk to people every time there is an opportunity. And you might be lucky and convince those who take the time to listen ;)


Novembre 13, 2009
» Memory Profiler for zope

I have just released a small tool for detecting memory leaks in Zope 2, called Products.MemoryProfiler.

Internally it uses heapy (http://guppy-pe.sourceforge.net/#Heapy); it is just an interface to that tool.

It provides an HTTP interface in the Zope control panel for inspecting the current memory.

When you start profiling, you take a snapshot of the memory at instant t.
When you click updateSnapshot, the memory profiler tells you which objects were added between the start and the updateSnapshot click. This is useful for detecting memory leaks.
Each snapshot is stored (as a string) in MemoryProfiler so that it can be consulted later (through a link showing its date).
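The snapshot-then-diff idea can be sketched with the standard library alone. This is not heapy and not the code of Products.MemoryProfiler: it swaps in the gc module and only counts live, gc-tracked objects by type name. The names snapshot, growth_between and Leaky are made up for the illustration.

```python
import gc
from collections import Counter


def snapshot():
    """Count live, gc-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth_between(before, after):
    """Return the types whose live count grew between two snapshots."""
    return dict(
        (name, after[name] - before[name])
        for name in after
        if after[name] > before[name]
    )


class Leaky(object):
    """Hypothetical class whose instances pile up."""


start = snapshot()                      # "start profiling"
leaked = [Leaky() for _ in range(100)]  # objects created afterwards
growth = growth_between(start, snapshot())  # "updateSnapshot"
```

The growth dictionary then lists Leaky with a count of 100, alongside whatever else was allocated in between.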

The "clear db cache" button clears the ZEO caches of all mount points, so you can see how much memory those caches use.

Windows users must compile guppy. There is an egg for Python 2.6 but
not for Python 2.4. I get a fatal error when compiling guppy with MinGW. I hope we will soon have a binary egg for Python 2.4 too.

I hope this tool gives us useful information about the memory consumed by Zope.

memory detail

Octobre 11, 2009
Gael Pasgrimaud
gawel
Gawel's blurb
» FormAlchemy 1.3 status

FormAlchemy 1.3 is released. From the website:

FormAlchemy eliminates boilerplate by autogenerating HTML input fields from a
given model. FormAlchemy will try to figure out what kind of HTML code should
be returned by introspecting the model's properties and generate ready-to-use
HTML code that will fit the developer's application.

Why did I choose FormAlchemy over another form library? Everybody knows that explicit is better than implicit, so most form libraries use a schema to define how widgets are rendered. FormAlchemy avoids that by using the schema of the data model (originally SQLAlchemy mappers). So it generates forms implicitly from your explicit data model.
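To make the idea of deriving a form from the data model concrete, here is a toy sketch in plain, modern Python. It does not use FormAlchemy and does not reflect its internals: TextField, Pet and render_fields are hypothetical names invented for the illustration.

```python
from html import escape


class TextField(object):
    """Hypothetical field type; stands in for a real model column."""

    def __init__(self, label):
        self.label = label


class Pet(object):
    """Hypothetical model: the schema is just its class attributes."""

    name = TextField("Name")


def render_fields(model_class, instance=None):
    """Build one <input> per TextField found on the model class."""
    values = vars(instance) if instance is not None else {}
    rows = []
    for attr, field in sorted(vars(model_class).items()):
        if not isinstance(field, TextField):
            continue  # skip docstrings, methods and other class attributes
        rows.append(
            '<label for="{0}">{1}</label>'
            '<input type="text" id="{0}" name="{0}" value="{2}" />'.format(
                attr, escape(field.label), escape(str(values.get(attr, "")))
            )
        )
    return "\n".join(rows)


rex = Pet()
rex.name = "Rex"  # the instance value shadows the class-level field
html_form = render_fields(Pet, rex)
```

No widget schema is written anywhere: the form follows from the model definition, which is the point made above.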

Another reason is that FormAlchemy is independent. This means you can use it with Pylons, Django, repoze.bfg, bobo and (put your favorite framework here).

That's cool. But we can do more, and that's what we tried to do. SQLAlchemy is not the only library for defining a data model. So let's use the others!

Using couchdbkit

couchdbkit lets you define a schema for storing data in CouchDB. CouchDB is an Apache Software Foundation project that is emerging as one of the good modern non-SQL database solutions. Let's define a Pet document using couchdbkit:

>>> from formalchemy.ext import couchdb
>>> from couchdbkit import schema
>>> class Pet(couchdb.Document):
...     name = schema.StringProperty()

What about the form ? Here it is:

>>> fs = couchdb.FieldSet(Pet)
>>> fs = fs.bind(Pet())
>>> print fs.render()

So easy.

Why couchdbkit and not couchdb-python? I don't know. I guess it's doable with couchdb-python too. The only reason is the same as for Jean Sarkozy's potential election to the presidency of the EPAD: I know Nicolas, HAHA. No. I know Benoît Chesneau (aka benoitc). Benoît has released some good stuff related to CouchDB, in both Python and Erlang. Have a look at his bitbucket account.

Using zope.schema

I come from the Zope world, so I know zope.schema. Most Python coders are afraid of the word Zope, but they are wrong. Today we can say that Zope is no longer a framework but a set of well-tested and well-documented libraries. So let's define a small schema:

>>> from zope import interface
>>> from zope import schema

>>> class IPet(interface.Interface):
...     name = schema.TextLine(u'name')

Now we need an object to store values:

>>> class Pet(object):
...     interface.implements(IPet)

Let's use FormAlchemy to render a form for this pet:

>>> from formalchemy.ext.zope import FieldSet
>>> fs = FieldSet(IPet)
>>> fs = fs.bind(Pet())
>>> print fs.render()

That's it. We (at Alterway) use it in a customer project based on repoze.bfg, with Zope's ZODB as the backend. It just works.

Using RDFAlchemy

RDFAlchemy defines schemas to describe an RDF node. I know almost nothing about RDF, but implementing a FormAlchemy extension to support RDFAlchemy was easy, so it's now in FormAlchemy. It's tagged as experimental, but it works AFAIK.

One of the first things I'd like to build is a FOAF profile editor using FormAlchemy's RESTController (see below). I don't know how hard it would be, but it could be awesome.

Pylons CRUD interface

I love Pylons, simply because it's simple and has full WSGI support. FormAlchemy has had a Pylons extension for a while that generates an admin UI à la Django. This was cool but not perfect. I've added a new module to FA's Pylons extension that generates a RESTful CRUD interface based on a data model, where data model means any model supported by FormAlchemy.

As far as I can tell, this works with SQLAlchemy and couchdbkit models. So I guess it's also usable with the RDF stuff. I will try soon.

There are two controllers: RESTController and ModelsController. RESTController renders a CRUD interface for a single model. ModelsController renders an admin UI for all the models it finds.

This is highly customisable. You just need to change one template.

Have a look at the documentation to read more about that.

fa.jquery

fa.jquery is a standalone package that provides a set of widgets based on jquery.ui. Have a look at the demo page. It also has a plugin registry that lets you write your own widgets with a few lines of JavaScript.

You can change the default jquery.ui theme (redmond) and use your own; jquery.ui provides a theme editor.

Shabti

Shabti is a set of Pylons templates initiated by Graham Higgins. There is now a FormAlchemy template in Shabti to quickly initialize a Pylons project with FormAlchemy and fa.jquery. The template also sets up a CRUD interface in the admin controller for you.

Here are some screenshots of the admin UI using fa.jquery:

User listing:

http://www.gawel.org/thumbs/blog/shabti_users.png

User edit form:

http://www.gawel.org/thumbs/blog/shabti_user_edit.png

Note that, at this time, Shabti requires pylons-dev.

And now...

So what does the future hold for FormAlchemy? This has not been discussed yet, but I'd like to reduce the amount of code. This means using external dependencies.

FormAlchemy has a helpers.py module that mostly comes from WebHelpers, so I guess we can remove it and use WebHelpers instead.

FormAlchemy's validation code is simple, but FormEncode is popular and powerful, so it could be a good idea to use it for validation.

I'd also like to add some Pylons-related stuff to fa.jquery to improve the CRUD interface, for example some Ajax to add a new record from a relation widget, and maybe inline editing.

One last thing. At this time I'm the only contributor. Alex and Jonathan don't have time to contribute or are involved in other projects. If you'd like to contribute, you can fork the FormAlchemy repository on bitbucket and use the pull request feature to submit patches (Google Code is still the official repository). But please, FormAlchemy is a well-tested library, so run the tests before submitting a patch and make sure your changes don't break anything. Well-tested patches are always welcome. Thanks!

That's it for now. Hope you enjoyed it.

Août 31, 2009
» How to configure a custom Vary tag for squid

You want to set up an authenticated cache with apache/squid-varnish/plone and you don’t
know how to do that: it’s possible with the Vary tag.

The Vary header tells the proxy cache which request headers make an object vary in the cache.
For example, if you tell the cache that the variant is Cookie, then for the same URL with different cookie values the cached results are different.
The server tells the proxy (in the response) which headers are considered for Vary by sending Vary: list of request header names
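One way to picture this is to treat the cache key as the URL plus the values of the request headers named in Vary. The sketch below is a simplified model, not how squid or varnish store variants internally; variant_key and the sample requests are invented for the illustration.

```python
def variant_key(url, request_headers, vary):
    """Return the key a Vary-aware cache could use for one stored variant."""
    wanted = sorted(
        name.strip().lower() for name in vary.split(",") if name.strip()
    )
    # Header names are case-insensitive, so normalise them first.
    headers = dict((key.lower(), value) for key, value in request_headers.items())
    return (url,) + tuple((name, headers.get(name, "")) for name in wanted)


alice = {"Cookie": "lang=fr", "User-Agent": "A"}
bob = {"Cookie": "lang=en", "User-Agent": "A"}
carol = {"Cookie": "lang=fr", "User-Agent": "B"}

# The server answered with "Vary: Cookie" for /news.
key_alice = variant_key("/news", alice, "Cookie")
key_bob = variant_key("/news", bob, "Cookie")
key_carol = variant_key("/news", carol, "Cookie")
```

Alice and Bob get different cache entries because their cookies differ, while Alice and Carol share one because User-Agent is not listed in Vary.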

In CacheFu, you can configure that per rule with varyExpression.

In the global CacheFu configuration you can also set a global Vary header. By default this setting is sent with rule.portal_cache_settings.getVaryHeader()

You can enable or disable Vary with the header_set configuration (vary field).

The headers named in Vary must be present in the browser’s request (not the response) in order to be considered as variants by the proxy cache. So we are limited to the standard headers of the HTTP protocol.

But with cookies and Apache (Apache sits in front of squid) we can work out a strategy to build a more effective Vary tag.

The second aspect of cache handling is purging content when the content changes.

PURGE of Vary objects is still very poorly supported in squid: you can only purge one variant at a time, and you need to get the URL cached again before being able to purge another variant. So how do we deal with that as well?

First, how do we build our Vary tag?

The trick is to construct a custom Vary tag with Apache.
We can do this with RewriteRule::

RewriteCond %{HTTP_COOKIE} mycookie="([^"]+) [NC]
RewriteRule ^(.*)$ - [E=mycookie:%1]

So in this example the mycookie environment variable contains the value of the mycookie cookie.

You can add a cookie for the language, a cookie for the group, a cookie for a permission and so on, and then construct your custom Vary tag from the values of these specific cookies with mod_headers

RequestHeader append MyVary %{mycookie}e

And then the value of mycookie is treated as a variant, as long as the Vary header sent by the server lists MyVary.

If you want a specific Vary tag for anonymous users, you can test for the presence of the
__ac cookie and send a custom MyVary in that case

RewriteCond %{HTTP_COOKIE} __ac="([^"]+) [NC]
RewriteRule ^(.*)$ - [E=authenticated:1]

RequestHeader append MyVary %{mycookie}e env=authenticated
RequestHeader append MyVary anonymous env=!authenticated

So now you can vary the cache as you want. Now, how do we handle the big issue of purging?

The trick is to have an image (or an Ajax request, or ...) in the content that is never cached. This image is served by a browser view (in the case of a Zope application) that sets a cookie. The value of this cookie is added to the Vary tag. So the Vary tag changes when the value of this cookie changes, and the content is then updated (for all requests).

For example, we can build a cookie from the catalog change counter

catalog_count = pcs.getCatalogCount()
# this response must never be cached
context.REQUEST.RESPONSE.appendHeader('Pragma', 'no-cache')
context.REQUEST.RESPONSE.appendHeader('Cache-control', 'no-cache')
cookie = context.REQUEST.cookies.get('X_CACHE_CATALOG', 0)

# refresh the cookie only when the counter has changed
if cookie != str(catalog_count):
    context.REQUEST.RESPONSE.setCookie('X_CACHE_CATALOG',
                                       catalog_count,
                                       path="/")
return catalog_count

And in Apache we add:
RewriteCond %{HTTP_COOKIE} X_CACHE_CATALOG=([^"]+) [NC]
RewriteRule ^(.*)$ - [E=X_CACHE_CATALOG:%1]
RequestHeader append MyVary %{mycookie}e:%{X_CACHE_CATALOG}e env=authenticated

And when the catalog changes, the Vary value changes too (on the second request) and the cache is updated. You can work out other strategies for purging Vary objects with this technique.
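The effect of the rules above can be modelled in a few lines of Python. This is only a simplified model of the combined rule: my_vary is a made-up helper, it is not Apache configuration, and it ignores how several RequestHeader append lines would accumulate on one request.

```python
def my_vary(cookies):
    """Model of the MyVary value built for one request from its cookies."""
    if "__ac" not in cookies:
        return "anonymous"  # every anonymous visitor shares one variant
    return "%s:%s" % (
        cookies.get("mycookie", ""),
        cookies.get("X_CACHE_CATALOG", ""),
    )


before = my_vary({"__ac": "token", "mycookie": "editors", "X_CACHE_CATALOG": "41"})
after = my_vary({"__ac": "token", "mycookie": "editors", "X_CACHE_CATALOG": "42"})
anonymous = my_vary({"mycookie": "editors"})
```

Once the catalog counter moves from 41 to 42, the same user asks for a different variant, so the stale copy is simply no longer served and no PURGE is needed.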

The last point is combining the ETag and Vary headers in the response. IE does not handle the ETag header correctly when a Vary header is present, and If-None-Match is never sent. So remove the Vary header in Apache, and ETag then works well for all browsers

Header unset Vary


Août 25, 2009
Gael Pasgrimaud
gawel
Gawel's blurb
» buildout vs pip. Why I choose buildout

pip and buildout are two ways to deal with Python eggs. Both offer the same core functionality, i.e. installing Python eggs in an isolated environment.

I learned buildout first because I come from the Zope world, where it has been the standard way to install Zope for a while now. But when pip appeared, I gave it a try.

At first glance pip seems interesting because it lets you install a bundle of packages with fixed versions. But the main problem is that pip uses virtualenv to isolate packages, so each time you need a new environment you need a new virtualenv and have to fetch all the packages again. This can take a lot of time if you are using lxml or another Python library with C code, and more and more time as your number of projects grows.

With buildout, by contrast, you can share eggs between buildout directories. You just need to tell it where they are. Add this to your ~/.buildout/default.cfg:

[buildout]
eggs-directory=/home/gawel/eggs

That's all. When buildout needs an egg, it tries to find it in this directory before fetching. If no version is found, the egg is fetched and installed in this directory. Of course, you can have more than one version per package. You can tell buildout which version to use (see below).
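The lookup described above can be sketched as follows. This is not buildout's code: find_egg is a hypothetical helper, the file name pattern is an assumption based on the egg names shown later in this post, and versions are compared naively as strings.

```python
import os
import re
import tempfile

# Assumed naming scheme: Name-version-pyX.Y.egg
EGG_NAME = re.compile(r"^(?P<name>[^-]+)-(?P<version>[^-]+)-.*\.egg$")


def find_egg(eggs_dir, name, version=None):
    """Return the path of a matching egg, or None if it would have to be fetched."""
    matches = []
    for entry in os.listdir(eggs_dir):
        found = EGG_NAME.match(entry)
        if not found or found.group("name").lower() != name.lower():
            continue
        if version is None or found.group("version") == version:
            matches.append(entry)
    if not matches:
        return None
    # Naive pick: last name in string order. Real tools compare versions properly.
    return os.path.join(eggs_dir, sorted(matches)[-1])


# Demo with a throwaway directory standing in for the shared eggs directory.
eggs_dir = tempfile.mkdtemp()
for egg in ("Pylons-0.9.6-py2.6.egg", "Pylons-0.9.7-py2.6.egg"):
    os.mkdir(os.path.join(eggs_dir, egg))

latest = find_egg(eggs_dir, "Pylons")
pinned = find_egg(eggs_dir, "Pylons", "0.9.6")
missing = find_egg(eggs_dir, "lxml")
```

Two versions of the same package coexist in the directory, a pinned version selects one of them, and only the missing package would trigger a download.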

I'm a Pylons fan so I already have all packages needed in my ~/eggs directory to install a new pylons environment. Let's create a new project in an isolated environment:

gawel:~/tmp% date
Mar 25 aoû 2009 21:49:11 CEST
gawel:~/tmp% mkdir pylons
gawel:~/tmp% cd pylons
gawel:~/tmp/pylons% vi buildout.cfg
gawel:~/tmp/pylons% buildout
Creating directory '/Users/gawel/tmp/pylons/bin'.
Creating directory '/Users/gawel/tmp/pylons/parts'.
Creating directory '/Users/gawel/tmp/pylons/develop-eggs'.
Installing eggs.
Generated script '/Users/gawel/tmp/pylons/bin/paster'.
Generated script '/Users/gawel/tmp/pylons/bin/sphinx-build'.
Generated script '/Users/gawel/tmp/pylons/bin/sphinx-quickstart'.
Generated script '/Users/gawel/tmp/pylons/bin/sphinx-autogen'.
gawel:~/tmp/pylons% ./bin/paster create -t pylons myproject
Selected and implied templates:
  Pylons#pylons  Pylons application template

Variables:
  egg:      myproject
  package:  myproject
  project:  myproject
Enter template_engine (mako/genshi/jinja2/etc: Template language) ['mako']:
(...)
  Copying templates/default_project/test.ini_tmpl to ./myproject/test.ini
Running /Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python setup.py egg_info
gawel:~/tmp/pylons% date
Mar 25 aoû 2009 21:50:24 CEST

This takes 1 min 13 s. Now try with pip. I won't ;)

That's the main reason why I choose buildout.

buildout also makes unit testing easier. Just put this config file in your package root:

[buildout]
newest = false
parts = eggs
develop = .

[eggs]
recipe = zc.recipe.egg
eggs =
  YourPackageName
  nose

Run buildout, and you'll be able to run ./bin/nosetests in an isolated environment with your package installed in develop mode (develop = .). I have one in all my projects if you need some examples.

Another reason is that buildout can be extended easily. One feature that exists in pip but not in buildout is the ability to fetch eggs from VCS URLs. This is not a built-in feature of buildout, but I've created a buildout extension for that (gp.vcsdevelop). And you know what? This extension uses pip!! ;) By the way, there is no plugin system in pip AFAIK.

Now the last reason. I wonder how pip users upgrade an existing project. Do they need to install another environment? With buildout it's easy. There is an extension that lists all the package versions used by a buildout project. You just need to use the generated file as buildout's versions.cfg and tell buildout to use it.

[buildout]
versions = versions.cfg
...

Then update this file on your production server and run bin/buildout again. That's it. Your project is up to date and uses the correct versions, simply because buildout creates its own sys.path with the required eggs.

gawel:~/tmp/pylons% cat bin/paster
#!/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python

import sys
sys.path[0:0] = [
  '/Users/gawel/eggs/Pylons-0.9.7-py2.6.egg',
  '/Users/gawel/eggs/PasteScript-1.7.3-py2.6.egg',
  '/Users/gawel/eggs/setuptools-0.6c9-py2.6.egg',
  (...)
  '/Users/gawel/eggs/WebHelpers-0.6.4-py2.6.egg',
  '/Users/gawel/eggs/Routes-1.10.3-py2.6.egg',
  ]

import paste.script.command

if __name__ == '__main__':
    paste.script.command.run()

For all those reasons, I will not use pip for now.

I know that buildout is more complicated than pip. But it's also more powerful. So, are you planning to learn how buildout works? If so, I've written a How To for Pylons. You can also find it on pylonshq. I think it can help you learn buildout even if you don't plan to use Pylons (but you should ;).

Août 24, 2009
Gael Pasgrimaud
gawel
Gawel's blurb
» How to use jQuery.getJson in a Firefox extension

jQuery works fine in a Firefox extension, and it makes it easy to manipulate the DOM of the current document.

If you need to call web services, though, it will fail. I've solved the problem and tried to submit a patch on the jquery-devel mailing list.

The problem is not solved upstream, since the fix requires code that is too Firefox-specific. So if you need this, check the patch I've submitted and apply the changes to the ajax function yourself.

Août 4, 2009
Gael Pasgrimaud
gawel
Gawel's blurb
» Serving both mercurial repository and sphinx docs at the same time

A few months ago I started a small project named MercurialApp.

The main goal is to serve my Mercurial repositories as a WSGI application with Paste. It works fine and I use it on https://hg.gawel.org.

What I'm thinking now is that it would be cool to serve the project documentation with the same application. I always have a docs/ folder in all my projects with the Sphinx documentation in it. So let's use it.

Now you can add a sphinx_docs option to the configuration. MercurialApp then adds a changegroup hook (applied when a push occurs) to all repositories. This hook looks for a docs/conf.py in the repository and, if it exists, tries to rebuild the documentation in {sphinx_docs}/html/{reponame}/docs/.
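The decision the hook makes can be sketched like this. It is not the MercurialApp hook itself and it calls neither Mercurial nor Sphinx: docs_target is a hypothetical helper that only checks for docs/conf.py and computes the output directory, and /var/sphinx_docs is a placeholder value for the sphinx_docs option.

```python
import os
import tempfile


def docs_target(repo_path, sphinx_docs):
    """Return where the HTML docs of a repository should be built, or None."""
    if not os.path.isfile(os.path.join(repo_path, "docs", "conf.py")):
        return None  # no Sphinx project in this repository
    reponame = os.path.basename(os.path.normpath(repo_path))
    return os.path.join(sphinx_docs, "html", reponame, "docs")


# Demo with two throwaway repositories, only one of which has docs/conf.py.
root = tempfile.mkdtemp()
with_docs = os.path.join(root, "projectname")
os.makedirs(os.path.join(with_docs, "docs"))
open(os.path.join(with_docs, "docs", "conf.py"), "w").close()
without_docs = os.path.join(root, "other")
os.makedirs(without_docs)

target = docs_target(with_docs, "/var/sphinx_docs")
skipped = docs_target(without_docs, "/var/sphinx_docs")
```

A real hook would then run the Sphinx build into the returned directory after each push.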

Then, if you look at the code in MercurialApp, you will see something like this:

self.app = Cascade([StaticURLParser(os.path.join(c.sphinx_docs, 'html')), hgwebdir])

As you can see, the newly generated application tries to serve a static file from sphinx_docs and, if it does not exist, serves the hgwebdir application. This way, if you fetch /projectname/docs/ you'll see the Sphinx documentation, while /projectname serves the hgwebdir application.
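The fall-through behaviour can be sketched in plain WSGI, without Paste. This is a simplified stand-in for the cascade idea, not Paste's implementation: cascade, docs_app, hg_app and call are hypothetical names, and the two toy apps merely imitate the static docs and hgwebdir.

```python
def cascade(apps):
    """Return a WSGI app that keeps the first answer that is not a 404."""
    def app(environ, start_response):
        captured = {"status": "404 Not Found", "headers": []}
        body = []

        def capture(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        for candidate in apps:
            body = list(candidate(environ, capture))
            if not captured["status"].startswith("404"):
                break  # this application handled the request
        start_response(captured["status"], captured["headers"])
        return body
    return app


def docs_app(environ, start_response):
    """Toy stand-in for the static Sphinx docs."""
    if environ["PATH_INFO"].startswith("/projectname/docs/"):
        start_response("200 OK", [("Content-Type", "text/html")])
        return [b"sphinx docs"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def hg_app(environ, start_response):
    """Toy stand-in for hgwebdir: answers everything."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"hgwebdir"]


def call(app, path):
    """Call a WSGI app for one path and return (status, body)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return seen["status"], body


site = cascade([docs_app, hg_app])
docs_answer = call(site, "/projectname/docs/index.html")
repo_answer = call(site, "/projectname")
```

The docs URL is answered by the first application and everything else falls through to the second, which matches the ordering in the list passed to the cascade.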

Here is a sample config file for MercurialApp:

[server:main]
use = egg:Paste#http
port = 5000

[app:main]
use = egg:MercurialApp

[hg:main]
# this is a public repo served at /
# everybody can read. Only gawel can push
hgwebdir = %(here)s/public_repositories
allow_read = *
allow_push = gawel

[hg:private]
# this is a private repo served at /private
# Only gawel can read and push
hgwebdir = %(here)s/public_repositories
allow_read = gawel
allow_push = gawel

The work is still in progress, but I like this project because I can browse my code as usual and the docs are always up to date without any wiki and/or WYSIWYG editor.

By the way, Sphinx and Mercurial are two great Python projects. A bunch of non-Python projects use them to write documentation or store source code. But what I love is that with Python you can combine two great applications into one. Python gives me the power!