Most recent call

Introduction to Selenium for Python programmers

2011-01-19T05:10:00.000-08:00

Selenium is an application that automates web browsers, helping you test your web application from a user perspective, in an automated manner. These properties make Selenium tests a perfect fit for validating your js-level functionality and implementing acceptance tests.

Of course, it has some drawbacks: you need to run your application from another process, which gives you some pain with checking the backend state of things. The tests might be quite slow, and - if you don't write them well - extremely fragile.

However, starting with basic Selenium tests is very simple, which I'm going to prove below. We will create a trivial website, with a single element only: a link to Google. Next, we will implement a Selenium test that makes sure that clicking the link indeed takes us to Google.

Prepare the environment

We will work in a virtualenv called "seltest". If you don't know what virtualenv is, you likely want to read this first. Enter the directory of your choice and run the following commands:

mkdir seltest
cd seltest
virtualenv --no-site-packages .

We will work from within the seltest directory, so that we don't need to activate the virtualenv, and instead call our binaries by "bin/python" or "bin/pip".

Let's download the Selenium executable and its python bindings:

wget http://selenium.googlecode.com/files/selenium-server-standalone-2.0a4.jar
bin/pip install selenium

You can already play with Selenium. First, start another terminal window and run:

java -jar selenium-server-standalone-2.0a4.jar

Then from our main terminal:

$ bin/python
>>> from selenium.remote import connect
>>> from selenium import FIREFOX
>>> browser = connect(FIREFOX) # this will run the browser
>>> browser.get("http://www.yahoo.com") # you should see the browser navigating to yahoo
>>> browser.close() # this will close the session

Prepare and run the website

The website will consist of a single link, so we can skip all the obligatory html boilerplate at this stage. Save the text

<a href="http://google.com">Go to Google</a>

into a file index.html, and from another terminal (yes, you will need three terminal windows) run:

python -m SimpleHTTPServer

This will start serving your page on port 8000. You can visit the page from your web browser at http://localhost:8000/

Implement the test

Open the file selenium_test.py in your editor of choice and dump the following

import unittest
from selenium.remote import connect
from selenium import FIREFOX

class SelTest(unittest.TestCase):
    def setUp(self):
        self.browser = connect(FIREFOX)
    def tearDown(self):
        self.browser.close()
    def test_simple(self):
        self.browser.get("http://localhost:8000/")
        link = self.browser.find_element_by_partial_link_text("Google")
        link.click()
        self.assertEqual(self.browser.get_title(), "Google")

if __name__ == "__main__":
    unittest.main()

The setUp and tearDown methods manage the browser session, and the actual test lives in the test_simple method. We are using four methods from the browser object: get, find_element_by_partial_link_text, click and get_title. In case you wonder where these come from, look for the WebDriver class definition. You can find it in lib/python2.6/site-packages/selenium/remote/webdriver.py in your environment.

Run the test

Now, you are ready to run your test.


bin/python selenium_test.py

You should see something along the lines of:

$ bin/python selenium_test.py 
.
----------------------------------------------------------------------
Ran 1 test in 5.521s

OK

This indicates that all your tests passed.

Optimizing fabfiles

2011-01-15T11:18:00.000-08:00

I really like my deploys to be as fast as possible. Unfortunately, the RTT between me and my server makes this quite hard. Today, I came up with a simple optimisation that makes your fabric commands faster (saving on RTT). Say you have a series of consecutive "run" calls. Each call needs to get sent and evaluated, and the results need to come back. Why wait for them, when we don't want to continue after a failure anyway? The simple fix is to change this:

def my_task():
    run("command_1")
    run("command 2")
    run("command 3")

... into this:

def my_task():
    commands = []
    _run = commands.append
    _run("command_1")
    _run("command 2")
    _run("command 3")
    run(" && ".join(commands))

This way, all your commands are sent in a single round trip, and the execution still stops on the first failure.
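The same trick can be wrapped in a small helper. Below is a minimal sketch: CommandBatch and its method names are made up for illustration and are not part of Fabric, and the runner argument merely stands in for Fabric's run.

```python
class CommandBatch(object):
    """Collect shell commands and send them in a single call."""

    def __init__(self, runner):
        self.runner = runner  # e.g. Fabric's run; any callable taking a string
        self.commands = []

    def add(self, command):
        self.commands.append(command)

    def flush(self):
        # "&&" makes the shell stop at the first failing command
        joined = " && ".join(self.commands)
        self.commands = []
        return self.runner(joined)


sent = []
batch = CommandBatch(sent.append)  # a plain list stands in for the remote host
batch.add("cd /srv/app")
batch.add("git pull")
batch.flush()
print(sent)
```

Only one string reaches the runner, so only one round trip is paid no matter how many commands were queued.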

Customising Django's uniqueness validation message

2011-01-09T16:47:00.000-08:00

In case you've been wondering: you need to override the unique_error_message method on your model. The unique_check argument is a tuple containing field names that are supposed to be unique together (for regular uniqueness this is a one-element tuple). See the example below for validating the slug field:

class MyModel(models.Model):
    slug = models.SlugField(max_length=200, unique=True)
    def unique_error_message(self, model_class, unique_check):
        if unique_check == ("slug",):
            return u"This slug is already taken"
        else:
            return super(MyModel, self).unique_error_message(model_class, unique_check)

Issues with Django and MySQL on Mac OS X

2011-01-03T13:26:00.000-08:00

Today, I spent more time than planned setting up a Django installation with a MySQL database backend. Below are the problems I encountered and how I dealt with them.

mysql_config

I'm using what's supposed to be the simplest MySQL installation out there - the official dmg from http://dev.mysql.com/. The installer puts all the files into

/usr/local/mysql

(with the regular bin, lib and include directories inside). The installation also includes the usual mysql_config executable, which takes care of pointing out all the paths required by the MySQL-python package. The problem is that, unless /usr/local/mysql/bin is on the system path, the MySQL-python installer isn't able to find it. I had to tell the package where mysql_config is. Luckily, there's a setting for that. I downloaded the package:

bin/pip install MySQL-python --no-install

(it raised an error but downloaded the files just fine). Then, in a file called site.cfg, I switched the following line

#mysql_config = /usr/local/bin/mysql_config

into

mysql_config = /usr/local/mysql/bin/mysql_config

- uncommenting it and putting the right path in place. I was free to continue with the installation:

bin/pip install build/MySQL-python

The library path

Unfortunately, this didn't solve all the problems, as I kept getting the following error when trying to run any code that used the package:

Error loading MySQLdb module: dlopen(/path/to/site-packages/_mysql.so, 2): Library not loaded: libmysqlclient.16.dylib

As it turns out, the Ruby guys have a similar problem, and it was in a Rails-related post that I found a solution. For some mysterious reason, the name of the mysqlclient library was saved without the full absolute path, as I could see by running the otool command:

$ otool -L lib/python2.6/site-packages/_mysql.so
lib/python2.6/site-packages/_mysql.so:
 libmysqlclient.16.dylib (compatibility version 16.0.0, current version 16.0.0)
 /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.1)

When I changed it to the full path with the install_name_tool command:

sudo install_name_tool -change libmysqlclient.16.dylib /usr/local/mysql/lib/libmysqlclient.16.dylib lib/python2.6/site-packages/_mysql.so

all worked fine. Hope that helps someone in similar trouble.

Mocking empty collections with FalseMock

2010-12-20T07:54:00.000-08:00

A friend of mine described to me a PITA he had with mock - it doesn't play well with a common Python idiom:

if collection:
    for element in collection:
        do_something(element)

What he expected as default behaviour was for the mock to be iterable as an empty collection. Instead, he got:

>>> m = mock.Mock()
>>> for x in m:
...     print x
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Mock' object is not iterable

As it turns out, there is a simple solution to this problem: all you need to do is to implement the __nonzero__ magic method yourself, so that the mock evaluates to False and the loop is never entered.

>>> class FalseMock(mock.Mock):
...     def __nonzero__(self):
...             return False

This behaves as you would expect: It evaluates to False.

>>> m = FalseMock()
>>> bool(m)
False

It still works as every other Mock object would:

>>> m.a.b.c.d.return_value = "hi"
>>> m.a.b.c.d()
'hi'

It keeps its behaviour in all Mocks generated as a result of getting an attribute

>>> m.a.b.c.d
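Here is the whole thing put together with the idiom from the top of the post. This is only a sketch: it assumes Python 3, where the magic method is spelled __bool__ and mock ships as unittest.mock (with a fallback import for the standalone package), and collect is a hypothetical function under test.

```python
try:
    from unittest import mock  # Python 3
except ImportError:
    import mock  # the standalone package on Python 2


class FalseMock(mock.Mock):
    def __bool__(self):  # Python 3 name of the truth-value hook
        return False
    __nonzero__ = __bool__  # Python 2 name


def collect(collection):
    """Hypothetical code under test, using the idiom from the post."""
    result = []
    if collection:
        for element in collection:
            result.append(element)
    return result


m = FalseMock()
print(collect(m))      # the for loop is never reached
print(bool(m.a.b.c))   # attribute access hands out FalseMocks as well
```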

Distutils2 Summer of Code

2010-08-16T11:02:00.000-07:00

This summer, I took part in the Google Summer of Code project as a student. I worked on the distutils2 project, implementing new commands and improving the existing ones. In particular, my tasks were:

to implement a test command (similar to the one from setuptools/distribute)
implement command hooks
port the upload_docs command
enhance the check command

I started with a low hanging fruit: porting upload_docs. This didn't take very long. In fact, I spent most of my time on the test code, implementing a mock PyPI server, which - I'm proud to say - turned out quite useful for other students' tests (although my initial implementation used wsgiref, which is not available in Python 2.4, and Alexis refactored practically all of it, so it's rather his child).

After that, I started working on the test command. The new command's API is a little bit different than the one you might know from setuptools/distribute: the options "test-suite" and "test-loader" got replaced with "suite" and "runner". You would use the "suite" option in place of "test-suite", but setting "runner" to a dotted path to a Python callable will cause the test command to invoke that callable in place of the default unittest test runner. In the absence of both options, the test command will invoke test discovery as implemented in unittest (Python 2.7, 3.2 and newer) or unittest2 (if installed, for older Python versions).

The next task was to implement command hooks. The command pre- and post-hooks are Python callables that accept the command instance (giving them access to all its options). You specify them in your setup.cfg file as the command's options. In addition to that, you have to specify a postfix for your hook, so that it doesn't override hooks from other files:

[install]
  pre-hook.my_postfix = path.to.hook
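To illustrate the idea only: a hook is a callable taking the command instance, and the dotted path from setup.cfg has to be resolved to that callable before it can run. The sketch below shows one way such a lookup could work. All names in it (resolve_hook, run_hooks, my_hooks, FakeCommand) are hypothetical and not distutils2 API.

```python
import importlib
import sys
import types


def resolve_hook(dotted_path):
    """Turn 'package.module.callable' into the callable itself."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def run_hooks(command, dotted_paths):
    # every hook is called with the command instance
    for path in dotted_paths:
        resolve_hook(path)(command)


# --- demo setup: a throwaway module holding a single hook ---
def announce(command):
    command.messages.append("pre-hook ran for %s" % command.name)

hooks_module = types.ModuleType("my_hooks")
hooks_module.announce = announce
sys.modules["my_hooks"] = hooks_module


class FakeCommand(object):
    """Stand-in for a real command instance."""
    name = "install"

    def __init__(self):
        self.messages = []


command = FakeCommand()
run_hooks(command, ["my_hooks.announce"])
print(command.messages)
```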

The improvements to the check command were arguably the most neglected task in my project. I gathered the optional checks (originally only a check for validity of reStructuredText in the package's description) under a single option --all, and implemented an additional check: one that validates importability of the hooks.

Not all changes are upstream yet. The hooks and the upload_docs command are already in the central repo. The test command and check improvements are waiting for the merge. Tomorrow, I'm leaving for my summer kayak trip, but I'm looking forward to contributing more to distutils in the future.

Overall, I am very happy with the time I spent working on the project. Even in spite of some problems in my way, I enjoyed the experience of working with a team of extremely smart guys. The regular IRC meetings were instructive and enjoyable. I would also like to thank my mentor: Fred Drake, who offered all the help and feedback I needed.

Mock recipes

2010-06-12T12:57:00.000-07:00

I recently posted a simple introduction to the mock library. Today, I'd like to show you some of its indirect uses that I found helpful in my everyday testing needs.

Custom patcher

It is very easy to set up mock objects to respond to events. You just access attributes and set either them or their return value. When using the patch decorator, a mock instance is provided to your test method as the last argument, so you set it up just before exercising your production code.

@mock.patch("sys.stdin")
def test_stdin_reader(mock_stdin):
    mock_stdin.read.return_value = "hello"  # setting up mock_stdin
    run_code_under_test()    
    validate_results()

In this example the setting up part is really easy - just a single line. But in real-life projects, you might find yourself setting whole attribute hierarchies on mocks, repeatedly in many test methods.

That's when a custom patcher allows you to implement the DRY principle in an elegant manner.

class MyTest(unittest.TestCase):    
    @patch_db    
    def test_one(self):
        pass
    @patch_db    
    def test_two(self):
        pass

It's very easy to write a custom patcher.

def patch_db(func):
    @mock.patch("db.module")
    def wrapper(*args):
        mock_db = args[-1]  # mock.patch passes the mock as the last argument
        # ... perform complicated setup on mock_db
        return func(*args[:-1])  # call the test without the mock
    return wrapper
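For something you can actually run, here is the same pattern applied to a target that exists everywhere. It's only a sketch: patch_clock and seconds_now are hypothetical names, and the import falls back to the standalone mock package where unittest.mock isn't available.

```python
import functools
import time

try:
    from unittest import mock  # Python 3
except ImportError:
    import mock  # the standalone package on Python 2


def patch_clock(func):
    """Hypothetical custom patcher: freezes time.time for the wrapped function."""
    @mock.patch("time.time")
    @functools.wraps(func)
    def wrapper(*args):
        mock_time = args[-1]             # the mock arrives as the last argument
        mock_time.return_value = 1234.5  # the shared setup lives in one place
        return func(*args[:-1])          # call the original without the mock
    return wrapper


@patch_clock
def seconds_now():
    return time.time()


print(seconds_now())  # the frozen value while patched
```

Once the call returns, mock.patch puts the real time.time back.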

Cutting off

The cut_off patcher is useful when you need to mock out several objects, and don't care about the mock instances. Say you want to deactivate internet access in a module that uses both urllib and urllib2.

@cut_off("mymodule.urllib", "mymodule.urllib2")
def test_something():
    pass

The implementation of cut_off is very simple too.

def cut_off(*patches):
    def decorator(meth):
        def wrapper(*args):
            # drop the mocks appended by mock.patch (one per patched object)
            return meth(*args[:-len(patches)])
        for obj in patches:
            wrapper = mock.patch(obj)(wrapper)
        return wrapper
    return decorator

Analyzing PyPI packages

2010-06-07T02:46:00.000-07:00

PyPI, which stands for Python Package Index, is a global repository for Python
packages. Every time you need a Python tool or library, you can simply type
easy_install mypackage, and have it downloaded and
installed for you. It is also a great source when trying to investigate current
practices in the Python world.

Disclaimer

There are a couple of troubles when analyzing PyPI. First - it is a moving target.
Since I first ran the download script (which was 3 days ago), it grew by 20 new
packages. So, please bear in mind, this information won't be very exact. Still, it
provides a nice overview. Second - not all packages are hosted on PyPI. For some
(quite a lot, actually) cases, we only get a link to the actual download source.
This increases the chance of a host being down, which causes the download to fail. Third -
PyPI packages are terribly diverse. In order to analyze them in a timely manner, I
picked only the ones that could be downloaded as either tarballs or zips. This
reduced the sample by a quarter (from 10112 to 7625), which I believe is still
representative enough.

Setup.py usage

Most of the packages (96%) used setup.py. The rest either simply didn't use it
or used a non-standard directory layout (187 and 47 respectively). Among
setup.py users, setuptools was more than three times as popular as standard
distutils. 73 packages couldn't be identified as using either of these, and this
is mostly caused by custom setup function wrappers (see 4Suite for an example of
this).
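The analysis script itself isn't part of this post. A check of this kind could look roughly like the sketch below, which simply looks at the import lines of a setup.py. classify_setup is a hypothetical name, and the heuristic is an assumption, not the exact method behind the numbers above.

```python
import re

# matches "import setuptools", "from distutils.core import setup", etc.
IMPORT_RE = re.compile(
    r"^\s*(?:from|import)\s+(setuptools|distutils)\b", re.MULTILINE)


def classify_setup(source):
    """Return 'setuptools', 'distutils' or 'unknown' for setup.py source text."""
    found = set(IMPORT_RE.findall(source))
    if "setuptools" in found:
        return "setuptools"
    if "distutils" in found:
        return "distutils"
    return "unknown"  # e.g. a custom setup function wrapper


print(classify_setup("from setuptools import setup\nsetup(name='x')\n"))
```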

Test runners

I was curious how people run their tests, so I identified several ways it could
be done:

using a top-level shell script: 20
using a top-level python script: 326
using setuptools' test command: 961

Note: these stats don't include another popular way of running tests, used by
Django apps.

There were 1048 packages having a top-level directory containing the string "test",
among which the most popular variations were unsurprisingly "test" (477) and
"tests" (456).

5 things you can do with a Python list in one line

2010-06-01T01:56:00.000-07:00

This is directly inspired by an excellent post by Drew Olson: 5 things you can do with a Ruby array in one line. When reading it, I couldn't help but think of the Python versions (and how I like them more :>). So here it is:

Summing elements

puts my_array.inject(0){|sum,item| sum + item}

sum(my_list)

Double every item.

my_array.map{|item| item*2 }

[2 * x for x in my_list]

Finding all items that meet your criteria.

my_array.find_all{|item| item % 3 == 0 }

[x for x in my_list if x % 3 == 0]

Combine techniques.

my_array.find_all{|item| item % 3 == 0 }.inject(0){|sum,item| sum + item }

sum(x for x in my_list if x % 3 == 0)

Sorting.

my_array.sort
my_array.sort_by{|item| item*-1}

sorted(my_list)
sorted(my_list, reverse=True)
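If you'd like to try all the Python versions in one go, here is a small runnable sketch. The sample list and the variable names are made up for illustration.

```python
my_list = [3, 1, 6, 4]

total = sum(my_list)
doubled = [2 * x for x in my_list]
multiples_of_3 = [x for x in my_list if x % 3 == 0]
sum_of_multiples = sum(x for x in my_list if x % 3 == 0)
ascending = sorted(my_list)
descending = sorted(my_list, reverse=True)

print(total, doubled, multiples_of_3, sum_of_multiples, ascending, descending)
```

Note that sorted returns a new list and leaves my_list untouched, which matches what Ruby's sort does.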

Mockity mock mock - some love for the mock module

2010-05-28T03:24:00.000-07:00

It looks like Python mocking libraries are the new web frameworks - everyone wrote one. Let me show you my favourite mocking library so far - the mock module, written by Michael Foord. It's easy_installable, so you can get to play with it in a moment.

easy_install mock

What makes it different than other modules like that? Most mocking libraries follow the old record-replay pattern, which works roughly like this:

Teach a mock what to expect.
Stick it into your code and expect an explosion if something didn't go as planned.

The exercise-inspect pattern in the mock module reverses that approach:

Stick the Mock object into your code (Mock being the class provided by mock module).
Make sure that things happened the way you intended.
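In code, the two steps could look like the sketch below. greet is a hypothetical piece of production code, and the import falls back to the standalone package where unittest.mock isn't available.

```python
try:
    from unittest import mock  # Python 3
except ImportError:
    import mock  # the standalone package on Python 2


def greet(mailer, user):
    """Hypothetical production code: sends a welcome message."""
    mailer.send(user, "Welcome!")


mailer = mock.Mock()    # 1. stick the Mock into your code
greet(mailer, "alice")

# 2. make sure that things happened the way you intended
mailer.send.assert_called_once_with("alice", "Welcome!")
```

Nothing had to be recorded up front. The assertion at the end raises an AssertionError only if the call didn't happen as described.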

The tricky part is: how the hell is Mock able to fit everywhere? Well, it lets you do anything with it: get and set all attributes, and even call it when needed. And it leaves other Mocks everywhere it goes (it pretty much works like cancer). Let's play with Mock a little bit:


>>> from mock import Mock
>>> m = Mock()
>>> m.foo = 1
>>> m.foo # all attributes are recorded
1
>>> bar_attribute = m.bar # if no attribute was set, a mock instance is returned
>>> bar_attribute
<mock.Mock object at 0xdcff0>
>>> m.bar == bar_attribute # an attribute that was queried once stays there
True
>>> ret_val = m() # we can even treat mock as a function
>>> m() == ret_val == m.return_value # with similar caching behaviour
True
>>> m.bar() # this is true of all methods as well (they're Mocks too, right?)
<mock.Mock object at 0xe4070>
>>> m.bar.called # mock remembers everything that happened to it
True
>>> m.baz.return_value = "hello!" # if we really need some setup
>>> m.baz() # nothing prevents us from doing it
'hello!'
>>> m.who.was.this.demeter.anyway.return_value.and_its_attribute # so just remember
<mock.Mock object at 0xe42f0>
>>> _ == m.who.was.this.demeter.anyway().and_its_attribute # there are Mocks all the way down!
True

Pretty cool, huh? But wait, there's more! Why should we laboriously prepare, and then revert all the mocking ourselves? Let the library stick it where we need. We just need to import the patch decorator.

import sys
from mock import patch

def read_4_chars():
    return sys.stdin.read(4)

@patch("sys.stdin")
def test_something(mock_stdin):
    mock_stdin.read.return_value = "abcd"
    assert read_4_chars() == "abcd"
    assert mock_stdin.read.called
    assert mock_stdin.read.call_args == ((4,), {})

def test_something_2():
    """You can use it with *with* statement as well!"""
    with patch("sys.stdin") as mock_stdin:
        mock_stdin.read.return_value = "abcd"
        assert read_4_chars() == "abcd"
        assert mock_stdin.read.called
        assert mock_stdin.read.call_args == ((4,), {})

There is much more that mock module has to offer. I suggest reading mock's excellent documentation.

How to mount .bin disk image on Linux/Mac (without .cue file)

2009-09-10T03:13:00.000-07:00

First, you need binchunker - a utility to convert the bin/cue pair into an iso image. It is available on macports and should be easy to get on most linux distros.

Now, you need to prepare your *.cue file. Suppose you have a "blah.bin" file. Enter the following text into "blah.cue":

FILE "blah.bin" BINARY
TRACK 01 MODE1/2352
INDEX 01 00:00:00

Now, enter the following spell:

bchunk blah.bin blah.cue blah.iso

There, you have it.
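If you have more than one image to convert, the cue file can be generated instead of typed. A minimal sketch, assuming a single data track exactly as in the cue text above (cue_for and write_cue are hypothetical helper names):

```python
import os


def cue_for(bin_path):
    """Return the text of a minimal single-track cue sheet for bin_path."""
    lines = [
        'FILE "%s" BINARY' % os.path.basename(bin_path),
        "TRACK 01 MODE1/2352",
        "INDEX 01 00:00:00",
    ]
    return "\n".join(lines) + "\n"


def write_cue(bin_path):
    """Write blah.cue next to blah.bin and return the cue file's path."""
    cue_path = os.path.splitext(bin_path)[0] + ".cue"
    with open(cue_path, "w") as handle:
        handle.write(cue_for(bin_path))
    return cue_path


print(cue_for("blah.bin"))
```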

Thanks to these guys for help.

Reporting assertions as a way for test parametrization

2009-05-14T02:51:00.000-07:00

There's been a discussion recently on the Testing In Python mailing list on introducing test parametrization. Here's my approach to the problem.

The idea, instead of finding a way to generate test methods, is to promote assertions to report their results in a similar manner to test methods. In order to achieve that, I had to override a few methods in TestCase, but the resulting API looks quite clean.
You can grab the source from here. And here's an example. Run this:

import unittest
from reporting_assertion import TestCase, reporting

class MyTestCase(TestCase):
    @reporting
    def some_assertion(self, a, b):
        assert a == b
    def test_example(self):
        for x in (1, 2, 3):
            self.some_assertion(x, 2)

if __name__ == "__main__":
    unittest.main()

In order to get this report:

$ python example.py   
FF.
======================================================================
FAIL: test_example.some_assertion(1, 2) (__main__.MyTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/[path]/reporting_assertion/reporting_assertion.py", line 12, in wrapper
    ret = assertion(self, *args, **kwargs)
  File "example.py", line 8, in some_assertion
    assert a == b
AssertionError

======================================================================
FAIL: test_example.some_assertion(3, 2) (__main__.MyTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/[path]/reporting_assertion/reporting_assertion.py", line 12, in wrapper
    ret = assertion(self, *args, **kwargs)
  File "example.py", line 8, in some_assertion
    assert a == b
AssertionError

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=2)

This approach has a few drawbacks. You can't collect assertions when doing a collect-only run, it's tightly coupled with the TestCase implementation (it needs to know how to report results), and - for the same reason - it works only for assertions bound to a TestCase instance (the decorator has an optional _self= keyword argument for decorating unbound assertions, but the results are not reusable across TestCase instances). The last problem is __str__ evaluation: if the objects under test have side-effects in this method, you might get quite unexpected results.

It is, however, quite powerful: the data fed into the assertions is generated in the code itself (so the parameters don't need to be static). It's also very easy to use: just decorate your assertion in order to get the reports.

Quote

2009-02-09T16:03:00.000-08:00

I read git magic today and found a link to the original thread on LKML. One of the neat quotes in there is a line from Linus' post dated 8th of April 2005:

"git" is really trivial, written in four days. Most of that was not actually spent coding, but thinking about the data structures.

Generic constructor assertion

2009-02-02T08:09:00.000-08:00

It is quite common to write such initializers:

class Foo(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

Why not test it in a generic way?

import inspect

def assert_simple_constructor(klass):
    allargs = inspect.getargspec(klass.__init__)[0]
    args = allargs[1:] # skipping self
    instance = klass(*args)
    for argname in args:
        assert hasattr(instance, argname) # skipped the message
        assert getattr(instance, argname) == argname # to keep short
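On newer Pythons, inspect.getargspec is gone, so here is a sketch of the same idea built on inspect.signature (Python 3). Foo is the class from above, Broken is a hypothetical counter-example, and the trick of passing the argument names as values is kept as is.

```python
import inspect


def assert_simple_constructor(klass):
    # parameter names of __init__, skipping self
    names = list(inspect.signature(klass.__init__).parameters)[1:]
    instance = klass(*names)  # pass each argument's own name as its value
    for name in names:
        assert hasattr(instance, name), "missing attribute %r" % name
        assert getattr(instance, name) == name, "%r was not stored as is" % name


class Foo(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b


class Broken(object):
    def __init__(self, a, b):
        self.a = a
        self.b = None  # oops, b is dropped


assert_simple_constructor(Foo)  # passes silently
```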

Why "Why 'Why OO Sucks' Sucks"... exposes ignorance

2009-01-25T15:17:00.000-08:00

A year and a half ago, I wrote a blog post which questioned some statements from Joe Armstrong's 'Why OO Sucks' article. While I still can't agree with all the article, I can see what he *really* meant and what I misunderstood then.

So let's get back to the original article and its statements:

1. Data structure and functions should not be bound together.

My response to that went mainly along the lines of "but that's how people tend to think, isn't it". However, what I think now is... well why not? Even in functional languages functions that handle specific types of data know the details of the data, so in order to understand the function you need to know what the data looks like. In order to use that function you need to feed it with the specific type of data it needs. Not so separate anymore?

Binding data and functions in OO languages serves mainly two purposes: either to give some specific view or attribute of an instance (in which case they're nothing more than a convenient interface to pure function calls) or to change the instance's inner state (to which I'll get back later).

2. Everything has to be an object.

Again, I said it's comfortable and natural to consider data as objects, but what I didn't know by then was the difference between classes and data types. I didn't know yet about pattern matching and other ways of working with immutable data. All those techniques make working with it equally comfortable and natural (maybe in more mathematical spirit, but I doubt it should be a disadvantage for anyone).

A little wiser now, I'm quite convinced that this argument comes down again to objects having private state. If there was an OO language where all objects would have to be immutable, I think the need of turning every piece of data into an object wouldn't be so painful.

3. In an OOPL data type definitions are spread out all over the place.

OK, here I was silly. First, I wrote "OO languages" while *clearly* thinking "Java" (well, that was the single OO language I knew then). Second: I totally forgot about inheritance.

What I think now is that there's not much difference here between OO and functional languages. If you consider inheritance as a specific case of composition (which it is), you get an equivalent of nested data types, where functions handling one type use other functions to handle the other. Not much difference from functional languages.

4. Objects have private state.

Ah, so here we have it. After playing a bit with functional languages I learnt that state, while unavoidable, is in fact a nasty bastard. Joe Armstrong doesn't like the way the state problem was solved in OO languages. They go along the lines of "yeah... we have state, but don't mind it. As long as you give us correct requests you'll be fine", which isn't true because giving correct requests (calling correct methods) requires knowing the state in the first place.

The functional languages tend to use pure functions, which lets them access the state only when necessary (and FPL designers solve the state problem in several different interesting ways).

To be honest, I kind of agree with that now. I don't like mutable objects. You never know what to expect from them :)

Ah, and my arguments here were also quite silly. First: I felt already that state is not so good, but only in the context of multithreading. Second: I was a fan of information hiding and working with tightly encapsulated data. While I'm not saying that encapsulation is bad (on the contrary!), I think I understand it differently now.

Encapsulation is not putting something in a black box and forcing others not to see its inner parts. It's rather finding a "sweet spot" in control flow: the interface that would be convenient enough for anyone to use without a need to look deeper. In other words: I don't associate encapsulation with information hiding anymore.

Concluding this longish article: I still don't agree with the statements of Joe Armstrong's article - in particular: the quiet assumption that a good language is a functional language. But I have to admit that my former article was based mainly on ignorance and misunderstanding Joe Armstrong's arguments.

And I still don't like the conspiracy theory in the last paragraph there :)

ilen for Python

2009-01-19T15:30:00.000-08:00

Today I came across what turned out to be a not-so-common need. I wanted to get the length of a generator. Before you comment - yes, I know generators can yield infinite sequences, and that might be one of the reasons why such a function is nowhere to be found. However, if one knows what one's doing, it can be of some use. After all, calling the list() constructor on iterables is equally risky. Of course, the function is trivial, but it managed to give me that that-must-be-somewhere-already feeling. It wasn't :(

def ilen(sequence):
    result = 0
    for _ in sequence:
        result += 1
    return result
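The same thing also fits in one line with sum and a generator expression. A small sketch, with a made-up generator to count:

```python
def ilen(iterable):
    """Count the items of any iterable, consuming it in the process."""
    return sum(1 for _ in iterable)


odd_numbers = (x for x in range(10) if x % 2)
print(ilen(odd_numbers))  # 1, 3, 5, 7 and 9 are counted
print(ilen(odd_numbers))  # the generator is exhausted by now
```

Either way, counting uses up the generator, so grab the items first if you still need them.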

BTW. I think itertools is my favourite Python module, what's yours?

Howcome so many 8s?

2009-01-01T15:10:00.000-08:00

Below, you can see a plot of the results of the following code:

import sys
for x in range(200):
    print x, sys.getrefcount(x)

I read recently that small integers (up to 256) are cached by the CPython interpreter, so I took a look at how they are used. The first 80 integers have about 20 refs each, with much bigger numbers for the initial couple of integers. Beyond that, the refcount is usually less than 10. For some reason, there is a whole lot of references to the number 8.

EDIT: I didn't notice the 8's leap on a Debian machine. It's reproducible though on my mac.
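The caching itself is easy to observe with an identity check. The sketch below relies on a CPython implementation detail, so treat it as an experiment rather than something to build on. The variable names are made up, and the numbers are built with int() at runtime so that the compiler can't merge them.

```python
import sys

small_a, small_b = int("256"), int("256")
big_a, big_b = int("1000000"), int("1000000")

print(small_a is small_b)  # both names point at the one cached object
print(big_a is big_b)      # two separate objects with equal values

for x in (0, 1, 8, 199):
    print(x, sys.getrefcount(x))
```

Don't expect the refcounts to match the plot: they depend on the interpreter version and on whatever has been imported, and newer CPython versions may report one fixed, very large number for all cached integers.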

Rails tip - how to see clearly the templates rendered for a query?

2008-11-27T05:49:00.000-08:00

Not *that* inventive, but I somehow didn't think of it before:

tail -f log/development.log | grep Rendered

Getting svn:externals to work with git-svn and... Leopard

2008-10-08T00:04:00.000-07:00

There IS a simple solution to work conveniently in git-svn with a repository that has svn:externals set. You can clone all repositories to separate trees, and instead of watching svn:externals, make hard links to the trees you need (remember to ignore them in git). The problem is you can't have hard links to directories... unless you're using OS X Leopard (they implemented it in order to get Time Machine to work).

This ability isn't exposed to the command line (ln makes an explicit check to prevent you from doing that), but the code to have it working (scroll down to listing 4) is trivial.

Update: When working with hard-linked directories, remember to be careful when unlinking them: rm wouldn't unlink the directory unless provided with "-r" option, which deletes everything recursively, leaving other hard links to that directory empty.

Harness the power of wget

2008-09-23T01:36:00.000-07:00

I finally invested enough time in reading wget's man page to be able to download file hierarchies exposed through an http file server. The magic command is:
wget -r -np -nH --cut-dirs=2 http://example.com/some/nested/directory
The switches mean:

-r: turns on recursive download
-np: no-parent - only links to files below in the hierarchy are followed
-nH: no-host - doesn't dump the files into a directory named after the host name
--cut-dirs=2: in addition to skipping the host name, also skips the 2 topmost directories. In the example, instead of some/nested/directory the files are put straight into directory
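To make the effect of -nH and --cut-dirs concrete, here is a tiny model of how the local path comes out. It is an illustration of the switches as described above, not wget's actual code. local_path is a hypothetical function, and the URL is assumed to point at a file.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def local_path(url, cut_dirs=0, no_host=False):
    """Model where a recursive download would store the file behind url."""
    parsed = urlparse(url)
    parts = [part for part in parsed.path.split("/") if part]
    dirs, filename = parts[:-1], parts[-1]
    dirs = dirs[cut_dirs:]              # --cut-dirs drops the topmost directories
    if not no_host:
        dirs.insert(0, parsed.netloc)   # without -nH the host name comes first
    return "/".join(dirs + [filename])


url = "http://example.com/some/nested/directory/file.txt"
print(local_path(url))
print(local_path(url, cut_dirs=2, no_host=True))
```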

Getting fresh datastore while testing GAE apps

2008-05-21T10:42:00.000-07:00

If you want to find out how to test your GAE apps, check out these two URLs first:
http://groups.google.com/group/google-appengine/msg/9132b44026040498
http://farmdev.com/thoughts/45/testing-google-app-engine-sites/

I want to show you how to clear the datastore quickly (yep, only that :>):


from google.appengine.api import apiproxy_stub_map

def clear_datastore():
    datastore = apiproxy_stub_map.apiproxy.GetStub('datastore_v3')
    datastore.Clear()

Outputting pdfs with google app engine

2008-04-29T05:40:00.000-07:00

It's child's play, but not googleable yet. I used reportlab - a handy, pure-python pdf library. Here are the steps:

Checkout reportlab into your app directory

svn co http://www.reportlab.co.uk/svn/public/reportlab/trunk \
reportlab

Write some pdf-generation code, like

import wsgiref.handlers

from google.appengine.ext import webapp

from reportlab.pdfgen import canvas


class MainPage(webapp.RequestHandler):

    def get(self):
        self.response.headers['Content-Type'] = 'application/pdf'
        p = canvas.Canvas(self.response.out)
        p.drawString(100, 750, "Hey, it's easy!.")

        p.showPage()
        p.save()



def main():
    application = webapp.WSGIApplication([('/', MainPage)], debug=True)
    wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
    main()

The key is the canvas instantiation. The constructor takes a file-like object and writes into it. You might have noticed that the code is heavily inspired by this tutorial. The only difference is the creation of canvas.

That's it. See? I said it was easy. You can run the dev_appserver or upload it to Google in order to see the results.

At PyCon

2008-03-13T17:28:00.001-07:00

Yaaaay, after 9h flight, finally here.

Brand new blog engine without a single line of code

2008-02-16T11:27:00.000-08:00

Having investigated the new Google D&S forms feature, I decided to create a proof-of-concept blog application (OK - calling it an application is a *bit* of an overstatement). It has some basic features, like a form for entering entries and a feed. It doesn't have a site and is pretty much crippled. Still, quite good for 0 lines of code.

Here's the feed

And here's the new entry form. Feel free to post.

Implementing Arc's function-negation operator in Python

2008-02-03T05:00:00.000-08:00

Hey look! This guy implemented Arc's function-negation operator in Chicken. And he claims to be a Pythoneer, traitor ;-) !

I just investigated the possibilities of doing more or less the same thing in Python. Of course: you can't change the syntax, and you can't change default function's behaviour (which is not having "~" operator implemented). Or can't you?

Let's start with something more explicit for a moment. Nothing prevents us from creating a callable decorator that implements __invert__ something like this:

class invertible:

    def __init__(self, foo):
        self.foo = foo

    def __call__(self, *args, **kwargs):
        return self.foo(*args, **kwargs)

    def __invert__(self):
        def wrapper(*args, **kwargs):
            return not self.foo(*args, **kwargs)
        return wrapper

Implementing __init__ enables us to call invertible() with an argument, and implementing __call__ makes the instances of our class callable. These two work together in a way that an ordinary decorator works: It's a callable, that takes a callable and returns another callable - easy, huh?
Implementing __invert__ lets us use the "~" operator on the instances of our class, like this:

truth = invertible(lambda: True)
not_truth = ~truth
assert truth()
assert not not_truth()

This implementation, even though it's easy, is not finished yet. The inverted value is a regular function, which means, that we couldn't invert it again. The easy solution to that is decorating the wrapper:

class invertible:
    # ...
    def __invert__(self):
        @invertible
        def wrapper(*args, **kwargs):
            return not self.foo(*args, **kwargs)
        return wrapper

But this would cause the wrappers to accumulate on inverting. Instead, we can set a switch on each instance and flip it on every inversion:

class invertible:

    def __init__(self, foo):
        self.foo = foo
        self.inverted = False

    def __call__(self, *args, **kwargs):
        if self.inverted:
            return not self.foo(*args, **kwargs)
        else:
            return self.foo(*args, **kwargs)

    def __invert__(self):
        ret = invertible(self.foo)
        ret.inverted = (not self.inverted)
        return ret
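A quick run-through of that last version, used as a decorator. The class is repeated (slightly condensed) so that the snippet runs on its own, and is_even is a made-up example function.

```python
class invertible(object):

    def __init__(self, foo):
        self.foo = foo
        self.inverted = False

    def __call__(self, *args, **kwargs):
        result = self.foo(*args, **kwargs)
        return (not result) if self.inverted else result

    def __invert__(self):
        ret = invertible(self.foo)  # always wraps the original function
        ret.inverted = not self.inverted
        return ret


@invertible
def is_even(n):
    return n % 2 == 0


is_odd = ~is_even
even_again = ~~is_even

print(is_even(4), is_odd(4), even_again(4))
```

However many times you invert, there is exactly one wrapper around the original function.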

Now, with the decorator ready, you can do some *evil* magic with frames and namespaces in order to decorate everything in scope, but I'm leaving it as an exercise to readers ;P