Quicksnip - Pickle Problems in Python 3

Python 3 has been out for quite some time, and it’s still notoriously ignored by the majority of the community. Luckily, all major libraries that I make use of have already made the small leap and as behaviour has only been slightly changed I never ran into any Py3 specific issues. Until today, that is. After writing quite the collection of classes, and storing their initialized states along with several module-specific objects all in one container, it was ready to be pickled and transferred. My fingers being crossed and naive high hopes notwithstanding, after a few lines of log code rolled over the screen the following presented itself:

_pickle.PicklingError: Can't pickle <class 'dict_keys'>: attribute lookup
dict_keys on builtins failed

Having no clue what went wrong yet, I instinctively looked up the error - and was presented by the dreaded scenario as so well described by xkcd:


In my case, however, it was just a single dumped log snippet. Crap. The chances of these dumps being linked in a chat somewhere are way higher than them actually being traceable. And although this was the case, the IRC server was friendly enough to log the conversation. That pretty much saved my monday morning. Turned out the bug was actually not that complex, and it has to do with the Py2 to Py3 transition of merging the good old dict.iterkeys commando into dict.keys. Let’s reproduce:

>>> class Foo:
...     def __init__(self):
...         self.d = {'John': 1, 'Marie': 2}
...         self.k = self.d.keys()
>>> x = Foo()
>>> import pickle
>>> pickle.dump(x, open('foo.pickle', 'wb'))

Which again yields _pickle.PicklingError: Can’t pickle <class ‘dict_keys’>: attribute lookup dict_keys on builtins failed. It didn’t occur to me at first that the dict_keys type was hereby converted into a generator. It’s pretty much hidden away in:

In [1]: type({'John': 1, 'Marie': 2}.keys())
Out[1]: <class 'dict_keys'>

In retrospect, this makes a lot of sense as the iter method was intended to be a generator version for dealing with big dictionaries. There are very good reasons why one should not want to pickle a generator. Quoting Alexandre Vassalotti :

Since a generator is essentially a souped-up function, we would need to save its bytecode, which is not guarantee to be backward-compatible between Python’s versions, and its frame, which holds the state of the generator such as local variables, closures and the instruction pointer. And this latter is rather cumbersome to accomplish, since it basically requires to make the whole interpreter picklable. So, any support for pickling generators would require a large number of changes to CPython’s core.

Anyway, problem solved. To make this mistake better recorded I thought I might as well post it. Hope it helps!