From lists to generators
I was working on a section of my Checkers game code a few days ago. It looked like this:
class Checkerboard(object): def legal_moves(self, state): return state.captures or state.moves
The behavior above mirrors the standard Checkers rule that if you have a capture, you must take it — otherwise, you can make a regular move. In my code, both state.captures and state.moves are properties that return Python lists. Since an empty list evaluates to False in an or statement, if state.captures is empty, then the return statement automatically uses state.moves list instead.
Now I was trying to make the legal_moves method work a bit faster and use less memory, so I switched to generators instead. Oops … now the code above fails since state.captures and state.moves now return generator objects. The legal_moves method above doesn’t work because state.captures is always a valid object instance and it always evaluates to True in the or statement — even when there are no actual captures available.
It took some thought to come up with an equivalent behavior for generator objects. In the new version, legal_moves becomes a wrapper method that returns one generator object or the other based on whether the first generator object yields at least one item.
class Checkerboard(object): def legal_moves(self, state): return state.captures if state.captures.next() else state.moves
It’s not as elegant as the version using lists, but it works.
No commentsKomodo and Django and Ubuntu 9.04
Lately, I’ve found myself switching from Wing IDE to Komodo IDE for my Python work. One of the major reasons for this has been my switch to using a Mac at work. While Wing runs on the Mac, it’s really an X11 app and doesn’t have the full Mac’y feel that Komodo has. Another reason, is that while in the past, Komodo’s code completion was “less” it’s now gotten a lot better and while still not up to the excellent standard of Wing, it’s pretty good now.
So for work at home, I figured I’d fully make the switch as well. That’s when I found an issue. While I had managed to get my Mac code completing well for my django app’s, my 9.04 ubuntu box at home wasn’t running so well. Just trying to code complete the django models object didn’t work.
After trying everything I’d seen via googling, I found that adding the following directories to the additional directories import seemed to work:
/usr/local/lib/python2.6/dist-packages
/usr/local/lib/python2.6/site-packages
I’m sure I’ll end up adding more directories to make it all work right… but these fixed the basic django code completion issues. Just thought I’d post this in case I forgot what I did.
No commentsBuilding Python 2.6 executables for Windows
I finally made a build of my game, Raven Checkers, on Windows XP. In the process I ran into the same issues as everyone else with Python 2.6 and the new requirement for manifest files. Since Python 2.6 binaries for Windows are now compiled using Visual Studio 2008, a manifest file is required both for your own executable and for the C runtime that Python uses. I was not able to find a simple step-by-step process on how to do this, so here’s how I do it.
- Make sure that Python 2.6 is installed on your system with the “Install just for me” option during setup. This ensures that the DLLs and manifest files that you will need will be placed into the Python26 directory where you can easily locate them later.
- Build an executable of your program for Windows. I’m using gui2exe (0.3) and py2exe (0.6.9) to make it simple. I set the Bundle Files option to 2, and the Compression option to 0. Why, do you ask? I am unable to get any other combination of options to work properly. If you can, more power to you. I also use the option to create a dist directory called, um, ‘dist’.
- From the Python26 directory, copy msvcr90.dll, Microsoft.VC90.CRT.manifest, and python.exe.manifest over to your dist folder.
- Rename the python.exe.manifest file to [yourappname].exe.manifest.
Now you should be able to copy your dist folder over to a clean Windows machine, double-click on your executable and have it start without any side-by-side configuration issues.
By the way, If you don’t have a clean Windows machine to test on your app on, the easy and free solution is to download Microsoft Virtual PC 2007 and a free XP or Vista virtual machine image. Once VPC is installed, you can copy your files to your virtual machine by simply dragging them from your dev machine desktop onto the VPC desktop window.
Here’s to Windows deployment and testing made (relatively) simple! Hope this saves you some hours of Googling.
No commentsObject pooling in Python
This post discusses an object pool design pattern that is applicable in the following scenario:
- You have a list of objects whose state is continually updated.
- The objects are frequently added or deleted from the list, i.e. have limited lifetimes.
- You want to reuse the objects instead, to improve performance or simplify your code.
Introduction
When writing my game Asteroid Smash using Python, I used lists to keep track of on-screen objects like laser shots and asteroids. These objects all have limited lifetimes (for example, a laser shot that only goes a predetermined distance, or an asteroid that drifts across the screen until it gets destroyed). When it’s no longer needed, the object then gets deleted from its associated list. Here is a simplified example:
import random MAX_SHOTS = 5 class Shot(object): def __init__(self, lifetime): self.lifetime = lifetime def update(self): self.lifetime -= 1 return self.lifetime > 0 # create list of Shots with random lifetimes lst = [Shot(random.randint(1,100)) for x in range(MAX_SHOTS)] # update loop while lst: delete_lst = [] for i, x in enumerate(lst): if x.update(): # decrement object lifetime delete_lst.append(i) # delete any objects whose time count is 0 for i in reversed(delete_lst): del lst[i] print "Done!"
To summarize what’s going on, a list of 5 Shot objects gets created, each with random lifetime values, and added to a list. After that, a loop is started that calls an update method on each Shot that decrements its lifetime variable by 1. (This roughly simulates what would happen during each pass of a typical game loop.) Once the object’s lifetime value reaches zero, then the object is added to a delete list. Any objects in the delete list get destroyed after each pass of the update loop.
This code works, but it’s clumsy. Note all the bookkeeping that’s necessary. I can’t delete items while I’m iterating through the list, or I’ll invalidate the iterator. This means I have to keep track of the index of which items need to be deleted, and then actually delete them in a second loop. Also, I have to delete the items in reverse order, or I’ll throw off the indexes and delete the wrong items.
To improve the code, I decided to try list comprehensions to shorten the amount of code in the update loop.
# new update loop with list comp while lst: lst = [x for x in lst if x.update()]
This is much more appealing, single-pass approach. However, the list comprehension, while tidy looking, has a big disadvantage: it constantly creates and destroys a list during each iteration of the loop. This makes it a big problem if we pass around a reference to the list within our program, as the reference is immediately invalidated when the list is recreated. If we need to keep the reference to our list constant, then we have to try another approach.
Rethinking the problem
Perhaps if we want to make a better design, then we need to recast the problem slightly. Notice that once all the objects have a lifetime count of 0, then they can all be deleted at one time, which is a straightforward problem. This means filtering the list by deleting the objects turns out to be unnecessary; what we want is a way to separate objects that still need to be updated in the list from those that do not. Then, once all of the objects fail the update call, the entire list can be deleted.
A little-known algorithm in the C++ Standard Template Library (STL) called partition does the separation of elements mentioned above. Given a sequence, partition will reorder items in the sequence so that all items that satisfy a certain condition (like an update call!) precede those that fail the condition. One other great thing about partition is that it works in linear time; assuming a list of N items, then each use of partition will apply the condition check N times and swap items in the list at most N/2 times. [1]
Here’s what partition looks like in Python (translated from SGI’s version of the STL algorithm [2]; comments added by me based on the SGI docs.)
def partition(pred, seq, first, last): """ Partition reorders the elements in seq in the range [first, last) based on the function pred, such that the elements that satisfy pred precede the elements that fail to satisfy it. The postcondition is that, for some index value middle in the range [first, last), pred(x) is true for every item x in the range [first, middle) and false for every item x in the range [middle, last). The return value of partition is middle. """ # quick error check on the list indices if first > last: return 0 # find first element that fails to satisfy the predicate for i in range(first, last): if not pred(seq[i]): break else: return last # for all remaining elements in the sequence, if we find one # that satisfies the predicate, swap it with one that is known # to have (possibly) failed the test. All swaps are necessary # to preserve the correct ordering upon exit. for j in range(i+1, last): if pred(seq[j]): seq[i], seq[j] = seq[j], seq[i] i += 1 return i
We also need a predicate function to use with partition, since the update method is actually part of our Shot class and can’t be called directly. We could use a lambda function instead, but that is somewhat slower than a named function.
def update_object(x): return x.update()
Using our new partition and update_object functions, our update loop changes to:
end = len(lst) while end: end = partition(update_object, lst, 0, end)
This is one line longer than our list comprehension version, but we also don’t invalidate our reference to the list on each pass since partition performs in-place modification to our list. This makes it better on both counts.
One other item of interest is the variable end. After each pass of the loop, the end index marks the separation between “live” and “dead” objects in the list. As we noted earlier, once end reaches 0, all the items in the list can then be deleted. Look closely though … we were only deleting objects from the list before in order to filter those that didn’t need updating, but partition already does the filtering for us. Since all the objects in the list are, in fact, still good, to reuse one of them we just need to reset the object pointed to by the end index and then increment end by one. If we try to reuse one of the objects in the list and end goes beyond the last item in the list, then we can either expand the list or just ignore the request, depending on what behavior we want.
Managing details
Here’s the interesting thing: by switching our code to use partition, what we’ve created is a simple object pool. At this point though, there are too many bookkeeping details, and it would be better to encapsulate the pool inside a class.
class ObjectPool: def __init__(self, **kwargs): self._clsname = kwargs['classname'] self._args = kwargs.get('args') or [] self._num_objects = max(kwargs['num'], 0) self._pred = kwargs['update_func'] self._max_objects = kwargs.get('max', self._num_objects) # Create the objects self._objs = [apply(self._clsname, self._args) for x in range(self._num_objects)] self._end = len(self._objs) def _extend_list(self, args): """ default policy is to add one object, up to max_objects ... override with different behavior if needed. """ self._objs.append(apply(self._clsname, args)) self._num_objects += 1 def add(self, *args): """ Add one new (live) object to the pool. """ newend = self._end + 1 # if we reach our predefined maximum number of objects, # don't create any more. if newend > self._max_objects: return None # if we bump up against the end of our list but are below the # max number of objects, extend the list if newend > len(self._objs): self._extend_list(args) else: self._objs[self._end].reset(*args) self._end += 1 return self._end - 1 def update(self, *args): """ Update each object in the pool, using the pred function. """ self._end = partition(self._pred, self._objs, 0, self._end, args) return self._end
That’s perhaps a lot to digest. Let’s look at a short code sample of how to use the new class that will make it easier to understand.
shots = ObjectPool(classname=Shot, update_func=update_object, num=5) while shots.update(): pass print "Done!"
You can see here that the ObjectPool class takes keyword parameters. The first parameter, classname, specifies the name of each object’s class in the pool. The second parameter, update_func, specifies the predicate function used to update the objects. The third parameter, num, specifies the initial number of objects created. (There is also an optional fourth parameter, max, that specifies the maximum number of objects within the pool; if max isn’t explicitly specified, then it will default to num.)
Besides the update method we’ve already discussed, the add method will attempt to either insert a new object in the pool or reset an existing object whose lifetime has expired. (The add method assumes that each object in the pool has implemented a reset method that reinitializes the object.) If neither option is possible (the pool has reached its max or no objects have expired), then add will simply return None to indicate a failure. (You can override the _extend_list method if you wish to change this default behavior.)
A short demo of object pooling
To demonstrate the utility of this code, I’ve modified Alex Holkner’s Noisy demo in the pyglet 1.1.2 distribution to use my ObjectPool class. His example program displays balls bouncing around the edges of a windowed display. Normally the balls bounce indefinitely until the user presses a Delete key, but I changed the demo to have the balls exist for only a short time before they are deleted from the display. The sample also demonstrates how the ObjectPool prevents you from adding more than the max number of objects defined in its constructor. This demo, along with the provided unit tests, should give you a good starting point for using ObjectPool in your own games or applications.
References:
[1] http://www.sgi.com/tech/stl/partition.html
[2] http://www.sgi.com/tech/stl/stl_algo.h
A New Author for the Blog and Site
I’ve never really regretted keeping two separate blogs. Three was certainly too much, but having two isn’t all bad. The one thing I’ve regretted is that I’ve not done enough posts. So, to help fix that situation and also to provide a slightly more permanent home to the excellent posts that he writes, I’m pleased to announce that Brandon Corfman is now going to be co-author, if not primary author, here on this site. Brandon brings both a more academically grounded education background together with even broader practical experience programming.
Brandon is single-handed responsible for my initial interest in Python as well being the greatest of friends. So now it’s official, this site, now has gained a whole lot more credibility. At the same time, perhaps Brandon’s input will spur me on to write a few more things here as well. Enjoy!
No commentsSome Long MySQL Connection Delays
This issue has been documented before, but it took me awhile to understand what the real problem was and what my options for resolving it were.
So what happened? I had a VM (Ubuntu Hardy) chillin’ on my local dev box. That local VM was a test environment to verify the performance of a django app being hosted on a VM with Apache and mod_wsgi. And wouldn’t you know it? It was slow… very slow. As in 5 seconds per browse to display or anytime it had to hit the database. Dropping to the command-line revealed that it was the actual connecting to the mysql server (which is a separate server) that was taking most of all that time.
As it turns out, MySQL’s name lookup (reverse address resolution) was causing the slowdown. You can turn off MySQL’s reverse lookup using the “–skip-name-resolve” option. But since changing the configuration of a running MySQL server (production) is not an option, I had to figure out something else. The VM was getting a DHCP address and since our in-house (Windows-based) DNS doesn’t do well at tracking the names of DHCP’d clients that are Linux boxes, our DNS stinks at getting the name in any timely fashion. But, if you assign the client a static IP that has an official name in our DNS, voila! You have a nice, fast connection time again. A Windows box in my office, that gets assigned an IP via DHCP is fine as well, presumably because our DNS server plays well with other Windows boxes.
As I stated earlier, you can disable the name lookup but then you have to use IP addresses if you want to support security based on the connecting client’s machine. Making sure there are DNS entries for the clients seems to fix it. Again, this is mostly just for my own record if nothing else. I certainly don’t want to see this crop up again and not remember what I figured out.
No commentsHosting a Django Site With the CherryPy WSGI Server
So after hearing about CherryPy’s WSGI server while at PyCon (I went to the Pylons/TG2 classes), I decided, like others have, to see if I could host a Django site with it here where I work. There are several references out there about using that server code with Django, but I had a tough time getting to the point where I could actually use it to handle the Admin media. Some of the articles were slightly out of date with changes made to the most recent version of the CherryPy WSGI server, so while this is drawing off of several other folk’s work (like Eric Florenzano and DjangoCerise ), I present what I did to make it work while also realizing that this is probably still very incomplete. I am just hoping this will help someone get started.
Prior to the code, I downloaded just the wsgiserver.py from CherryPy. You can get the latest from trunk if you’re running Linux, with just a command like this:
wget http://svn.cherrypy.org/trunk/cherrypy/wsgiserver/__init__.py -O wsgiserver.py
Then I also wanted to have some logging capabilities, so I discovered Paste’s TransLogger middleware. All you have to do is download the translogger.py file from the paste project (sorry no quicky command for that, I grabbed it via the browser and I am too lazy to get you the command for that).
So with those 2 files (wsgiserver.py and translogger.py) now in the main directory of my particular django project (dash2 is the name in my example), I created a new cherryserve.py in that same directory as well.
import wsgiserver #This can be from cherrypy import wsgiserver if you're not running it standalone. import sys import os import django.core.handlers.wsgi from django.core.servers.basehttp import AdminMediaHandler from translogger import TransLogger if __name__ == "__main__": sys.path.append('/home/swilcox/projects') os.environ['DJANGO_SETTINGS_MODULE'] = 'dash2.settings' app = AdminMediaHandler(django.core.handlers.wsgi.WSGIHandler()) logged_app = TransLogger(app) server = wsgiserver.CherryPyWSGIServer( ('127.0.0.1', 8080), logged_app, server_name='luz.lifeway.org', numthreads = 20, ) try: server.start() except KeyboardInterrupt: server.stop()
So why did I do the path trickery up there? Well, because in my development environment, I don’t want my entire projects directory normally on my python path. Feel free to call me paranoid. But to get the server to work correctly with the django code, I needed it there. For a production server, the sys.path.append business wouldn’t normally need to be there, because I would have my apps or projects directory as part of the Python path.
Also note that if you needed to serve multiple apps, with this newer version of the wsgiserver.py, you would want to set them all up in a WSGIPathInfoDispatcher object first and then pass that in to the CherryPyWSGIServer instead of logged_app.
I think the other fields are pretty self-explanatory… ports, IP address, threads, etc…
The Admin Media trick is using app = AdminMediaHandler(django.core.handlers.wsgi.WSGIHandler()) instead of just the WSGIHandler by itself.
If you don’t want the logger, you just pass in app instead of logged_app.
MySQLdb and executemany error “incomplete format”
Okay, very weird problem today. I am not sure what to make of it… when using the latest version of MySQLdb (1.2.2.final.0) which is the version in Ubuntu 7.10 Gutsy, I am getting an “incomplete format” error when performing an executemany. My variables are presented as a list of dictionaries as I’m using named/mapped variables in the SQL statement. Using the exact same code with MySQLdb (1.2.1.final.2) is fine (the version in Ubuntu Feisty 7.04).
So for completeness sake, I have documented the scenario like this:
On a MySQL server, you just need a simple little table:
CREATE TABLE `test_table` ( `id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT, `test_col_1` varchar(20) NOT NULL, `test_col_2` varchar(20) NOT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=latin1
Then the Python code (obviously you’ll need to set the connection stuff appropriately):
import MySQLdb print "-----------------------------------------------------" print MySQLdb.version_info print "-----------------------------------------------------" sqlStringItems = """INSERT INTO test_table (test_col_1,test_col_2) VALUES (%(test1)s, %(test2)s);""" dtlDict = {'test1':'test value 1','test2':'test value 2'} con = MySQLdb.connect(host="localhost",user="myuser",passwd="mypassword",db="test") cur = con.cursor() exitval = cur.executemany(sqlStringItems,[dtlDict]) print "exit value from executemany= " + str(exitval) cur.close()
version (1, 2, 2, ‘final’, 0) yields:
ValueError: incomplete format
Whereas version (1, 2, 1, ‘final’, 2) works fine.
Also, doing the same thing, but with execute instead of executemany works fine (and of course taking the dict out of the list). So this seems to be purely an issue with executemany. Looking at the MySQLdb code, it’s obvious that between the two versions it was pretty radically changed.
I guess I need to report this as a bug, but I’m just finding it hard to believe that I didn’t do something wrong.
Update (2008-06-03)
Back in January, I reported this issue (1874176). Then in late April, Raphael Guillet submitted a possible fix. I haven’t spent much time looking at the right vs. wrong of the fix, but I have tried it and it appears to work. So for those looking, here’s the possible fix which as near as I can tell has still not been cut into the official version.
The change is in the cursors.py file. So you have to hunt that down wherever it is stored in your particular OS version and/or install.
Comment out lines 201 and 202:
#e = m.end(1) #qv = m.group(1)
And then add right after that:
e=len(query) qv = query[p:e]
I’ll try to continue to monitor this.
6 commentsProjects, PyCon, and Life
So just as a bookmark to myself, I wanted to document what all I was working on from a technical perspective.
On the work front, I’ve recently deployed a Django app that folks are just totally loving. It’s actually a hacked together group of scripts to import data and export data to two different legacy systems. In the middle, I use the Django admin stuff to allow folks to maintain the data, and scripts at the beginning and end of the day handle import and export duties.
I finally started hacking around on WordPress themes. I’ve not gotten very far but hopefully in the next week or so, I’ll have new designs for both sites.
Then, with regard to PyCon, I submitted some talk proposals. Not sure I should have bothered, but it seemed like the thing to do. Now I’m just curious to see if any of my proposals are accepted. One of the proposals is for a project I’ve only begun really gathering the pieces for. I’ve resolved that whether the talk is approved or not, I’m going to work away at the project and then I’ll either have a talk or I’ll have a lightning talk to present. I’ll be blogging my way through my progress here, so should have some interesting stuff to post.
No commentsMore Blog Housekeeping
I’ve finally gotten tired of my poor little blog getting attacked by spammers. Actually, it was just one in particular and I know I could’ve done some IP blocking or added to the protection scheme on my comments, but rather than working more on Pumpkinvine, I’ve now switched to WordPress. And while it seems like I’m giving up, it’s really so I have the chance to catch my breath while I use all the WordPress plugins to handle the spam for me. But for reference, I ought to package up my most recent version of Pumpkinvine and make it available for folks although it’s really a pretty rough collection of PHP scripts that approximate the same style as blosxom. So why bother creating another system? But it seemed like a good idea at the time. Probably the neatest part of Pumpkinvine was going to be the switchable back-ends so you could choose between text files, SQLite, MySQL or something else by just changing a single setting. Of course, the only back-end I had done was the text files-based core.
Ultimately, I was tired of messing with PHP. I’d like to go with more Python. Django appears to the fastest way to get to having a working blog, but lack of personal free time, and lack of desire to fight with FastCGI and Dreamhost to get up and running, kind of dampened my spirits. So I’ll settle for my clever Python script that I wrote to convert from Pumpkinvine to WordPress and some time that I’ll need to devote to learning how to create WordPress themes.
No comments