November 2008 Archives
Mon Nov 24 13:31:33 EST 2008
Ignoring files/directories in subversion
If you have a file or directory that you want Subversion to ignore (for example, the build directory created by Python’s setuptools), you can tell Subversion like this:
svn propedit svn:ignore parent_dir/
This opens an editor where you list the files or directories you want subversion to ignore. For example:
dists/*
build/*
will ignore the directories dists/ and build/ and their contents. (The wildcard is required to ignore their contents).
Mon Nov 24 11:45:48 EST 2008
Subversion keywords
Here is how to get Subversion to recognize and insert metadata into a source file. (This is something I should have learned long ago.)
First you have to enable the keywords you want subversion to handle:
svn propset svn:keywords "Revision Author Date" <file>
Here we have set Revision, Author and Date as the keywords that Subversion is to handle for us auto-magically.
Then in the source code something like this:
$Revision$
$Date$
becomes something like this:
$Revision: 123 $
$Date: 2002-07-22 21:42:37 -0700 (Mon, 22 Jul 2002) $
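Once a keyword has been expanded, a program can read the value back out of the string. The following is a minimal sketch in Python 3, assuming the keyword sits in a module-level string; the names `REVISION_KEYWORD` and `keyword_value` are made up for illustration and are not part of Subversion.

```python
import re

# The keyword string as Subversion leaves it after expansion.
# Hypothetical value, copied from the example above.
REVISION_KEYWORD = "$Revision: 123 $"


def keyword_value(keyword_string):
    """Return the text between 'Name:' and the closing '$'.

    Returns None if the keyword has not been expanded (e.g. '$Revision$').
    """
    match = re.match(r"^\$\w+:\s*(.*?)\s*\$$", keyword_string)
    return match.group(1) if match else None


revision = keyword_value(REVISION_KEYWORD)
```

An unexpanded keyword returns `None`, which is a cheap way to notice that the svn:keywords property was never set on the file.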
Mon Nov 24 08:55:10 EST 2008
Idiomatic Python
Yesterday I found Code Like a Pythonista by David Goodger. It is, imho, a fantastic set of guidelines to take one from a novice python wannabe to a moderately decent python programmer. I love this kind of guidance. Lots of things to keep me out of trouble.
Wed Nov 19 07:56:53 EST 2008
RTK buoy float test!
Yesterday we float-tested our new RTK buoy in our test tank!

Tom and Jim inserting the system in the tank.

Her maiden voyage. This simple buoy has an OEM RTK GPS, EDL radio modem, Aanderaa conductivity and temperature sensor and an [Ocean Server attitude and compass](http://www.ocean-server.com/compass.html). In addition there is a [Persistor micro-processor](http://www.persistor.com/) for power control and system monitoring and a [4-port B&B Electronics serial-WiFi Server](http://www.bb-elec.com/product_family.asp?FamilyId=145&TrailType=Sub&Trail=26) to send serial data from each component to shore. Oh and I almost forgot, there’s a [WHOI micro-modem](http://acomms.whoi.edu/umodem/) as well.
We've got lots of plans for these systems, including very precise acoustic positioning of underwater gear (AUVs, cameras, etc.), measurement of tides, measurement of waves and we don’t know what else.
Wed Nov 19 07:40:41 EST 2008
MacOS 10.5 Battery Calibration
I did not know that Apple recommends re-calibrating the battery every couple of months!
Wed Nov 19 07:34:32 EST 2008
Macbook Pro Fixed!
I am thrilled to report – two of the three biggest oddities with my new MBP have been Fixed! The first was compatibility with a Dell external display. The second was track-pad behavior in which the track-pad would not respond or would not execute a click to change context between windows and applications.
The remaining buzz-kill is the fact that the system does not properly sleep. When you shut the top to go home, assuming your system will happily go to sleep until tomorrow, you come back to find your system dead and needing a boot from scratch. This even happens sometimes when you sleep manually. Very frustrating and a real pain, as I have become accustomed to my Macs doing the right thing. Hopefully this will be fixed soon too.
Tue Nov 18 12:27:05 EST 2008
Python and BeautifulSoup
I have always wondered how changes in politics and news affect the “most emailed” list of articles posted by the NY Times. So I've written a short script to parse and log that list, which I can run from a cron job at regular intervals. This was a fun exercise and helped me learn some cool things about Python, including a very handy HTML/XML parsing engine: BeautifulSoup. It was not part of my standard install (EPD) so I installed it with:
sudo easy_install BeautifulSoup
Extracting from my code, these are the parts that make it all work:
Import the modules:
import re
import urllib2
from BeautifulSoup import BeautifulSoup
Get the web page:
url = 'http://www.nytimes.com'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page.read())
Parse the page, looking for the mostEmailed section:
me = soup.find('div', id="mostEmailed").findAll('a',href = re.compile('.*'))
This line looks for the div tag whose id is mostEmailed. That returns the entire section of list entry tags and their contents, but we only want the contents of each anchor tag within each list entry, so the findAll statement returns all the anchor tags. I give findAll an href argument to match against, using the .* regular expression which, of course, matches everything.
Extract the data
melist = []   # article URLs
metitle = []  # article titles
for item in me:
    melist.append(item.get('href'))
    metitle.append(item.renderContents())
Finally the urls are extracted from the anchor tag data by
“getting” the href data and the title of each article
is returned with renderContents.
I am absolutely sure there is a more direct method within BeautifulSoup to extract this data, but as a first go, this works fine.
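For comparison, the same extraction can be sketched without BeautifulSoup, using only the standard library's `html.parser`. This is a swapped-in technique for illustration under Python 3, not the script described above; the class name `MostEmailedParser` and the helper `most_emailed` are hypothetical, and the only thing taken from the post is the assumption that the links sit inside a div whose id is mostEmailed. Unlike renderContents, it collects only the text of each anchor, not any nested markup.

```python
from html.parser import HTMLParser


class MostEmailedParser(HTMLParser):
    """Collect (href, title) pairs for anchors inside <div id="mostEmailed">."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0        # > 0 while inside the target div
        self.current_href = None  # href of the anchor being read, if any
        self.current_text = []    # text fragments of that anchor
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            # Start counting at the target div, then track nested divs.
            if self.div_depth or attrs.get("id") == "mostEmailed":
                self.div_depth += 1
        elif tag == "a" and self.div_depth and "href" in attrs:
            self.current_href = attrs["href"]
            self.current_text = []

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag == "a" and self.current_href is not None:
            title = "".join(self.current_text).strip()
            self.links.append((self.current_href, title))
            self.current_href = None

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)


def most_emailed(html):
    """Return a list of (href, title) tuples found in the mostEmailed div."""
    parser = MostEmailedParser()
    parser.feed(html)
    return parser.links
```

Feeding it the page source as a string returns the links in document order, so the position in the list doubles as the rank.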
I'm saving the results in a flat file that looks like this:
2008-11-17T19:23:49.772734 1 http://www.nytimes.com/2008/11/16/arts/design/16ouro.html?em Architecture: Saving Buffalo’s Untold Beauty
2008-11-17T19:23:49.772734 2 http://www.nytimes.com/2008/11/16/opinion/16rich.html?em Frank Rich: The Moose Stops Here
2008-11-17T19:23:49.772734 3 http://www.nytimes.com/2008/11/17/opinion/17mcwilliams.html?em Op-Ed Contributor: Our Home-Grown Melamine Problem
So now the question is how to illustrate the data in some meaningful way.
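A first step toward plotting is reading the log back in. Here is a minimal sketch in Python 3, assuming each line holds an ISO timestamp, a rank, a URL and a title separated by whitespace, as in the sample lines above; the function name `parse_log_line` is made up for illustration.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one log line into (timestamp, rank, url, title).

    Assumes whitespace-separated fields; the title is everything after
    the URL, so it may itself contain spaces.
    """
    stamp, rank, url, title = line.rstrip("\n").split(None, 3)
    # fromisoformat (Python 3.7+) accepts timestamps with or without microseconds.
    return datetime.fromisoformat(stamp), int(rank), url, title
```

Grouping the parsed tuples by URL then gives, for each article, its rank as a function of time.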
Tue Nov 18 09:02:51 EST 2008
Gracie and the rice cereal

Gracie, working on driving her own spoon, took a big scoop, passed up her mouth, turned it upside down and smeared it on the top of her head. Something everyone should do once.
Wed Nov 12 16:59:37 EST 2008
Only the main thread handles signals in Python
I learned just now that only the main thread in python handles signals. Signals are caught and placed in a queue which is checked at the execution of each line of code. So in code like this:
LOG = True
while LOG:
    try:
        data = myqueue.get()
        print data
    except (KeyboardInterrupt, SystemExit):
        LOG = False
the program will block on the myqueue.get() call and not catch a Ctrl-C until something comes out of the queue. One must instead write it something like this:
LOG = True
while LOG:
    try:
        data = myqueue.get_nowait()
        print data
    except (KeyboardInterrupt, SystemExit):
        LOG = False
    except (Queue.Empty):
        continue
Here myqueue.get_nowait() returns immediately; if nothing is in the queue it raises a Queue.Empty exception, which I catch and gracefully ignore. Everything would be hunky-dory, except even this code sometimes fails to quit properly with a Ctrl-C. The resulting traceback seems to focus on the myqueue.get_nowait() or the except (Queue.Empty) lines, indicating that the KeyboardInterrupt was raised during execution of these lines and therefore somehow not caught by the try statement. Why it’s not caught by the try statement I'm not sure yet.
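One variation worth trying is sketched below. It is written for Python 3, where the module is named `queue`, and it makes two changes that are assumptions rather than a confirmed fix: a blocking `get` with a short timeout replaces the busy loop around `get_nowait`, and the KeyboardInterrupt handler is moved outside the loop so that it applies wherever in the loop the interrupt arrives, including inside the Empty handler. The function name `consume` and the `stop` event are made up for illustration.

```python
import queue
import threading


def consume(q, stop, timeout=0.1):
    """Collect items from q until stop is set or Ctrl-C is pressed.

    q is a queue.Queue, stop is a threading.Event. The loop wakes at
    least every `timeout` seconds, so it never blocks indefinitely.
    """
    items = []
    try:
        while not stop.is_set():
            try:
                items.append(q.get(timeout=timeout))
            except queue.Empty:
                continue  # nothing arrived in time; check stop and retry
    except KeyboardInterrupt:
        pass  # an interrupt raised anywhere in the loop ends up here
    return items
```

Another thread can end the loop cleanly by calling `stop.set()`, which avoids relying on the signal at all.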
Wed Nov 12 12:39:41 EST 2008
More about timing with Python
Ok, I'm learning more about getting precise calendar timing in python so I thought I would document it a bit.
Python has the commonly used time module.
time.time() returns seconds since the epoch (Jan 1,
1970). On Mac/Linux you get values with meaningful fractional
seconds to 1 microsecond precision. I think, on these OSs the
gettimeofday() C function is used to implement
time.time().
On Windows, the gettimeofday() call is not
available to time.time(). It is not exactly clear to
me at this point what function time.time() calls on
Windows, however, the result gives values reported to 1
microsecond, but only values to 1 millisecond are meaningful (the
latter three digits are typically 0’s). Moreover, although the
time.time() call on windows produces values that
change with 1 millisecond precision, these values are not actually
updated every millisecond. It seems the rate at which they are
updated is hardware dependent, many report an update rate between
10 and 20 ms.
In addition to time.time(), there is
time.clock(). On Linux/Mac systems this call is not
helpful, as it produces a time value in seconds of CPU time.
However on Windows time.clock() gives the seconds since the process
was started and these are reported to the precision of the
operating system. On Windows, time.clock() uses the
QueryPerformanceCounter() method which measures time
based on CPU clock cycles. To use time.clock() as a wall-clock (i.e. calendar time), one must initialize it with a call when the process starts, coinciding with a time.time() call to get the wall-clock date-time stamp. Subsequent time stamps can then be measured accurately by calls to time.clock(), which gives the offset from the initial time stamp. This general method is implemented in code posted in this thread.
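The anchoring idea can be sketched in a few lines. This is not the code from the thread mentioned above; it is a minimal illustration for Python 3, where `time.clock()` no longer exists (it was removed in 3.8), so `time.perf_counter()` is swapped in as the high-resolution relative clock. The class name `AnchoredClock` is hypothetical.

```python
import time


class AnchoredClock:
    """Wall-clock time stamps built from one time.time() reading plus a
    high-resolution relative counter."""

    def __init__(self):
        # Take the two readings back to back so they refer to
        # (nearly) the same instant.
        self._wall0 = time.time()
        self._perf0 = time.perf_counter()

    def now(self):
        """Seconds since the epoch: the anchor plus the elapsed counter time."""
        return self._wall0 + (time.perf_counter() - self._perf0)
```

Note that a clock built this way never sees later corrections to the system time (NTP steps, for instance), so it drifts relative to `time.time()` over long runs.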
This is not the end of the story, however. For a really good read on these issues, have a look at When Microseconds Count. Here the folks at IBM tackle the problem in much the same way, in a C library that handles a few of the other ugly bits we've overlooked. The biggest issue I can see in the Python code snippet posted at the link above, and one that this IBM article addresses directly, is the fact that time.clock() returns a 64 bit value which will eventually wrap for long-running processes. Unfortunately, for whatever reason, the code and executables that once accompanied this article have been removed and are listed as retired. I'm not sure what that means; perhaps there is now a better way.
Finally, depending on how QueryPerformanceCounter is called, on a multi-processor (and maybe multi-core?) machine, you may get varying results. There is a mention of this in [this documentation](http://msdn.microsoft.com/en-us/library/ms644904.aspx).
As an aside, the Python module timeit uses time.clock() on Windows and time.time() on everything else.
Tue Nov 11 18:19:29 EST 2008
Python Time
Here are some interesting things I've learned about handling time in Python (for Windows):
1) time.time() has millisecond precision on a Windows machine, and microsecond precision on a Mac or Linux machine.
2) datetime.datetime is the same.
3) time.clock() gives seconds since the process was started at much greater precision, but it is not portable (Windows only). Also, it is not clear how quickly time.clock() updates, nor how one would correlate its output with system time, since it is a relative measure.
By comparison, MATLAB on Windows reports time in about 2e-7 second steps. I determined this with:
t = zeros(1,10000);
for x = 1:length(t)
    t(x) = now;
end
plot( t - t(1), '.')
We have successfully used MATLAB’s now function to time the arrival of serial strings that contain GPS time stamps and then used these as a lookup table to determine the UTC time of other events that have also been time-stamped with the now function. However, the coarseness of Python’s ability to time stamp things on Windows is going to make this hard. Hmm.
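The same kind of step-size measurement can be made from Python. Below is a minimal sketch for Python 3 that reads a clock in a tight loop and reports the smallest nonzero increment it sees; the function name `smallest_step` is made up for illustration, and the result depends on the operating system and hardware, so no particular value is claimed here.

```python
import time


def smallest_step(clock=time.time, changes=50):
    """Return the smallest positive increment between successive readings.

    Keeps reading `clock` until its value has changed `changes` times,
    so a coarse clock simply takes longer to measure.
    """
    steps = []
    last = clock()
    while len(steps) < changes:
        current = clock()
        if current != last:
            steps.append(current - last)
            last = current
    # Ignore any backward steps caused by system clock adjustments.
    return min(step for step in steps if step > 0)
```

Passing `time.perf_counter` as the `clock` argument gives the equivalent figure for the high-resolution counter.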
Tue Nov 11 17:06:20 EST 2008
Parallels
Ok, I did something just now I swore I would never, ever do. I ran Hyperterminal from a Windows install running on Parallels because I could not figure out how to make Zterm add line feeds to the end of each line sent. It is a crazy mixed up world when I have to revert to Hyperterminal for additional functionality.
Tue Nov 11 14:40:49 EST 2008
Markdown Markup
This is the place to find documentation for the Markdown plugin which provides for wiki-type simplified markup. The implementation I'm actually using in nanoblogger is written in python, but – same thing.
Tue Nov 11 14:36:02 EST 2008
Posting images
After many iterations I think the easiest way to post images to nanoblogger is to put the images in blog/images/ and then refer to them in the post like this
<img src="images/file.jpg" width="400" />
Caption (two spaces down)
Tue Nov 11 13:59:03 EST 2008
Murray's Cheese Shop
Alice, Gracie and I were in New York City last week and visited the extraordinary Murray’s Cheese Shop (actually the one in Grand Central Market, not Bleecker St.).


Lots of cheese to try!

And fun tubs of stuff to check out and scatter on the ground!
Tue Nov 11 13:19:21 EST 2008
Python UDP logging to/from the Broadcast address
Here’s a nugget of info that makes writing to and capturing data from the broadcast address easy.
Notes for configuration of udp data logging/monitoring with python.
# To send data on the broadcast address:
##############################################################
#!/usr/bin/env python
import socket
import re
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
localipaddress = socket.gethostbyname(socket.gethostname())
# Replace the last octet with 255 (assumes a /24 netmask).
udpbcstaddress = re.sub(r'\.\d+$', '.255', localipaddress)
message = 'test message'  # example payload (placeholder)
port = 33333              # same port the receiver below binds to
s.sendto(message, (udpbcstaddress, port))
# To receive data on the broadcast address:
##############################################################
#!/usr/bin/env python
import socket, traceback
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
host = ''  # binds to all interfaces, otherwise give ip address
port = 33333
s.bind((host, port))
while 1:
    try:
        message, address = s.recvfrom(1024)
        print address
        print message
    except (KeyboardInterrupt, SystemExit):
        raise
    except:
        traceback.print_exc()
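The sender above derives the broadcast address by replacing the last octet of the local address with 255, which is only right when the netmask is 255.255.255.0. A minimal sketch of a more general calculation, using the `ipaddress` module from the Python 3 standard library; the function name `broadcast_address` and the default prefix length of 24 are assumptions for illustration, and the real prefix length has to come from the interface configuration.

```python
import ipaddress


def broadcast_address(ip, prefix_len=24):
    """Return the broadcast address for `ip` on a network with the given
    prefix length (24 corresponds to a 255.255.255.0 netmask)."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.ip_network("%s/%d" % (ip, prefix_len), strict=False)
    return str(network.broadcast_address)
```

With the default prefix this gives the same answer as the regular expression; with other prefix lengths the two differ.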
As a side note, I wanted to do this with Python’s logging package, however I could not find a straight-forward way to set the options on the sockets that allow use of the broadcast address, and the default for the logging package seems to be to pickle the strings before sending them which means I'd have to have a python routine to capture them on the other end to unpickle. Not what I want.
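One possible way around both obstacles is to subclass `logging.handlers.DatagramHandler` and override two of its methods: `makeSocket`, to set the broadcast option on the socket, and `makePickle`, to send the formatted text instead of a pickled record. The sketch below is for Python 3 and has not been tried against the setup described here; the class name `PlainTextBroadcastHandler` is hypothetical.

```python
import logging
import logging.handlers
import socket


class PlainTextBroadcastHandler(logging.handlers.DatagramHandler):
    """UDP log handler that allows broadcast and sends plain text."""

    def makeSocket(self):
        # Create the usual UDP socket, then allow sending to a broadcast address.
        sock = super().makeSocket()
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        return sock

    def makePickle(self, record):
        # Despite the method name, return the formatted message as bytes
        # so that any UDP listener can read it without unpickling.
        return (self.format(record) + "\n").encode("utf-8")
```

A logger would then be given `PlainTextBroadcastHandler(udpbcstaddress, port)` through `addHandler`, and the receiver script above should print the messages as ordinary strings.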
Wed Nov 5 16:12:48 EST 2008
Python debugging
This is a nice link for working in the python debugger (pdb).
Wed Nov 5 14:50:12 EST 2008
Using the Python debugger with iPython
I will never remember how to do this, so better document it here.
To drop into the Python debugger from within a program, insert this into the code at the point where you want the debugger to start:
import pdb
pdb.set_trace()
Once pdb starts, you can drop into an ipython shell with the following commands:
from IPython.Shell import IPShellEmbed
IPShellEmbed([])(local_ns=locals())
This will put you in ipython and give you the complete environment of your program, and of course all of ipython’s helpful documentation. To return to pdb, simply press:
Ctrl-d
Wed Nov 5 14:34:34 EST 2008
ConfigParser not case sensitive
This had me perplexed for a few hours, and it is so different from what I expect of things that I generally consider well thought-out and well programmed that it warrants comment.
In Python the ConfigParser module provides for easy reading of a config file of key-value pairs. It’s a handy tool, but surprisingly the keys in the key-value pairs are not case sensitive (everything gets converted to lower case). This is the kind of mistake that I don’t expect in Python. Err.
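The lower-casing is done by the parser's `optionxform` method, and replacing it with `str` keeps keys exactly as written. A minimal sketch for Python 3, where the module is named `configparser`; the section name `sensor` and the key `BaudRate` are made-up examples.

```python
import configparser

# Made-up config text for illustration.
CONFIG_TEXT = "[sensor]\nBaudRate = 9600\n"

# Default behaviour: option names are converted to lower case.
default = configparser.ConfigParser()
default.read_string(CONFIG_TEXT)

# Setting optionxform to str before reading preserves the case of option names.
preserving = configparser.ConfigParser()
preserving.optionxform = str
preserving.read_string(CONFIG_TEXT)

default_keys = default.options("sensor")        # ['baudrate']
preserved_keys = preserving.options("sensor")   # ['BaudRate']
```

The override has to be in place before the file is read, since the transformation is applied as each key is parsed.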
Mon Nov 3 07:39:54 EST 2008
Deep sea submarine pioneer dies
Saturday Jacques Piccard died.
Swiss-based marine explorer and inventor Jacques Piccard, who was part of the deepest submarine dive in history, has died at his home aged 86.
Here’s the link