November 2008 Archives

Mon Nov 24 13:31:33 EST 2008

Ignoring files/directories in subversion

If you have a file or directory that you want subversion to ignore (for example, the build directory created by python’s setuptools), you can tell subversion like this:

svn propedit svn:ignore parent_dir/

This opens an editor where you list the files or directories you want subversion to ignore. For example:

dists/*
build/*

will ignore the directories dists/ and build/ and their contents. (The wildcard is required to ignore their contents).


Posted by vschmidt | Permanent link | File under: ocean_engineering

Mon Nov 24 11:45:48 EST 2008

Subversion keywords

Here is how to get subversion to recognize keywords in a source file and insert metadata for them. (This is something I should have learned long ago.)

First you have to enable the keywords you want subversion to handle:

svn propset svn:keywords "Revision Author Date" <file>

Here we have set Revision, Author and Date as the keywords that subversion will handle for us auto-magically.

Then in the source code something like this:

$Revision$
$Date$

becomes something like this:

$Revision: 123 $
$Date: 2002-07-22 21:42:37 -0700 (Mon, 22 Jul 2002) $
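
Once a keyword has been expanded, its value is easy to pull out in code, for example to build a version string. Here is a minimal sketch in Python 3. The helper name parse_svn_keyword and the sample strings are my own illustration, not something subversion provides:

```python
import re

# Matches both the unexpanded form ($Revision$) and the expanded form
# ($Revision: 123 $) of a subversion keyword.
KEYWORD_RE = re.compile(r"^\$(\w+)(?::\s*(.*?))?\s*\$$")


def parse_svn_keyword(text):
    """Return (name, value); value is None if the keyword is not expanded."""
    match = KEYWORD_RE.match(text.strip())
    if match is None:
        raise ValueError("not a subversion keyword: %r" % text)
    return match.group(1), match.group(2)


name, revision = parse_svn_keyword("$Revision: 123 $")
print(name, revision)
```

An unexpanded keyword comes back with a value of None, which is a handy way to notice that svn:keywords was never set on the file.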


Posted by vschmidt | Permanent link | File under: python

Mon Nov 24 08:55:10 EST 2008

Idiomatic Python

Yesterday I found Code Like a Pythonista by David Goodger. It is, imho, a fantastic set of guidelines to take one from a novice python wannabe to a moderately decent python programmer. I love this kind of guidance. Lots of things to keep me out of trouble.


Posted by vschmidt | Permanent link | File under: python

Wed Nov 19 08:21:58 EST 2008

Bouncing Buoy

Movie

Tom testing the buoy’s transient response.


Posted by vschmidt | Permanent link | File under: ocean_engineering

Wed Nov 19 07:56:53 EST 2008

RTK buoy float test!

Yesterday we float-tested our new RTK buoy in our test tank!

Tom and Jim inserting the system in the tank.

Her maiden voyage. This simple buoy has an OEM RTK GPS, EDL radio modem, Aanderaa conductivity and temperature sensor and an [Ocean Server attitude and compass](http://www.ocean-server.com/compass.html). In addition there is a [Persistor micro-processor](http://www.persistor.com/) for power control and system monitoring and a [4-port B&B Electronics serial-WiFi Server](http://www.bb-elec.com/product_family.asp?FamilyId=145&TrailType=Sub&Trail=26) to send serial data from each component to shore. Oh and I almost forgot, there’s a [WHOI micro-modem](http://acomms.whoi.edu/umodem/) as well.

We've got lots of plans for these systems, including very precise acoustic positioning of underwater gear (AUVs, cameras, etc.), measurement of tides, measurement of waves and we don’t know what else.


Posted by vschmidt | Permanent link | File under: ocean_engineering

Wed Nov 19 07:40:41 EST 2008

MacOS 10.5 Battery Calibration

I did not know that Apple recommends re-calibrating the battery every couple of months!


Posted by vschmidt | Permanent link | File under: mac

Wed Nov 19 07:34:32 EST 2008

Macbook Pro Fixed!

I am thrilled to report – two of the three biggest oddities with my new MBP have been Fixed! The first was compatibility with a Dell external display. The second was track-pad behavior in which the track-pad would not respond or would not execute a click to change context between windows and applications.

The remaining buzz-kill is that the system does not sleep properly. When you shut the lid to go home, assuming your system will happily sleep until tomorrow, you come back to find it dead and needing to boot from scratch. This even happens sometimes when you put it to sleep manually. Very frustrating and a real pain, as I have become accustomed to my Macs doing the right thing. Hopefully this will be fixed soon too.


Posted by vschmidt | Permanent link | File under: mac

Tue Nov 18 12:27:05 EST 2008

Python and BeautifulSoup

I have always wondered how changes in politics and news affect the “most emailed” list of articles posted by the NY Times. So I've written a short script to parse and log that list, which I can run from a cron job at regular intervals. This was a fun exercise and helped me learn some cool things about python, including a very handy html/xml parsing engine: BeautifulSoup. It was not part of my standard install (EPD) so I installed it with:

sudo easy_install BeautifulSoup

Extracting from my code, these are the parts that make it all work:

Import the module:

import re, urllib2    # both are used in the snippets below
from BeautifulSoup import BeautifulSoup

Get the web page:

url = 'http://www.nytimes.com'                                                               
page = urllib2.urlopen(url)                                                                  
soup = BeautifulSoup(page.read())  

Parse the page, looking for the mostEmailed section:

me = soup.find('div', id="mostEmailed").findAll('a',href = re.compile('.*')) 

This line looks for the first div tag whose id is mostEmailed. That returns the entire section of list entry tags and their contents, but we only want the anchor tag within each list entry, so the findAll call returns all the anchor tags. I gave findAll an attribute to match against, so I'm matching href against the .* regular expression which, of course, matches everything. (A plain findAll('a') would also work; the href match just restricts the result to anchors that actually carry an href.)

Extract the data

melist, metitle = [], []    # urls and titles, in list order
for item in me:
    melist.append(item.get('href'))
    metitle.append(item.renderContents())

Finally the urls are extracted from the anchor tag data by “getting” the href data and the title of each article is returned with renderContents.

I am absolutely sure there is a more direct method within BeautifulSoup to extract this data, but as a first go, this works fine.
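
For comparison, here is a rough sketch of the same extraction using only the standard library's html.parser (Python 3) in place of BeautifulSoup. The class name MostEmailedParser, the SAMPLE markup and the example.com URLs are all made up for illustration; the real page layout is whatever nytimes.com serves, so treat this as a starting point only:

```python
from html.parser import HTMLParser


class MostEmailedParser(HTMLParser):
    """Collect (href, title) pairs for anchors inside <div id="mostEmailed">."""

    def __init__(self, target_id="mostEmailed"):
        super().__init__()
        self.target_id = target_id
        self.depth = 0            # div nesting depth inside the target, 0 = outside
        self.current_href = None  # href of the anchor being read, if any
        self.current_text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1
            elif attrs.get("id") == self.target_id:
                self.depth = 1
        elif tag == "a" and self.depth > 0 and "href" in attrs:
            self.current_href = attrs["href"]
            self.current_text = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
        elif tag == "a" and self.current_href is not None:
            title = "".join(self.current_text).strip()
            self.links.append((self.current_href, title))
            self.current_href = None

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)


SAMPLE = """
<div id="other"><a href="/skip">Skip me</a></div>
<div id="mostEmailed"><ol>
<li><a href="http://example.com/a">First story</a></li>
<li><a href="http://example.com/b">Second story</a></li>
</ol></div>
"""

parser = MostEmailedParser()
parser.feed(SAMPLE)
for href, title in parser.links:
    print(href, title)
```

It is a lot more typing than the BeautifulSoup one-liner, which is a fair argument for keeping BeautifulSoup.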

I'm saving the results in a flat file that looks like this:

2008-11-17T19:23:49.772734      1       http://www.nytimes.com/2008/11/16/arts/design/16ouro.html?em    Architecture: Saving Buffalo’s Untold Beauty
2008-11-17T19:23:49.772734      2       http://www.nytimes.com/2008/11/16/opinion/16rich.html?em        Frank Rich: The Moose Stops Here
2008-11-17T19:23:49.772734      3       http://www.nytimes.com/2008/11/17/opinion/17mcwilliams.html?em  Op-Ed Contributor: Our Home-Grown Melamine Problem
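
A sketch of how lines like these can be produced (Python 3). I am assuming tab-separated columns here, and format_rows and the example.com entries are hypothetical names and data for the example:

```python
from datetime import datetime


def format_rows(stamp, links):
    """Build one tab-separated line per article: time, rank, url, title."""
    rows = []
    for rank, (url, title) in enumerate(links, start=1):
        rows.append("\t".join([stamp.isoformat(), str(rank), url, title]))
    return rows


# A fixed time stamp so the output is repeatable; a real run would use datetime.now().
stamp = datetime(2008, 11, 17, 19, 23, 49, 772734)
links = [("http://example.com/a", "First story"),
         ("http://example.com/b", "Second story")]
rows = format_rows(stamp, links)
for row in rows:
    print(row)
```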

So now how to illustrate the data in some meaningful way.


Posted by vschmidt | Permanent link | File under: python

Tue Nov 18 09:02:51 EST 2008

Gracie and the rice cereal

Gracie, working on driving her own spoon, took a big scoop, passed up her mouth, turned it upside down and smeared it on the top of her head. Something everyone should do once.


Posted by vschmidt | Permanent link | File under: family

Wed Nov 12 16:59:37 EST 2008

Only the main thread handles signals in Python

I learned just now that only the main thread in python handles signals. Signals are caught and placed in a queue which is checked at the execution of each line of code. So in code like this:

LOG = True
while LOG:
    try:
        data = myqueue.get()
        print data
    except (KeyboardInterrupt, SystemExit):
        LOG = False

the program will block on the myqueue.get() call and not catch a Ctrl-C until something comes out of the queue. One must instead write it something like this:

LOG = True
while LOG:
    try:
        data = myqueue.get_nowait()
        print data
    except (KeyboardInterrupt, SystemExit):
        LOG = False
    except (Queue.Empty):
        continue

Here myqueue.get_nowait() will return immediately if nothing is in the queue, raising a Queue.Empty exception, which I catch and gracefully ignore. Everything would be hunky-dory except even this code sometimes fails to quit properly with a Ctrl-C. The resulting traceback seems to focus on the myqueue.get_nowait() or the except (Queue.Empty) lines, indicating that the KeyboardInterrupt was raised during execution of these lines and therefore somehow not caught by the try statement. Why it’s not caught by the try statement I'm not sure yet.
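
One possible explanation (my reading, not something verified against the original program): a Ctrl-C that arrives while the except Queue.Empty clause is running, or between loop iterations, is raised outside the try suite, so none of that try statement's handlers see it. The sketch below sidesteps this by wrapping the whole loop, and it polls with a timeout so it does not spin on get_nowait(). It is written for Python 3, where the module is named queue; log_until_stopped, stop and poll_seconds are names I made up:

```python
import queue
import threading


def log_until_stopped(source, stop, poll_seconds=0.1):
    """Collect items from source until stop is set or Ctrl-C arrives."""
    received = []
    try:
        while not stop.is_set():
            try:
                # Wake up regularly so a stop request or pending signal is noticed.
                item = source.get(timeout=poll_seconds)
            except queue.Empty:
                continue
            received.append(item)
    except KeyboardInterrupt:
        # Catches a Ctrl-C raised anywhere in the loop, handlers included.
        pass
    return received


myqueue = queue.Queue()
for value in ("a", "b", "c"):
    myqueue.put(value)

stop = threading.Event()
threading.Timer(0.3, stop.set).start()   # stand-in for a real shutdown request
result = log_until_stopped(myqueue, stop)
print(result)
```

The timeout trades a little latency on shutdown for a loop that sleeps while the queue is empty.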


Posted by vschmidt | Permanent link | File under: python

Wed Nov 12 12:39:41 EST 2008

More about timing with Python

Ok, I'm learning more about getting precise calendar timing in python so I thought I would document it a bit.

Python has the commonly used time module. time.time() returns seconds since the epoch (Jan 1, 1970). On Mac/Linux you get values with meaningful fractional seconds to 1 microsecond precision. I think that on these OSs the gettimeofday() C function is used to implement time.time().

On Windows, the gettimeofday() call is not available to time.time(). It is not exactly clear to me at this point what function time.time() calls on Windows. The result is reported to 1 microsecond, but only the digits down to 1 millisecond are meaningful (the last three digits are typically 0’s). Moreover, although time.time() on Windows produces values that change in 1 millisecond steps, these values are not actually updated every millisecond. The update rate seems to be hardware dependent; many report an update interval between 10 and 20 ms.
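
A quick way to see the update interval on a given machine is to spin on time.time() and record how much it jumps each time the value changes. This is a small sketch (Python 3); observed_steps is a name I made up, and the numbers depend entirely on the OS and hardware:

```python
import time


def observed_steps(count=20):
    """Return the sizes of `count` successive changes in time.time()."""
    steps = []
    last = time.time()
    while len(steps) < count:
        current = time.time()
        if current != last:          # the clock value has ticked over
            steps.append(current - last)
            last = current
    return steps


steps = observed_steps()
print("smallest step: %.9f s" % min(steps))
print("largest step:  %.9f s" % max(steps))
```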

In addition to time.time(), there is time.clock(). On Linux/Mac systems this call is not helpful, as it produces a time value in seconds of CPU time. On Windows, however, time.clock() gives the seconds since the process was started, reported to the precision of the operating system. On Windows, time.clock() uses the QueryPerformanceCounter() function, which measures time based on CPU clock cycles. To use time.clock() as a wall clock (i.e. calendar time), one must initialize it with a call when the process starts, coinciding with a time.time() call to get the wall-clock date-time stamp. Subsequent time stamps can then be measured accurately by further calls to time.clock(), each of which gives the offset from the initial time stamp. This general method is implemented in code posted in this thread.
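
The anchoring idea can be sketched in a few lines. Note the swap: time.clock() was removed in Python 3.8, so this version uses time.perf_counter(), its high-resolution replacement, instead. AnchoredClock is a hypothetical name, and the sketch ignores drift between the two clocks over long runs:

```python
import time


class AnchoredClock:
    """Wall-clock time stamps built from a high-resolution counter."""

    def __init__(self):
        # Read both clocks back to back so the offset between them is small.
        self._wall_start = time.time()
        self._counter_start = time.perf_counter()

    def now(self):
        """Seconds since the epoch: the anchor plus the elapsed counter time."""
        return self._wall_start + (time.perf_counter() - self._counter_start)


clock = AnchoredClock()
first = clock.now()
second = clock.now()
print("%.9f" % first)
print("%.9f" % second)
```

The anchor is only as good as the single time.time() reading taken at start-up, so any error in that reading is carried into every later stamp.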

This is not the end of the story, however. For a really good read on these issues, have a look at When Microseconds Count. Here the folks at IBM tackle the problem in much the same way, in a C library that handles a few of the other ugly bits we've overlooked. The biggest issue I can see in the python code snippet posted at the link above, and one that this IBM article addresses directly, is the fact that time.clock() returns a 64 bit value which will eventually wrap for long-running processes. Unfortunately, for whatever reason, the code and executables that once accompanied this article have been removed and are listed as retired. I'm not sure what that means; perhaps there is now a better way.

Finally, depending on how QueryPerformanceCounter is called, on a multi-processor (and maybe multi-core?) machine, you may get varying results. There is a mention of this in [this documentation](http://msdn.microsoft.com/en-us/library/ms644904.aspx).

As an aside, the python module timeit uses time.clock() on Windows and time.time() on everything else.


Posted by vschmidt | Permanent link | File under: mac, windows, linux, python

Tue Nov 11 18:19:29 EST 2008

Python Time

Here are some interesting things I've learned about handling time in python (for windows)

1) time.time() has millisecond precision on a Windows machine, and microsecond precision on a Mac or Linux machine.
2) datetime.datetime is the same.
3) time.clock() gives seconds since the process was started at much greater precision, but it is not portable (windows only). Also it is not clear how quickly time.clock() updates, nor how one would correlate its output with system time, since it is a relative measure.

By comparison, MATLAB on windows reports time in about 2e-7 second steps. I determined this with:

t = zeros(1,10000);
for x = 1:length(t)
    t(x) = now;
end

plot( t - t(1), '.')

We have successfully used MATLAB’s now function to time the arrival of serial strings that contain GPS time stamps, and then used these as a lookup table to determine the UTC time of other events that have also been time-stamped with the now function. However, the coarseness of python’s time stamps on windows is going to make this hard. Hmm.


Posted by vschmidt | Permanent link | File under: python

Tue Nov 11 17:06:20 EST 2008

Parallels

Ok, I did something just now I swore I would never, ever do. I ran Hyperterminal from a windows install running on Parallels because I could not figure out how to make Zterm add line feeds to the end of each line sent. It is a crazy mixed up world when I have to revert to Hyperterminal for additional functionality.


Posted by vschmidt | Permanent link

Tue Nov 11 14:40:49 EST 2008

Markdown Markup

This is the place to find documentation for the Markdown plugin which provides for wiki-type simplified markup. The implementation I'm actually using in nanoblogger is written in python, but – same thing.


Posted by vschmidt | Permanent link | File under: nanoblogger-help

Tue Nov 11 14:36:02 EST 2008

Posting images

After many iterations I think the easiest way to post images to nanoblogger is to put the images in blog/images/ and then refer to them in the post like this

<img src="images/file.jpg" width="400" />

Caption (two spaces down)


Posted by vschmidt | Permanent link | File under: nanoblogger-help

Tue Nov 11 13:59:03 EST 2008

Murray's Cheese Shop

Alice, Gracie and I were in New York City last week and visited the extraordinary Murray’s Cheese Shop (actually the one in Grand Central Market, not Bleecker St.).

Lots of cheese to try!

And fun tubs of stuff to check out and scatter on the ground!


Posted by vschmidt | Permanent link | File under: family

Tue Nov 11 13:19:21 EST 2008

Python UDP logging to/from the Broadcast address

Here’s a nugget of info that makes writing to and capturing data from the broadcast address easy.

Notes for configuration of udp data logging/monitoring with python.

# To send data on the broadcast address:
##############################################################
#!/usr/bin/env python
import socket
import re

port = 33333                # must match the port the receiver binds to
message = 'test message'    # placeholder payload

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
localipaddress = socket.gethostbyname(socket.gethostname())
# Swap the last octet for 255 (assumes a /24 network)
udpbcstaddress = re.sub(r'\.\d+$', '.255', localipaddress)
s.sendto(message, (udpbcstaddress, port))

# To receive data on the broadcast address:
##############################################################

#!/usr/bin/env python
import socket, traceback
s=socket.socket( socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
host = ''   # binds to all interfaces, otherwise give ip address
port = 33333

s.bind( (host, port) )
while 1:
       try:
               message, address = s.recvfrom(1024)
               print address
               print message
       except (KeyboardInterrupt, SystemExit):
               raise
       except:
               traceback.print_exc()
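
Swapping the last octet for 255, as the sender does, only gives the right broadcast address on a /24 network. Where the prefix length is known, Python 3's ipaddress module can compute the address directly. This is a sketch; the addresses and prefix lengths are examples, and the code does not discover the machine's real netmask:

```python
import ipaddress


def broadcast_address(ip, prefix_length):
    """Return the broadcast address for ip on a network with the given prefix length."""
    interface = ipaddress.ip_interface("%s/%d" % (ip, prefix_length))
    return str(interface.network.broadcast_address)


print(broadcast_address("192.168.1.17", 24))
print(broadcast_address("10.1.2.3", 16))
```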

As a side note, I wanted to do this with Python’s logging package, however I could not find a straight-forward way to set the options on the sockets that allow use of the broadcast address, and the default for the logging package seems to be to pickle the strings before sending them which means I'd have to have a python routine to capture them on the other end to unpickle. Not what I want.
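
For what it's worth, one way around both problems might be to subclass logging.handlers.DatagramHandler, overriding makeSocket() to set the broadcast option and makePickle() to send plain text in place of a pickle. The sketch below is for Python 3 and I have not tried it against the 2008-era logging package. BroadcastTextHandler and the "buoy" logger name are made up, and the demo sends to the loopback address so that it runs anywhere:

```python
import logging
import logging.handlers
import socket


class BroadcastTextHandler(logging.handlers.DatagramHandler):
    """Send each log record as a plain text UDP datagram."""

    def makeSocket(self):
        sock = super().makeSocket()
        # Permit sending to a broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        return sock

    def makePickle(self, record):
        # Replace the default pickled payload with the formatted text.
        return (self.format(record) + "\n").encode("utf-8")


# Demo on the loopback interface; for real use pass the broadcast
# address and port to the handler instead.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0: let the OS choose a free port
receiver.settimeout(2.0)
port = receiver.getsockname()[1]

logger = logging.getLogger("buoy")
logger.setLevel(logging.INFO)
handler = BroadcastTextHandler("127.0.0.1", port)
logger.addHandler(handler)
logger.info("hello from the buoy")

data, address = receiver.recvfrom(1024)
print(data)
handler.close()
receiver.close()
```

Because the payload is plain text, the receiving end can be the simple recvfrom loop above, or anything else that listens on the port.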


Posted by vschmidt | Permanent link | File under: python

Wed Nov 5 16:12:48 EST 2008

Python debugging

This is a nice link for working in the python debugger (pdb).


Posted by vschmidt | Permanent link | File under: python

Wed Nov 5 14:50:12 EST 2008

Using the Python debugger with iPython

I will never remember how to do this, so better document it here.

To drop into the python debugger in a program, insert this into the code where you want the debugger to start:

import pdb

pdb.set_trace()

Once pdb starts, you can drop into an ipython shell with the following commands:

from IPython.Shell import IPShellEmbed
IPShellEmbed([])(local_ns=locals()) 

This will put you in ipython and give you the complete environment of your program, and of course all of ipython’s helpful documentation. To return to pdb, simply press

Ctrl-d


Posted by vschmidt | Permanent link | File under: python

Wed Nov 5 14:34:34 EST 2008

ConfigParser not case sensitive

This had me perplexed for a few hours, and it is so different from what I expect of things I generally consider well thought-out and well programmed that it warrants comment.

In Python the ConfigParser module provides for easy reading of a config file containing key-value pairs. It’s a handy tool, but surprisingly the keys in the key-value pairs are not case sensitive (everything gets converted to lower case). This is the kind of mistake that I don’t expect in Python. Err.
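
The lower-casing comes from the parser's optionxform hook, which can be replaced. Here is a sketch in Python 3, where the module is named configparser; the [Buoy] section and its keys are invented for the example:

```python
import configparser

TEXT = """
[Buoy]
SerialPort = /dev/ttyS0
BaudRate = 9600
"""

# Default behaviour: option names are folded to lower case.
default = configparser.ConfigParser()
default.read_string(TEXT)
default_keys = list(default["Buoy"])

# Replacing optionxform with str keeps the names exactly as written.
preserving = configparser.ConfigParser()
preserving.optionxform = str
preserving.read_string(TEXT)
preserved_keys = list(preserving["Buoy"])

print(default_keys)
print(preserved_keys)
```

The hook has to be set before the file is read, since the names are transformed as they are parsed.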


Posted by vschmidt | Permanent link | File under: python

Mon Nov 3 07:39:54 EST 2008

Deep sea submarine pioneer dies

Saturday Jacques Piccard died.

Swiss-based marine explorer and inventor Jacques Piccard, who was part of the deepest submarine dive in history, has died at his home aged 86.

Here’s the link


Posted by vschmidt | Permanent link | File under: news