Mon Mar 23 09:25:37 EDT 2009

Documenting Your Work

A few years ago I was working on a project that collected copious amounts of data that many people were working up simultaneously. We weren’t communicating well and it was killing me. So I drafted this email (I'm not sure I ever actually sent it), and today I found it in my “drafts” folder. I think it’s good stuff, so I'm reproducing it here. Take heed…


OK, I have a huge pet peeve that I need to rant about. Please buckle in.

If any of us does any data processing and produces data products and/or plots, we must PLEASE, PLEASE document what we've done in the appropriate Data Processing Narrative. [There should be one for each field test and each desktop study.]

If you don’t document what you've done, your work is WORSE THAN USELESS to the rest of us. It is a colossal waste of time to have to reverse engineer someone else’s work. You must capture everything you've done. Where was the raw data? Where is your MATLAB or other script? Where did the generated plots go? Where have the results of your calculations been saved?
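To make those four questions concrete, here is a minimal sketch in Python of what one Narrative entry might record. The function names, field names, and the plain-text layout are all assumptions made up for illustration, not an established format; adapt them to whatever your Narrative already looks like.

```python
from datetime import date
from pathlib import Path


def format_entry(author, raw_data, script, plots, results, notes="", when=None):
    """Return one plain-text Narrative entry answering the four questions.

    All field names here are hypothetical; they simply mirror the questions:
    where the raw data was, which script was run, where the plots went,
    and where the results were saved.
    """
    when = when or date.today()
    lines = [
        f"{when.isoformat()}  {author}",
        f"  Raw data: {raw_data}",
        f"  Script:   {script}",
        f"  Plots:    {plots}",
        f"  Results:  {results}",
    ]
    if notes:
        lines.append(f"  Notes:    {notes}")
    return "\n".join(lines) + "\n"


def append_entry(narrative_path, **fields):
    """Append a formatted entry to the Narrative file, creating it if needed."""
    entry = format_entry(**fields)
    with Path(narrative_path).open("a", encoding="utf-8") as fh:
        fh.write(entry + "\n")  # blank line between entries
    return entry
```

Calling `append_entry("narrative.txt", author="val", raw_data="raw/", script="process.m", plots="plots/", results="results/")` (all paths hypothetical) would add one dated entry to the end of the file. Appending rather than rewriting keeps the Narrative a running record, much like pages in a lab notebook.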

You need not document every detail of every step in the Narrative, PROVIDED you have commented your scripts well and in meaningful ways.

One should be able to read through the Narrative and see the exact state of our data processing for a particular test. Even if you think your processing pipeline is an offshoot of the main effort, it’s still good to document what you've done. Perhaps someone will read the Narrative, find the results of your work, and build on them in a way you hadn’t thought of.

This is the essence of collaboration. It seems like it takes more work to work together, but we save thousands of times the effort by not having to throw out your work because no one understands it. I have met people who work with the ethos that obscurity equals job security. I hate this. It is counter to everything I know about efficiency, learning and progress.

This problem is inherent in data archives, not just at CCOM but around the country. Data is being lost at an alarming rate, not because it’s actually deleted from a disk, but because the people who know anything about it never wrote anything down and never captured the information needed to understand how to process the data or what processing has already been done.

In short, you must document in this digital forum in the SAME WAY you've been taught to document in a lab notebook.

Thanks for listening.

-Val


Posted by vschmidt