Friday, January 27, 2012

Software Processes as Organizational Diagnostic Tools

In general, I dislike complex software development methodologies and consider them unnecessary.  I won't go into a full rant here since I think Ted Dziuba sums it up well in "Who Needs Process?" but I will say that I agree with his core points - resist growing teams too large, keep the skill level of individuals as high as possible, and don't participate in blamestorming sessions when something goes wrong.

I think simple software development processes usually have some common characteristics:

  • They originate within the development team as a way to increase their efficiency and ability to cooperate
  • Duplication of effort is eliminated as much as possible and automation of tasks is introduced at every opportunity
  • Artifacts produced by this process are designed to minimize interactions with other teams within the organization
  • They are self-documenting as much as possible

All of these characteristics are focused on a single goal: increasing the efficiency of the individual developer.  If you focus on quality developers, you will have a small team, so you need to maximize each person's throughput.  This is why your processes must increase efficiency.

If you start to notice aspects of your software development process which don't contribute directly to efficiency, you can use those components to identify weaknesses in your organization.  In this respect, your software development process becomes a diagnostic tool to find diseases within your organization.

For example, I've worked on several development teams which were required to deliver an impact document to the QA and business teams with each release.  This impact document outlined the changes made in the release, listed which components of the application were affected, and provided guidelines for how to test the release and validate it for production-readiness.

This document was not increasing the efficiency of the development team - the impact of changes was already known by the developers and had been covered in unit testing.  It was seen as busy work by the developers and was often considered an afterthought.  As such, it sometimes didn't even provide all the details which the QA and business teams requested.

What this task really represented was a deficiency in the organization - the QA and business teams weren't able to test all the functionality in the application within their aggressive release cycles, so they wanted to focus their testing on what had changed in each release.  Their inability to test functionality indicated one of several problems:

  • The release schedule was too aggressive for such a complex application and should be scaled back (smaller change sets or less frequent releases)
  • The QA and business teams were relying too heavily on manual testing and not taking advantage of automated test scripts
  • The QA teams needed an increased headcount
  • The QA and business teams needed training in version control to determine the changes in a release for themselves

Addressing any of these would have solved the actual problem better than the inclusion of an "impact document" in the development process.  In my opinion, the problem was a lack of automated testing.  Fully automated testing would have caught defects which somehow made it past unit testing - even in portions of the application which were not perceived as being affected by the release.  In many cases, the impact document caused extra trouble for the teams.  In addition to being a drain on the development team, it created a false sense of security for the QA and business teams when creating their test plans.

In my experience, development teams in large IT organizations are often at the bottom of the food chain and find these types of steps bolted on to the development processes to address organizational issues in other areas or as a reaction to perceived failures.  By re-examining your development process and working constructively with other teams, you can eliminate inefficiencies in your process and improve the health of your entire organization.

Wednesday, January 18, 2012

Yet Another Clojure Book

I received my copy of "Clojure in Action" by Amit Rathore today.  This will be my third book on Clojure.  I already own "Programming Clojure" by Stuart Halloway and "The Joy of Clojure" by Michael Fogus and Chris Houser.

I came to Clojure after more than ten years of Java programming and five years of Python programming so I was very familiar with the JVM but also had some background with functional concepts.  First-class functions and list comprehensions were a big part of why I liked Python and I wanted to see how they performed on the JVM.  Learning a Lisp variant has been on my to-do list for many years now and Clojure seemed like a good option.  I've been learning it on-and-off for the past twelve months and have been very impressed with it (and humbled by it).

Both "Programming Clojure" and "The Joy of Clojure" are excellent books.  Halloway's book was my first exposure to Clojure the language and provided an excellent introduction to all the important concepts.  By the time I had read it, I was writing code in the REPL and creating scripts.  As many reviewers have noted, it does a great job of explaining the "what" of Clojure.

After six months with "Programming Clojure", I got my copy of "The Joy of Clojure".  This book was a great introduction to idiomatic Clojure and I feel like it advanced my knowledge of how Clojure is supposed to work.  I'm still internalizing much of the content and feel like it's a book I'll revisit many times over the years before I fully understand all the intricacies of the language.

I'm hoping that "Clojure in Action" fills in some gaps in my Clojure knowledge.  Specifically, I'm looking for some recommendations on best practices and examples of how to build successful projects.  In reviewing the table of contents and skimming the pages, it appears that part II of the book will provide the information I'm seeking.  I'll definitely read part I as well to see how Rathore approaches the material covered in the other books.

I'm excited to read the book and hope to post a review soon.  I'll have to hurry, though - it looks like a new edition of "Programming Clojure" is coming out in a couple of months and I might need to pick that one up, too.


Tuesday, January 17, 2012

Whoosh Search Indexing

I've been using Apache Lucene to build e-commerce search engines for the last six years.  The search engines are built using the Java version of Lucene and include quite a bit of custom functionality for filtering, blocking, sorting, and faceted navigation.

I've been aware of the Whoosh library for Python for a year or so but I've never had a chance to use it much.  I know it provides much of the same functionality as Lucene with similar internals but I had not built any projects using it.

Due to some annoyances with version control, I've decided to jump in and give Whoosh a trial run.  I've often wanted to search through commit messages at work to find commits relating to specific defects or commits made by specific developers.

I decided to start simply and build a library using the following requirements:

  • Each project (tag or branch) in subversion would be a separate index
  • Each index would use a basic schema to encapsulate svn commit messages
  • Both full indexing and incremental indexing (via post-commit hooks) should be supported
  • Searching should be available on commit message, author, revision, and date

With my first attempt, I've implemented searching and full indexing.  I haven't bothered with the incremental indexing yet but expect to build that in the future.
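
To make the requirements above a little more concrete, here is a minimal sketch of the step that comes before indexing: turning the output of "svn log --xml" into one record per commit with the four searchable fields.  This is only an illustration and not necessarily how the svnsearch library does it; the function name parse_svn_log and the sample commits are made up for the example, and it assumes every log entry carries an author, a date, and a message.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Sample shaped like the output of `svn log --xml`; the commits are invented.
SAMPLE_LOG = """<log>
<logentry revision="102">
<author>alice</author>
<date>2012-01-16T14:03:11.123456Z</date>
<msg>Fix defect 481: null check in cart totals</msg>
</logentry>
<logentry revision="101">
<author>bob</author>
<date>2012-01-15T09:30:00.000000Z</date>
<msg>Add faceted navigation to search results</msg>
</logentry>
</log>
"""


def parse_svn_log(xml_text):
    """Return one dict per commit: revision, author, date, and message.

    Hypothetical helper; assumes each <logentry> has author, date, and msg.
    """
    commits = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        commits.append({
            "revision": int(entry.get("revision")),
            "author": entry.findtext("author", default=""),
            # svn reports UTC timestamps with microseconds and a trailing "Z"
            "date": datetime.strptime(entry.findtext("date"),
                                      "%Y-%m-%dT%H:%M:%S.%fZ"),
            "message": (entry.findtext("msg") or "").strip(),
        })
    return commits


if __name__ == "__main__":
    for commit in parse_svn_log(SAMPLE_LOG):
        print(commit["revision"], commit["author"], commit["message"])
```

Each dict lines up with one document in a per-project index, so a full indexing run would amount to looping over these records and handing each one to the index writer.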

The similarities between Whoosh and Lucene are obvious and not unexpected - Whoosh claims Lucene as one of its "ancestors" in the Introduction page.  Because of my previous experience with Lucene, learning Whoosh was mostly a matter of making connections between Java concepts in Lucene and the analogous modules in Whoosh.

Overall, I'm pretty pleased with the simplicity of the Whoosh library - it provides the same power as Lucene but in a much friendlier (to me) Pythonic model.  In the future, I'll be checking out more advanced concepts such as localization, faceted search, and incremental indexing to see how Whoosh compares to Lucene in those regards.  I expect the major difference between Whoosh and Lucene will be in search performance.  For this project, though, that should not be a concern.

The source for the main library is shown below or you can follow the project at https://github.com/khill/svnsearch.