Tuesday, February 9, 2010

Open Sourcing Research Software

An article in the Guardian calls for researchers to open source (release to the public) the computer code they use in their research.

As a former programmer, I think this is a great idea. It is surprisingly easy for even the most talented programmer to make simple mistakes that cause a program to produce erroneous, misleading results. Releasing the code to the public would give skeptics and peer reviewers the chance to criticize how the data was analyzed. This criticism can catch mistakes and lead to more powerful experiments. But will researchers have too much ego to release their code?

In industry, programming errors are caught by requiring programmers to test their own code and then having a team of testers test it again. Unfortunately, few researchers have the luxury of a robust testing team. It is also hard to expect, for example, a biology researcher who is a self-taught programmer to create a detailed and powerful test harness for their software.
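A first test harness does not have to be elaborate, though. Below is a minimal sketch in Python of what a self-taught programmer might start with. The function name sample_std and the helper run_checks are hypothetical, chosen only for illustration; they do not come from the Guardian article or from any particular research project. The idea is simply to compare the analysis code against an answer worked out by hand.

```python
import math


def sample_std(values):
    """Sample standard deviation (divides by n - 1, not n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - 1))


def run_checks():
    """Tiny hand-rolled test harness; raises AssertionError on failure."""
    # Known answer worked out by hand: the mean is 5 and the squared
    # deviations sum to 32, so the sample variance is 32 / 7.
    data = [2, 4, 4, 4, 5, 5, 7, 9]
    assert math.isclose(sample_std(data), math.sqrt(32 / 7))

    # Identical values have no spread.
    assert sample_std([3.0, 3.0, 3.0]) == 0.0

    # Too little data should fail loudly rather than return a number.
    try:
        sample_std([1.0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a single value")
    return True


if __name__ == "__main__":
    run_checks()
    print("all checks passed")
```

Dividing by n instead of n - 1 is exactly the kind of simple mistake that still produces a plausible-looking number, and the first check would catch it.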

I would actually be surprised to see the open sourcing of research code become common practice. I think many inexperienced programmers who write code for research will be too embarrassed to release it where professional software developers can criticize their work. I blame this on the programming profession rather than the researchers: programmers are notorious for being outspoken and rude when commenting on amateur code. Another barrier is that code released to the public needs to be readable and understandable to others, not just to the programmer who wrote it. That preparation will add time to the already busy schedules of most researchers.

Unfortunately, I suspect this will be one of those great ideas that many support, but few practice.
