Part 1: The woes of the testbot

The intent of this series of posts is not to blame people, but rather to point out the testbot needs full-time attention. Integral to this story are the decisions and circumstances that led me to stop working on SimpleTest in core and the "testbot" which runs on qa.drupal.org. I intend to follow-up this post with others dealing with rejuvenation of the testbot and improvements to SimpleTest. I understand some will not agree with my position, but I would like everyone to understand my reasons and intentions, and how we find ourselves in the current state of affairs. After everything is out in the open, my hope is that a useful discussion will ensue and meaningful progress will result.

Factors

Four factors led me to stop working on SimpleTest in core and the testbot:

  • I no longer had gratuitious amounts of free time.
  • I now had a need to make a living (and working on the testbot does not generate any income).
  • The core development process being what it is led to burnout and lack of desire.
  • The request to stop working on the testbot in conjunction with the Drupal 7 code freeze.

With me out of the picture, it magnified the fact that noone else worked on the testbot and, going forward, noone stepped up to take my place.

Background

Lets start off with some background about my involvement with the Drupal testing story.

SimpleTest's journey to core

Rewind the clock back to early 2008. I had gotten involved in Drupal through GHOP and became maintainer of SimpleTest. I proceeded to perform a large-scale refactoring and cleaning up of SimpleTest. This, combined with other community efforts, resulted in SimpleTest being added to Drupal 7 core during the Paris Coding Sprint. The rapid pace at which I was able to develop SimpleTest quickly slowed as I no longer had the ability to commit changes nor make design decisions. Instead, even the most trivial changes took days or weeks to get committed. In spite of these additional challenges, I continued to diligently work on SimpleTest in core. To my dismay I discovered on multiple occasions that large changes were virtually impossible to push through the core queue, and I spent countless hours rerolling patches and refactoring code at various developers' whims. In the end, the patches simply died, but not for lack of quality or merit.

SimpleTest Transition to Core Commit Log
The chart shows 37 commits to the SimpleTest project before and after it was added to Drupal core. It is clear the pace of development slowed immediately and lessened further with time.

Changing course I focused on small changes to SimpleTest in core, but ran into similar throughput issues. For all intents and purposes, my ability to make contributions to SimpleTest had ground to a halt. This led me to write a blog post detailing the problem and possible solutions. I was not alone in my conclusions and many would still like to see the problem resolved. I continued to contribute to core now and then, but I was completely burned out. I even took month long breaks from Drupal as it literally burned me out to try to make any contribution to core. My burnout was not caused by overwork but was due to frustration with the exaggerated length of time to accomplish a minor commit.

Following up SimpleTest with the testbot

On a parallel track, getting SimpleTest into core turned out to be only half of the battle. Actually seeing the tests adopted and maintained remained a challenge. I led the charge to keep the tests in sync (initially doing it almost alone). The effort to create an automated system for running the tests had been underway for quite some time, but lacked the necessary volunteers and commitment to really get it off the ground. I was then asked to take over the project at which point I evaluated its status and decided to start over. I created PIFR, a plan for realizing the goal, and proceeded to rapidly make progress. Testing.drupal.org launched shortly afterward and testing became an integral part of the Drupal core workflow.

With a working system I then laid plans for a second iteration of the testbot with a number of improvements. After heavy development the second generation of the testing system was launched with a massively improved feature set.

Seeking sponsors

After graduation from high school I was no longer financially able to devote large portions of my time to the testing system or core development so I sought sponsors to enable me to continue my work. Acquia provided an internship that allowed me to focus on testing again. After successfully completing the internship I found a job with Examiner.com that allowed me to spend a portion of my time improving and maintaining the automated testing system and roll out the initial work for contributed project testing and a number of other improvements in ATS (PIFR and PIFT) 2.2. The contributed project testing with dependencies was labeled beta because it did not support specific versions and had known issues. The plan was to make a followup release to solve the issues.

Code freeze and the request to stop

After deploying PIFR 2.2, I was asked to stop making changes to the testbot to ensure stability of the testing system during the final stages of Drupal 7 development. I continued to make improvements that I planned to deploy once the freeze was lifted, but the short freeze turned into months and more months. This delay ultimately forced me to stop development before the codebase diverged too much from the active testbot.

PIFR and PIFT commit log
The chart shows my combined commit activity for PIFR and PIFT and indicates the dramatic slowdown that occurred as a result of the freeze placed on the testbot.

During this time I was the only person who worked on the testbot in any significant capacity (or virtually at all). My availability for working on testing dwindled when my time with Examiner ended. This, combined with the stagnation forced upon the testbot, meant things simply ceased moving forward. The complete stagnation is seen in the long period of time between the 2.2 release and the 2.3 release of PIFR on January 28, 2010 and March 28, 2011, respectively. During that entire period of more than a year no changes were made to the testbot. When changes were finally made, they were done merely out of necessity to accommodate the git migration.

Post-freeze undeployed features

Shortly after the 2.2 release I completed a number of improvements before things came to a stand-still. Some of the recent deployments have included functionality that I had completed, most notably:

  • Version control system abstraction and plugins for bazaar, cvs, git, and svn
  • Coder reviews in addition to testing
  • Beta support for contributed project testing with dependencies

Recent changes

As mentioned above, I had already abstracted the version control handling in the testbot and had four plugins (bazaar, cvs, git, and svn). Unfortunately, there were a number of assumptions that had to be made due to limitations with the project module's VCS integration. These assumptions had to be updated for the shiny new version control API. The changes required were very minor and did not represent any feature improvements, but were simply part of the changes necessary to complete the git migration. Randy Fay made the necessary changes and the testbot saw its first update in a very long time. A few small followups were released as part of the planned phasing out of the old patch format and such. It is interesting to note the other major components of the Drupal.org migration were contracted by the Drupal Association except the automated testing system.

Jeremy Thorson has recently been working on using the testbot's ability to perform coder reviews to help solve the woefully broken project application process which he describes in several blog posts. Again we see change coming to the testbot out of necessity rather than a focused plan for improvement. For those not aware of it, the project application queue has several hundred applications and it takes months to even receive a review. Jeremy has worked hard on improving the application process, at the heart of which is the ability to perform automated coder reviews. Providing automated reviews has been held back on multiple fronts not the least of which is finding people to get things done. This is a definite hurdle considering that only three people have every worked on the testbot code itself not to mention there is an average of less than one active maintainer at any give time.

As mentioned above, I had deployed the first stage of contributed project testing over a year ago, but was forced to shelve the follow-up deployments. The code to properly handle module dependencies fell into disarray with the git migration and required refactoring to work with the version control API. Derek Wright and I spent a lot of time hashing out the details to ensure things were properly abstracted for the project module. I completed the code, but it was never committed and thus was not maintained through the migration. Randy took it upon himself to update the code, but deviated from the agreed upon design. This choice meant the code would not be included in the project module and has a number of other ramifications. The feature was rebuilt in a drupal.org specific manner that precludes others from taking advantage of the code and eliminates the possibility of exposing the data through the update XML information. Exposing the data in that fashion would mean projects like drush, drush make, Aegir and others could discard code written to recreate this data or would now be able to support proper dependency handling. In addition, the recent deployment of dependency handling has led to large delays and instability in the testbot.

Conclusion

The decision to freeze the testbot in conjunction with the Drupal 7 code freeze made sense at the time. However, the extended freeze of the testbot (due to the extended Drupal 7 code freeze) along with moving SimpleTest into core had the unintended and disappointing side effect of causing the effective stagnation of the testing system. The only changes to the testbot in the past 20 months have been made out of necessity and annoyance (the git migration and the unfinished testbot integration with the project application process for new developers). During my tenure with Examiner.com, a fair number of changes were made to the testing system but not deployed on drupal.org. The module dependency code had been written over a year ago and finalized shortly thereafter but languished and was never deployed. Recently, some of these changes were finally deployed along with the git migration. All the while, I had set forth a detailed roadmap for the testing system.

The testing system had been stable and running for 3 years. Recent changes (implemented by others) have resulted in the ups and downs of the testing system. The importance of testing to Drupal development coupled with the recent instability strongly suggests the testing system requires full-time attention. The lack of feature changes since the 2.2 release of PIFR in January, 2010 is a direct result of a lack of financial testing resources, the lock-down of the testing system components, the burnout caused by extreme difficulty to make changes, and the extended freeze placed on the testbot.

Various solutions were tried to enable the continuation of work on the testbot. None represented a viable long-term solution. In the end, my father and I decided the solution was to establish a business to advance testing for the Drupal community and to create an environment where we no longer have our hands tied behind our back. In the next post, I will share the vision and passion we have for testing along with several features that could be made available to the community immediately.

Comments

Thanks for sharing your thoughts. The Drupal community needs to take testing seriously and it needs to be easier for the devoted people to get their patches into the code base.

I am curious for your new initiative ! Very good of you to bend this your way. Thank you for staying with us. Drupal needs people like you.

several hundred applications and it takes months to even receive a review

There are currently 250 in a "needs review" state, so depending on your definition of "several" that is true.
I don't think the claim that it takes "months" to receive a review is anywhere close to valid. In some cases it has taken weeks. But I think you'd be hard-pressed to find examples that sat for months with no review, and when you do they are usually complex modules that require external dependencies or specific knowledge to test.

Jthorson indicates on his blog it took months and I have seen plenty of people saying the same thing in IRC, but no doubt it is not always so.

To clarify, it is often months to get through the entire process ... up to a year in the case of some folks who didn't follow the 'migration to git' instructions posted on their applications. But this is different than the time to a 'first' review, which seemed to be averaging around 4 weeks when I applied; and the longest was 9 weeks the last time I performed a spot-check. Sorry if I was misleading. :)

@jthorson: Thanks for the clarification.

@others: Please keep in mind the primary point was things could be much better with automation and the only reason we don't have it is due to lack of resources. One of many, the next post will hit on this so more.

I suspect that you are always going to have a hard time finding maintainers for PIFR and simpletest, mainly because you are always going to find it hard to find someone willing to pay for testing, especially if you're not actually testing the finished product, but the framework. I.e. there's no way any of our clients would pay for us to help maintain PIFR so that their site built on Drupal was better. Maybe this is one case where the Drupal Association really does need to get involved financially, but I'd imagine that the DA has less money than everyone would like it too, and it might be too close to getting involved with development of core anyway.

Additionally, I wonder if PIFR is doomed to be unmaintainable, as most people doing testing won't want to write or use their own testing framework, they'll use something like Jenkins instead. If PIFR was a lightweight interface with some external tool, then you might find you'd get much more traction in the community, and many more people willing to help maintain the code you write. One of the main reasons that git was adopted by the Drupal community over other SCM tools was because the community seemed to go nuts whenever someone mentioned git. Maybe you need to find out how PIFR can become more useful for the Drupal community as a whole, and then re-work it light of that, and actually get somewhere other than d.o using it, and then you might find some maintainers.

Thank you for you hard work though, undoubtedly testing has made Drupal 7 a better release than ever before, and your efforts are highly valued by the community. I think we all want to see development of testing Drupal continue, and in a sustainable fashion.

That's a very good question.

Is anyone other than drupal.org using PIFR for automated testing?

If not, why not? Should PIFR be modified so it is appealing to those organizations so they have a reason to maintain it? Should drupal.org move to the system (Jenkins, or whatever) that those organizations use?

I hope all these questions and more will be answered in post #2 :)

Actually, there are a few companies a know of since they asked me to make specific changes. Examiner, as I mentioned in the post, hired me in part to work on PIFR since they are using it. The second post will provide more details on why Jenkins is not a good (or option at all) option for drupal.org.