Testbed design change and development

Please see clarification.

The next goal for the testing system (centered at testing.drupal.org) is to test Drupal 6, Drupal 6 contrib, and eventually Drupal 7 contrib. This will require:

  • Additional testing computers to be added to the current fleet of 4-6 servers which handles the Drupal 7 core load
  • Re-architecting the automated testing system as described below
  • The addition of new features described in my previous post, The Future of Automated Patch Testing

In order for me to devote significant resources in the short term to get this completed, it would be very helpful to have funding. I would like to raise $3,000 for the development of the new system.

Before describing the details of the new system, let's look at how the current system works.

Current system

System flow

The automated testing system has three stages, or sections, to it. The sections are illustrated in the figure to the right, but I'll give an overview of what they do.

  1. The project server that manages all the issues related to projects. - drupal.org
  2. The testing master server which distributes the testing load and aggregates the results to the appropriate project server. - http://testing.drupal.org
  3. The testing server that performs the patch review and reports back to the master server. - network of servers

New architecture

The current automated testing system is classified as a push architecture. That means that the testing master server pushes (i.e. sends) patches to each of the testing servers. The new architecture will be based on a client/server model where the test clients pull (i.e. request) patches to test from the server.
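As a rough illustration of the pull model, here is a minimal in-process sketch. It is not the actual t.d.o code: the names `PatchQueue`, `PullClient`, `request_patch`, and `report_result` are hypothetical, and the real system would exchange these calls over the network rather than as direct method calls.

```python
from collections import deque
from typing import Optional


class PatchQueue:
    """Hypothetical stand-in for the testing master server."""

    def __init__(self, patches):
        self._pending = deque(patches)
        self.results = {}

    def request_patch(self) -> Optional[str]:
        # A client calls this when it is free. The server never has to
        # check whether a client is busy or reachable.
        return self._pending.popleft() if self._pending else None

    def report_result(self, patch: str, passed: bool) -> None:
        self.results[patch] = passed


class PullClient:
    """Hypothetical testing client that asks for work instead of waiting."""

    def __init__(self, name: str, server: PatchQueue):
        self.name = name
        self.server = server

    def review(self, patch: str) -> bool:
        # Placeholder for applying the patch and running the test suite.
        return not patch.endswith(".broken.patch")

    def run_once(self) -> bool:
        """Pull one patch, review it, report back. False means no work."""
        patch = self.server.request_patch()
        if patch is None:
            return False
        self.server.report_result(patch, self.review(patch))
        return True
```

In this arrangement a new client only needs to start calling `request_patch`, so the server does not have to know about the client in advance.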

In addition to changing the basic architecture of the system the large feature list will also be implemented. After all is said and done the new system will provide a more powerful tool to the Drupal community.

Reasoning

The change is motivated by the difficulties we have run into while maintaining the system over the last few weeks.

  • Adding a testing server requires the entire system to be put on "hold".
  • There is not an automated way to confirm that a testing server is functioning properly.
  • When a testing server becomes "messed up" or "un-reachable" the system does not automatically react.
  • Server administrators cannot simply remove their server from the fleet; they must first inform us that they are doing so.

All this manual intervention adds up to a large amount of time that Chad "hunmonk" Phillips and I have to spend to keep the system running smoothly. As we look to add more servers our time commitment will increase to the point where it is not feasible to maintain the system, thus the changes are required.
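To make the pitfalls above concrete, here is one way a pull-based server could react automatically when a client goes silent. This is only a sketch under assumptions: the `ClientRegistry` class, its method names, and the timeout policy are all hypothetical and are not taken from the existing code base.

```python
import time


class ClientRegistry:
    """Hypothetical server-side bookkeeping for pull-based clients."""

    def __init__(self, timeout_seconds: float, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock      # injectable so the sketch can be tested
        self.pending = []       # patches waiting for a client
        self.assigned = {}      # client name -> patch being reviewed
        self.last_seen = {}     # client name -> time of last contact

    def add_patch(self, patch: str) -> None:
        self.pending.append(patch)

    def request_patch(self, client: str):
        # Every request doubles as a heartbeat from the client.
        self.last_seen[client] = self.clock()
        if not self.pending:
            return None
        patch = self.pending.pop(0)
        self.assigned[client] = patch
        return patch

    def report_result(self, client: str) -> None:
        self.last_seen[client] = self.clock()
        self.assigned.pop(client, None)

    def reap_silent_clients(self):
        """Forget clients not heard from within the timeout and put any
        patch they were holding back at the front of the queue."""
        now = self.clock()
        dropped = []
        for client, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                patch = self.assigned.pop(client, None)
                if patch is not None:
                    self.pending.insert(0, patch)
                del self.last_seen[client]
                dropped.append(client)
        return dropped
```

With bookkeeping like this, a server administrator could remove a machine without notifying anyone: the client simply stops asking for patches and any patch it held is handed to another client.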

Comments

I understand each of the pitfalls, but I only understand how changing to a pull architecture would affect the last one.

A pull architecture fixes a few issues more subtly than directly. Having the testing clients request patches, rather than forcing t.d.o to check whether the clients are free, eliminates timeout issues and things of that nature. It also opens up the future for things like BOINC.

The new architecture will also fix the put on "hold" issue, as a client can request patches to test by itself instead of forcing t.d.o to wait until the server is confirmed to resume operations. That will also benefit from automated client checking.

This post is a follow up to the previous post, meaning that this is one of the things that will be changed in addition to the others.

I'm wondering if there are existing projects that we can leverage? It might be easier to take Buildbot or Tinderbox as the base environment and to write Drupal-specific scripts for it. For example, at Acquia, we use Buildbot to run Simpletests on a range of different machines/configurations. I'm not saying Buildbot is the right tool for drupal.org, but I'd be interested to learn what existing solutions you have evaluated.

Simpletestauto was the original code base that was developed over a two year period. It lost momentum and I came on the project to get it working.

I based my work on the previous code base and the PHP infrastructure. The code I developed is significantly different and allows for a distributed system, but because of the existing work not much time was spent evaluating other solutions.

Currently we have a working solution that does most of the things a tool like Buildbot would do for us. The extra features I am describing would need to be built for Buildbot or any other tool so at this point it doesn't make a whole lot of sense to switch.

I haven't looked at your code yet, nor have I compared it with tools like Buildbot. But it might make a lot of sense to switch to Buildbot, if Buildbot is being maintained by more than 1 person. What happens the day you quit working on Drupal? As we've learned, maintaining infrastructure shouldn't be our core business. We should focus on the Drupal-specific bits.

Today I took some time to look at what BuildBot offers. From my evaluation I still believe developing our own framework is the best choice.

Reasons:

  • Much of the low level client/server handling is done by the xml-rpc implementation built into Drupal.
  • The BuildBot test results are organized by date and server environment, which is the opposite of what we are looking for. The t.d.o results are displayed by patch and will eventually be by database. The priority of Drupal test results is the test statuses, whereas the priority of BuildBot is date and server.
  • The automated test client confirmation code will have to be built on top of BuildBot anyway.
  • Patches need to be applied to the Drupal core being tested which I'm sure is possible with BuildBot, but not standard.
  • Eventually contrib modules will need to be installed into a fresh HEAD to test contrib, which also needs to be added on top.

The primary things that BuildBot offers don't apply directly to our situation. It would require a fair amount of modification to make it work. Since the primary code, the low level client/server code, is already handled by xml-rpc, BuildBot loses a number of its advantages. I think it would work much better for testing a vanilla HEAD checkout on different environments, as Acquia does, rather than testing a patched HEAD and prioritizing test results.

Although BuildBot can, as Acquia has shown, work for testing Drupal core, I don't think it provides a very Drupal-tailored system without significant modification. Based on that I think developing our own, as we did with the testing framework, is the best route.
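The difference in how results are organized can be shown with a small sketch. The record fields (`patch`, `database`, `status`) and both function names are hypothetical; they only illustrate indexing results by patch first, then by database, instead of by date and server.

```python
from collections import defaultdict


def group_by_patch(records):
    """Index flat test records by patch, then by database."""
    by_patch = defaultdict(dict)
    for record in records:
        by_patch[record["patch"]][record["database"]] = record["status"]
    return dict(by_patch)


def failing_patches(by_patch):
    """Return patches with at least one non-passing status, sorted by name."""
    return sorted(
        patch
        for patch, statuses in by_patch.items()
        if any(status != "pass" for status in statuses.values())
    )


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"patch": "node-123.patch", "database": "mysql", "status": "pass"},
        {"patch": "node-123.patch", "database": "pgsql", "status": "fail"},
    ]
    print(group_by_patch(sample))
```

A view keyed by patch answers "is this patch ready?" directly, which is the question an issue queue needs answered.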

Here's how I would set this up

(I commented this on a different post, so re-commenting here with a few edits)

1. Bring back simpletest from simpletest.org
2. Identify true unit tests and separate them from system tests
3. Port unit tests to use simpletest.org code
4. Install a continuous integration software system like buildbot
- write a custom trigger for when a build should take place
- configure it to accept patches on a port sent from d.o.
- configure it to queue builds to a finite number of slaves
- configure it to notify folks via IRC, email
- publish read-only web browser status
- write a custom notifier to update d.o on patch status

5. Configure buildbot to run all unit tests on each patch
6. Configure buildbot to run system tests several times a day based on what's already been committed to CVS. Any failures can be attributed to any/all folks that committed in the time window
7. Allow core maintainers to try any patch against full system tests at their will.

Notes:

Yes, all test slaves should be virtual machines. Yes, because it's 1000 times easier, but also because patches may do unintentional bad things and you wouldn't want anyone volunteering a bare metal machine and have it get destroyed.

Writing your own system will give you exactly what you want, it's just that you end up reinventing a lot of things. Leveraging other open source software is sometimes a compromise of functionality, but you gain so much more in the way of documentation, bug fixes, new features, and allowing others to collaborate. Maybe writing your own solution is better, but hopefully you can understand why folks are asking questions. Maybe you're on the road to developing the next great continuous integration project for other projects.

I'm not 100% sure what the simpletest.org stuff has to do with this, as we decided that unit testing core was virtually impossible and therefore would be postponed until the APIs were written in a way that lends itself to unit testing. Regardless, that is a testing system issue, not part of the automated testing framework.

The issue I see is that even after doing everything you described, which is the majority of what I have written anyway, I still end up with results that are not based on patches, but rather on dates and environments (also not very pretty :( ). I'm not opposed to using a pre-built system, but if I end up having to customize and override so much, is it that useful? Maybe I'm missing something, but my current code base isn't that large and does a number of things that a basic testing system does not. In addition to that, a number of the features I plan to add are out of the scope of something like BuildBot.

At this point I still do not see a real advantage in switching to another system due to the enormous amount of configuration required, especially considering that 90+% of the code I have written would make up those configuration files, and this way I end up with a more beautiful Drupal based solution. I think such tools work great for confirming that code compiles on different environments after committing, but something like patch testing requires too much modification/configuration/overriding.

A final note: everything you described gets us to where we are now (with the addition of numerous scripts to clean up the results into something presentable, and tweaking). That doesn't add all the additional features that started this whole process anyway. From what I've seen, read, and pondered, we'll end up with more configuration/custom scripts than with a pure Drupal solution.