qa.drupal.org

Acquia internship

Acquia logo

This summer I will be working part-time as an intern for Acquia. I am very excited to be working with Acquia and to have the chance to spend more time improving things I am interested in. To clarify, I will be working on projects that benefit the entire Drupal community; the items I will be working on are improvements to projects I have either started or am heavily involved with.

During the discussion of the internship I came up with the following goals that were then prioritized by Dries.

Primary goals

  • Finalize testing of contributed modules and Drupal 6.x projects/core.
  • Add executive summary of test results on project page.
  • Extend the SimpleTest framework so we can test the installer and update/upgrade system.
  • Improve and organize SimpleTest documentation.
  • Work on general enhancement of Drupal 7 SimpleTest.

Secondary goals

  • Provide on-demand patch testing environment for human review of patches.
  • Finish refactoring of SimpleTest to allow for a clean implementation of "configuration" testing.
  • Analyze current test quality and code coverage, and foster work in areas requiring attention.

I will post updates on some of the more interesting items as they are accomplished. Additionally, I would like to give a special thanks to Kieran Lal for his mentoring and help in finding me sponsorship.

FOLLOW UP:
To clarify, I will still be participating in Google Summer of Code 2009, which was made explicit in my agreement with Acquia.
Follow up post by Dries.

Automated Testing System - Statistics

I decided to gather some statistics about the automated testing system. These statistics were collected on Wednesday, May 6, 2009 at 4:00 AM GMT. Automatic generation of these statistics, along with analysis, is a feature I have in mind for ATS 2.0. I appreciate donations to the chipin (right), as this project requires a lot of development time.

From the data you can see that the test slaves have been running tests for the equivalent of 200 days. The system itself has only been running for 192 days, and not all of the data was included since some of it is inaccurate. That means the system has saved roughly 200 days of developers' time! It is clear that the ATS is a vital part of test-driven development. Additionally, the time that would have been spent fixing regressions and new bugs has been drastically reduced.

Item                         Function     Value
Time testing                 SUM          17,310,047 seconds (~288,500 minutes, ~200 days)
Test run (test suite)        COUNT        42,351
                             MAX          3,620 seconds (~60 minutes)
                             MIN          17 seconds
                             AVG          804 seconds (~13 minutes)
                             STDDEV_POP   783 seconds (~13 minutes)
Test (patch, times tested)   COUNT        6,953
                             MAX          86
                             MIN          1
                             AVG          10
                             STDDEV_POP   15
Test pass count              MAX          11,453
                             MIN          0
                             AVG          4,265
                             STDDEV_POP   4,910
Test fail count              MAX          6,989
                             MIN          0
                             AVG          9
                             STDDEV_POP   155
Test exception count         MAX          813,795
                             MIN          0
                             AVG          160
                             STDDEV_POP   9,893

One item you may notice is the maximum test exception count of 813,795. The patch that caused that many exceptions proved that our system is scalable! The patch is much appreciated. :)
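Since automatic generation of these statistics is one of the features planned for ATS 2.0, here is a minimal sketch of how the aggregates in the table above could be produced, assuming the per-run durations are pulled from the testing database as a simple list of seconds (the example values are made up):

    from statistics import mean, pstdev

    def summarize(values):
        """Compute the SUM/COUNT/MAX/MIN/AVG/STDDEV_POP aggregates for one metric."""
        return {
            'SUM': sum(values),
            'COUNT': len(values),
            'MAX': max(values),
            'MIN': min(values),
            'AVG': round(mean(values)),
            'STDDEV_POP': round(pstdev(values)),  # population standard deviation
        }

    # Example with made-up run durations in seconds.
    print(summarize([17, 350, 804, 1200, 3620]))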

I have also saved the current test result breakdown, shown below.

Result distribution

The average test run length for each of the active test slaves can be seen below. This data only looks at the latest test run for each patch in the system.

Test slave   Average test length*
4            730 seconds
5            1,753 seconds
7            1,352 seconds
8            576 seconds
9            2,438 seconds
10           1,942 seconds
12           1,161 seconds
16           217 seconds

* Excludes test runs that do not pass initial checks and fail before running the test suite.

Average Test Length

Automated Testing System 2.0 - New Features - Part 1

Over the next few weeks I plan to make a number of posts about the new features provided by ATS 2.0 and the benefits to the community. Currently, the system is in the final stages of deployment, but is not yet active. Please be aware that these features will be available once ATS 2.0 has been deployed. I appreciate donations to the chipin (right), as this project requires a lot of development time.

Server management

One of the major constraints holding back the expansion of the system has been the need to manually oversee the array of testing servers. The new system contains a number of enhancements that not only make it easier to manage the network, but also automate the task of adding new clients.

Client enable process
Upon a request to enable a client, a set of error cases with expected results is sent to the client.

The most important addition that makes all this possible is automatic client testing. Clients are automatically tested to ensure they are functioning properly. This is done through a set of test cases, each with an expected result, that is sent to each test client. The results the client sends back are compared with the expected results, and that comparison is used to determine whether the client is functioning properly. Clients are re-tested on a regular basis to ensure that they continue functioning as expected.
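As a rough sketch of that comparison (the case data and the send_to_client transport are hypothetical placeholders, not the actual PIFR code), the health check amounts to:

    # Error cases with known expected results; the real cases live on the server.
    KNOWN_CASES = {
        'passing.patch': 'pass',
        'syntax-error.patch': 'fail',
        'missing-file.patch': 'exception',
    }

    def client_is_healthy(send_to_client):
        """Return True only if the client reproduces every expected result."""
        for case, expected in KNOWN_CASES.items():
            actual = send_to_client(case)  # run one known case on the client
            if actual != expected:
                return False               # any mismatch marks the client as broken
        return True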

Another helpful change has been re-working the underlying architecture to use a pull-based protocol instead of a push-based protocol. This alleviates the issues caused when a client is unreachable for a period of time or is removed without notice.
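The pull model is easiest to see from the client's side: it periodically asks the server for work and reports back, so an unreachable or retired client simply stops asking. The URL and JSON fields below are illustrative placeholders rather than the real PIFR protocol:

    import json
    import time
    import urllib.request

    MASTER = 'http://testing.example.org'  # hypothetical master server

    def poll_once(client_key, run_tests, report):
        """Ask the master for one job; run and report it if anything was queued."""
        url = f'{MASTER}/next-patch?key={client_key}'
        with urllib.request.urlopen(url) as response:
            job = json.load(response)
        if job.get('patch'):
            report(job['id'], run_tests(job['patch']))
            return True
        return False

    def poll_forever(client_key, run_tests, report, idle_delay=60):
        while True:
            if not poll_once(client_key, run_tests, report):
                time.sleep(idle_delay)  # queue empty; wait before asking again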

Public server queue

Add client
Very simple screen that allows users to add a test client to the network.

Another improvement that will facilitate a larger testing network is the public server queue. Allowing anyone to add a server to the network is possible because clients are automatically tested as described above.

The interface has been designed so that users may control the set of machines that they have added to the network. The system automatically assigns the client a key that must be stored on the client and is used for authentication. The process of adding a client to the master list is very simple and should provide an easy way for users to donate servers.
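One way such a key could work (the names and signing scheme here are illustrative, not the deployed implementation) is for the master to issue a random token when the client is added and then verify everything the client submits against it:

    import hashlib
    import hmac
    import secrets

    def issue_client_key():
        """Master side: generate the key handed to a newly added test client."""
        return secrets.token_hex(32)

    def sign_payload(client_key, payload):
        """Client side: sign a result payload (bytes) with the assigned key."""
        return hmac.new(client_key.encode(), payload, hashlib.sha256).hexdigest()

    def verify_payload(client_key, payload, signature):
        """Master side: accept the payload only if the signature matches."""
        return hmac.compare_digest(sign_payload(client_key, payload), signature)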

If the system detects any issues with the client down the line, such as it becoming out of date, it will notify the server administrator of the problem and disable the test client. The system will continually re-test the client and re-enable it automatically if it passes inspection. Alternatively, the server administrator may request that the client be re-tested immediately after fixing the issue.

Multiple database support

The new system has been abstracted to allow for the support of PostgreSQL and SQLite in addition to MySQL. This is vital to ensure that Drupal 7 properly supports all three databases. Just as patches are not committed until they pass all the tests, patches will not be committed until they pass all the tests on all three databases (five environments covering the database variations).
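The commit rule can be summarized in a few lines: a patch only becomes committable once every environment reports a pass. The environment list below is a placeholder, since the exact five environments are not spelled out here:

    ENVIRONMENTS = ['mysql', 'postgresql', 'sqlite']  # placeholder set of environments

    def ready_to_commit(results):
        """results maps an environment name to 'pass', 'fail', or 'exception'."""
        return all(results.get(env) == 'pass' for env in ENVIRONMENTS)

    # A patch that fails on any one database is not committable.
    print(ready_to_commit({'mysql': 'pass', 'postgresql': 'fail', 'sqlite': 'pass'}))  # False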

Automated Testing System 2.0 - Final Steps

During the last several months I put a substantial amount of work into improving the Automated Testing System. Future posts will describe the exciting new features and the benefits to the community. If interested, a brief overview of some of the requirements can be found in the PIFR and PIFT issue queues.

For additional background, the original thoughts can be found in my earlier posts, The Future of Automated Patch Testing and Testbed design change and development.

Final steps
There are a number of steps that need to be completed before the Drupal community can reap the benefits of the new system.

  1. Security review of rewritten Project Issue File Test (PIFT) module that integrates with the project module on drupal.org.
  2. Someone familiar with SQLite, possibly one of the D7 maintainers, needs to write a PIFR DB driver that implements the required methods. MySQL and PostgreSQL have already been completed and can be used as examples. The driver is relatively simple, but will require manually connecting to SQLite since PIFR runs on Drupal 6, which does not support SQLite.
  3. Update testing client setup/installation script where necessary.
  4. Deploy the current development system to project.drupal.org and create a parallel testing client network.
  5. Freeze the current test client network and extract the test ID map for use in drupal.org upgrade.
  6. Upgrade and finalize test client network and test server (testing.drupal.org). Possibly move testing.drupal.org under the drupal.org infrastructure.
  7. Confirm upgraded testing network is functional.
  8. Plan for approximately 15 minutes of downtime on drupal.org.
  9. Update the PIFT code and run the data update on drupal.org during the downtime, using the test ID map extracted in step 5.
  10. Watch deployed system closely and solicit community feedback and bug reports.
  11. Request additional hardware to use as community test clients (to allow for future expansion into testing contributed modules).

Future
Once the second-generation framework is in place and running smoothly I will begin work on finishing the last pieces required to allow for testing of contributed modules (D6 and D7) and Drupal 6 core. I will be writing more on the new features and UX improvements to look for during the upcoming deployment.

Drupalcon DC - Automated testing - Saving webchick time - the saga

I will be presenting Saving webchick time - the saga along with Kieran Lal at Drupalcon DC 2009. To quote the session abstract:

One of the major enhancements made to the Drupal development cycle has been the addition of a fully automated testing bot, built on the testing framework in Drupal 7.

This session will focus on automated testing as it relates to Drupal 7: its history and direction; the automated testing bot, what has gone into it, and where the future leads; and, most importantly, what the end gain is for the Drupal community.

The framework has gone through a rather long and interesting history, with a number of road-blocks and challenges that have been overcome. The session will tell the story of the framework and the benefits it provides to the community through the enhanced Drupal 7 development work-flow. Although the session will go into some technical details, it will primarily focus on that story and where we plan to take the framework. The presentation should be interesting to most and provide a great time to throw out any comments or concerns.

After the presentation I will stick around to discuss our recent launch of the Boombatower Testing Service (BTS). If you have any questions, comments, or concerns please drop by.

Automated Testing System Development Clarification

After my most recent post concerning the Testbed design change and development I received a number of comments that led me to believe that the post was misunderstood. The comments suggest that readers believe my primary focus of development will be changing from a push to a pull model. The post seems to have been misleading in that regard.

The change to a pull architecture is a very minor change in terms of coding (only two functions are even affected), but in terms of managing the network it helps a lot. The majority of the development, which I am raising $3,000 for, will be focused on implementing the ideas described in The Future of Automated Patch Testing. The feature additions will:

  • Give more control to server administrators (donated servers)
  • Make it easier to manage the automated testing framework
  • Allow for testing of multiple environment configurations
  • Open up testing of Drupal 6 core and contrib code.

I hope this has cleared up any misconceptions about the automated testing system as it stands and my plans for its future.

I appreciate any donations.

Testbed design change and development

Please see clarification.

The next goal for the testing system (centered at testing.drupal.org) is to test Drupal 6, Drupal 6 contrib, and eventually Drupal 7 contrib. This will require:

  • Additional testing computers to be added to the current fleet of 4-6 servers, which handles the Drupal 7 core load
  • Re-architecting the automated testing system as described below
  • The addition of new features described in my previous post, The Future of Automated Patch Testing

In order for me to be able to devote significant resources in the short term to get this completed, it would be very helpful to have funding. I would like to raise $3,000 for the development of the new system.

Before describing the details of the new system, let's look at how the current system works.

Current system

System flow

The automated testing system has three stages, or sections, to it. The sections are illustrated in the figure to the right, but I'll give an overview of what they do.

  1. The project server that manages all the issues related to projects. - drupal.org
  2. The testing master server which distributes the testing load and reports the aggregated results to the appropriate project server. - http://testing.drupal.org
  3. The testing server that performs the patch review and reports back to the master server. - network of servers

New architecture

The current automated testing system is classified as a push architecture: the testing master server pushes (i.e. sends) patches to each of the testing servers. The new architecture will be based on a client/server model where the test clients pull (i.e. request) patches to test from the server.

In addition to changing the basic architecture of the system, the large feature list will also be implemented. When all is said and done, the new system will provide a more powerful tool to the Drupal community.

Reasoning

The motivation for making the change comes from the difficulties we have run into while maintaining the system over the last few weeks.

  • Adding a testing server requires the entire system to be put on "hold".
  • There is not an automated way to confirm that a testing server is functioning properly.
  • When a testing server becomes "messed up" or "un-reachable" the system does not automatically react.
  • Server administrators cannot simply remove their server from the fleet; they must first inform us that they are doing so.

All this manual intervention adds up to a large amount of time that Chad "hunmonk" Phillips and I have to spend to keep the system running smoothly. As we look to add more servers, our time commitment will increase to the point where maintaining the system is no longer feasible; thus the changes are required.

The Future of Automated Patch Testing

Over two years of development have led to testing.drupal.org becoming a reality. The testbed has been active for almost two months with virtually no issues related to the automated testing framework. That is not to say the bot has not needed to be disabled, but rather that the issues were unrelated to the automated testing framework itself.

The stability of the framework has led to the addition of more than six testing servers to the network, with more in the works. Increasing the number of testing servers also means an increased load required to manage the testing network. A fair amount of labor is needed to keep the testing system running smoothly and to watch for Drupal 7 core bug introductions.

Having the testing framework in place has saved patch reviewers, specifically the Drupal 7 core maintainers Dries and webchick, countless hours that would otherwise have been spent running tests. The automated framework also ensures that tests are run against every patch before it is committed. Ensuring tests are always run has led to a relatively stable Drupal 7 core that receives updates frequently.

Based on the overwhelmingly positive feedback and my personal evaluation of the framework, a number of improvements have been made since its deployment. The enhancements have, of course, led to further ideas that will make the automated testing system much more powerful than it already is.

Server management

A number of additions have been made to make it easier to manage the testing fleet. The major issue that remains is that of confirming that a testing slave returns the proper results. The process currently requires manual testing and confirmation of the results. The process could be streamlined by adding a facility that would send a number of patches with known results to the testing slave in question. The results returned would then be compared to the expected results.

The system could be used at multiple stages during the testing process. Initially, when a testing slave is added to the network, it would be automatically confirmed or denied. Once confirmed, the server would begin receiving patches for testing. Periodically the server would then be re-tested to ensure that it is still functioning. If not, the system would automatically disable the server and notify the related server administrator.

Having this sort of functionality opens up some exciting possibilities, as described below.

Public server queue

Once the automated server management framework is in place the process of adding servers to the fleet could be exposed to the public. A server donor could create an account on testing.drupal.org, enter their server information, and check back for the results of the automated testing. If errors occur the server administrator would be given common solutions in addition to the debugging facilities already provided by the system.

If the server passes inspection an administrator of testing.drupal.org would be notified. The administrator would then confirm the server donation and add the server to the fleet. Once in the fleet the server would be tested regularly like all the other servers. If the donor wishes to remove their server from the fleet they would request removal of their server on testing.drupal.org. The framework would let any tests finish running and then remove the server automatically.

A system of this kind would provide a powerful and easy way to increase the testing fleet with minimal burden to the testing.drupal.org administrators. Having a larger fleet has a number of benefits that will be discussed further.

Automated core error detection

Automatically testing an unpatched Drupal HEAD checkout after each commit and confirming that all tests pass would ensure that any human mistakes made during the commit process do not cause false test results. In addition to testing core itself, the detection algorithm would also disable the testing framework if drupal.org is unavailable. Currently, when drupal.org goes down, the testing framework continues to run, which causes errors because the patches are not accessible. Having this sort of system in place would be a great time saver for administrators and would ensure that the results are always accurate.

There is currently code in place for this, but it needs expanding and testing.
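As a rough illustration of the idea (this is not the existing code, and the helper names are hypothetical), the check before dispatching patches would look something like:

    import urllib.error
    import urllib.request

    def drupal_org_reachable(url='http://drupal.org', timeout=10):
        """Skip patch testing entirely while drupal.org cannot be reached."""
        try:
            urllib.request.urlopen(url, timeout=timeout)
            return True
        except (urllib.error.URLError, OSError):
            return False

    def should_run_patch_queue(test_unpatched_head):
        """test_unpatched_head() is a hypothetical callable that tests a clean HEAD checkout."""
        if not drupal_org_reachable():
            return False                        # patches cannot be fetched; pause the queue
        return test_unpatched_head() == 'pass'  # a broken HEAD would blame innocent patches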

Multiple database testing

Drupal 7 currently supports three databases and there are plans to support more. Testing patches on each of the databases is crucial to ensure that no database is neglected. Creating such a system would require a few minimal changes to the testing framework: store results related to a particular database, send a patch to be tested on each particular database, and display the results in a logical manner on drupal.org.

Patch reviewing environment

In addition to performing patch testing, the framework could also be used to lower the barrier to reviewing a patch. Instead of having to apply a patch to a local development environment, a reviewer would simply press a button on testing.drupal.org, after which he/she would be logged into an automatically set-up environment with the patch applied.

This sort of system would save reviewers time and would make it much easier for non-developers to review patches, especially for usability issues.
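In outline, the one-click environment amounts to checking out core, applying the patch, installing a throwaway site, and handing the reviewer a login link. All commands, scripts, and URLs below are illustrative placeholders rather than the real testing.drupal.org setup:

    import subprocess

    def build_review_site(patch_url, site_dir):
        """Stand up a disposable review site with the patch applied (placeholder steps)."""
        subprocess.run(['./checkout-core.sh', site_dir], check=True)           # hypothetical core checkout
        subprocess.run(['./apply-patch.sh', patch_url, site_dir], check=True)  # hypothetical patch step
        subprocess.run(['./install-site.sh', site_dir], check=True)           # hypothetical site installer
        return f'http://review.example.org/{site_dir}/one-time-login'         # placeholder login link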

Code coverage reports

Drupal 7 strives to have full test coverage, meaning that the tests check almost every part of Drupal core to ensure that everything works as intended. It is rather difficult to gauge the degree to which core is covered without the use of a code coverage reporting utility. Setting up a utility of that kind is no small task, and getting results requires large amounts of CPU time.

The testing framework could be extended to automatically provide code coverage reports on a nightly basis. The reports can then be used, as they have been already, to come up with a plan for writing additional tests to fill the gaps.

Performance ranking

Since the tests are very CPU intensive, having a good idea of the performance of a particular testing slave would be useful for deciding which servers are sent patches first. Ensuring that patches are always sent to the fastest available testing server will provide the quickest turn-around of results. The testing framework could automatically collect performance data and use an average to rank the testing servers.
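A small sketch of the ranking (the data structures are illustrative only): keep recent run times per test slave and order the slaves by their average, fastest first:

    from statistics import mean

    def rank_servers(recent_times):
        """recent_times maps a server id to a list of its recent run durations in seconds."""
        return sorted(recent_times, key=lambda server: mean(recent_times[server]))

    # Example using a few of the per-slave averages from the statistics post.
    print(rank_servers({4: [730], 5: [1753], 8: [576], 16: [217]}))  # [16, 8, 4, 5]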

Standard VM

Creating a standard virtual machine would have a number of benefits: 1) eliminate most configuration issues, 2) provide consistent results, 3) make the process of setting up a testing slave easier, and 4) make it possible for one testing server to test patches on different databases. Several virtual machines are currently in the works, but a standard one has yet to be agreed upon.

Benefits

Drupal is somewhat unique in having an automated system like the one in place. The system has already proven to be a beneficial tool and with the addition of these enhancements it will become a more integral part of Drupal development and quality assurance. Maintaining the system will be much easier, reviewing core patches will be simpler, and the testing fleet can be increased in size much more easily.

With a larger testing fleet the testing framework can be expanded to test contributed modules. In addition, the framework can be modified to support testing of Drupal 6 and thus enable it to test the large number of contributed modules that have tests written. Having such a powerful tool available to contrib developers will hopefully motivate more developers to write tests for their modules and, in doing so, increase the stability of the contributed code base.

The automated testing framework is just beginning its life-cycle and has already proven its worth; with enhancements like the ones discussed above, the framework can continue to provide new tools to the Drupal community.

Keep core clean: issues will be marked CNW by testing framework

The final updates to the testing framework have been placed on drupal.org. The updates allow issues to be marked as code needs work (CNW) when the latest eligible patch fails testing.

After a bit of work fixing the latest bugs in core in order to make all tests pass, the testing system has been reset and should now be reporting back to drupal.org again, but this time marking issues as CNW!

Current Drupal 7 issue queue status:

  • 267 Pending bugs (D7)
  • 241 Critical issues (D7)
  • 1,116 Patch queue (D7)
  • 388 Patches to review (D7)

I hope to see some changes.

On the other side, this means that if a bug that causes the tests to fail is committed to core, all patches tested will be marked as CNW.

Make sure you wait for testing results on a patch before you mark it as ready to be committed.

Testing results on drupal.org!

Today testing.drupal.org began sending results back to drupal.org! The results are displayed under the file they relate to. The reporting does not yet trigger a change in issue status if testing fails, as is the future plan; that will be turned on when the automated testing framework has proven to be stable.

The following is an example of what a pass result looks like on drupal.org. (http://drupal.org/node/180379#comment-739254)
pass example

The following is an example of what a fail result looks like on drupal.org. (http://drupal.org/node/200185#comment-752032)
fail example

Testing.drupal.org also provides detailed stats on patch results and status. An example of patch result stats is as follows.
stats example

Early success
Before the actual reporting was turned on, the automated testing framework discovered two core issues: one was a partial commit that caused a PHP syntax error, and the other was a bug in the installer. I hope the results make patch reviewing much faster and easier.

Your help
Please report any test result anomalies in the PIFR queue.
