All software (libraries, scripts, applications, utilities) developed within the GO consortium should follow these principles
- 1 Automated Testing
- 2 Manual Testing
- 3 Software Lifecycle
- 4 Documentation
- all GO software must have an extensive test suite
- all cvs/svn commits must pass the test suite
- new capabilities must be accompanied by tests
- all releases absolutely must pass every test
  cd go-perl
  perl Makefile.PL
  make test
Note that the go-db-perl test suite requires the presence of a writeable database; if none is present, the database-dependent tests are skipped and the suite passes. This keeps simple CPAN installs working.
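The skip-when-no-database pattern can be sketched with Python's unittest (go-db-perl does this in Perl through its own configuration; the GO_TEST_DB environment variable below is an illustrative stand-in):

```python
import os
import unittest

# Hypothetical environment variable naming a writeable test database;
# go-db-perl uses its own Perl-side configuration for the same purpose.
DB_AVAILABLE = bool(os.environ.get("GO_TEST_DB"))

class LoaderTest(unittest.TestCase):
    @unittest.skipUnless(DB_AVAILABLE, "no writeable test database configured")
    def test_bulk_load(self):
        # A real test would load fixtures and query the database.
        self.assertTrue(DB_AVAILABLE)

    def test_parser_only(self):
        # Tests with no database dependency always run.
        self.assertEqual("GO:0008150".split(":")[0], "GO")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoaderTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Either way the suite reports success: with no database configured the loader test is skipped rather than failed, which is what makes a plain install's `make test` pass.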
We need to do more to ensure the tests are run regularly. See: this tracker item
This is somewhat ad hoc: go-db-perl tests isolated parts of database building, but not the whole pipeline.
There is a mini-pipeline test, which tests a full build with the first 10k lines from every gene-association file:
  cd go-dev/database/test
  make
AFAIK this is not run regularly.
Whilst there is a test suite for the pipeline software, we have relatively little in the way of checks on the contents of the built database. There may be content errors even if the build software is perfect (for example, due to an upstream content or processing error).
The go-prepare-release script does a minimal amount of checking as it progresses, using GO::Admin->guess_release_type. However, this check is not strong enough to halt the release; instead, failures are emailed to the central admin email account (is this checked?).
We need more checks. Failure of these checks should halt the release and force manual intervention. The checks would be executed via go-prepare-release, and could be implemented in Perl, as SQL views, or as a mixture of both.
The checks include, but are not limited to:
- Foreign Key Integrity
The new bulkloading script introduced database integrity errors.
These could be avoided altogether if we used InnoDB, which enforces foreign key constraints, rather than MyISAM, which does not.
TODO: determine feasibility
If we can't use InnoDB, then we can generate SQL views that check for integrity. Presumably there is a way to generate these automatically from the source schema.
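A minimal sketch of such generated integrity checks, using Python with an in-memory SQLite database as a stand-in for the MySQL schema (the foreign-key list here is an illustrative fragment, not the full schema):

```python
import sqlite3

# Illustrative fragment of the GO schema's foreign keys; in a real
# implementation these pairs would be read from the schema definition.
FOREIGN_KEYS = [
    ("gene_product", "type_id", "term", "id"),
    ("gene_product", "dbxref_id", "dbxref", "id"),
    ("association", "term_id", "term", "id"),
]

def orphan_count(conn, child, fk_col, parent, pk_col):
    """Count rows in `child` whose `fk_col` points at no row in `parent`."""
    sql = (f"SELECT COUNT(*) FROM {child} c "
           f"LEFT JOIN {parent} p ON c.{fk_col} = p.{pk_col} "
           f"WHERE c.{fk_col} IS NOT NULL AND p.{pk_col} IS NULL")
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE term (id INTEGER PRIMARY KEY);
    CREATE TABLE dbxref (id INTEGER PRIMARY KEY);
    CREATE TABLE gene_product (id INTEGER PRIMARY KEY, type_id INT, dbxref_id INT);
    CREATE TABLE association (id INTEGER PRIMARY KEY, term_id INT);
    INSERT INTO term VALUES (1);
    INSERT INTO dbxref VALUES (10);
    INSERT INTO gene_product VALUES (100, 1, 10);
    INSERT INTO association VALUES (1000, 99);  -- dangling term_id
""")

violations = {(child, col): orphan_count(conn, child, col, parent, pk)
              for child, col, parent, pk in FOREIGN_KEYS}
print(violations)
```

A release check would run each query against the freshly built database and halt the release if any count is non-zero.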
- Content checks
We can assign minimum values for various parameters and check them automatically:
- > 20k terms
- > n associations per reference genome
- > n sequences per reference genome
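A sketch of how such threshold checks might look; the values other than the 20k term count are placeholders for the unspecified "n" that the consortium would need to agree on:

```python
# Minimum-content thresholds; the per-genome numbers are placeholder
# values for the "n" above, not agreed figures.
THRESHOLDS = {
    "terms": 20_000,
    "associations_per_ref_genome": 1_000,  # placeholder
    "sequences_per_ref_genome": 500,       # placeholder
}

def check_content(counts, thresholds=THRESHOLDS):
    """Return the names of checks whose observed count falls below its minimum."""
    return [name for name, minimum in thresholds.items()
            if counts.get(name, 0) < minimum]

# Counts as they might come out of a freshly built database.
observed = {"terms": 24_117,
            "associations_per_ref_genome": 350,  # too low: halt the release
            "sequences_per_ref_genome": 900}
failures = check_content(observed)
print(failures)
```

go-prepare-release would treat a non-empty failure list as grounds to halt the release and force manual intervention.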
Gene products
All gene products MUST have the following attributes, and the values of these attributes must not be null:
- a valid type (gene_product.type_id = term.id)
- a valid species (gene_product.species_id = species.id)
- a valid dbxref (gene_product.dbxref_id = dbxref.id)
Associations
All associations MUST have the following attributes:
- a valid term (association.term_id = term.id; should we additionally check that the term itself is valid?)
- a valid gene product (association.gene_product_id = gene_product.id)
- a valid database (association.source_db_id = db.id)
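The rules above map mechanically onto SQL views, one per rule, where any row appearing in a view is a violation. A sketch of generating the view DDL in Python (the view names are illustrative):

```python
# Each rule: (view name, child table, fk column, parent table).
# The rules mirror the gene product and association checks listed above;
# the bad_* view naming convention is illustrative.
RULES = [
    ("bad_gene_product_type", "gene_product", "type_id", "term"),
    ("bad_gene_product_species", "gene_product", "species_id", "species"),
    ("bad_gene_product_dbxref", "gene_product", "dbxref_id", "dbxref"),
    ("bad_association_term", "association", "term_id", "term"),
    ("bad_association_gene_product", "association", "gene_product_id", "gene_product"),
    ("bad_association_source_db", "association", "source_db_id", "db"),
]

def view_sql(name, child, col, parent):
    # A row in the view is a violation: a NULL attribute or a dangling reference.
    return (f"CREATE VIEW {name} AS SELECT c.* FROM {child} c "
            f"LEFT JOIN {parent} p ON c.{col} = p.id "
            f"WHERE c.{col} IS NULL OR p.id IS NULL;")

statements = [view_sql(*rule) for rule in RULES]
for stmt in statements:
    print(stmt)
```

The release check then reduces to `SELECT COUNT(*)` on each view, halting if any count is non-zero.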
Derived files and release pipeline
There are a number of files checked into GO CVS which are derived. AFAIK there are no automated tests for these. Automated tests are difficult, since the scripts by nature modify the publicly available CVS.
Some of these scripts use obo2obo. TODO: all such pipeline calls must be accompanied by a JUnit test in OBO-Edit, e.g. for conversion from OBO format 1.2 to 1.0.
There is currently no automated testing for the UI. This is quite difficult to do.
Web apps are slightly easier to test automatically than standalone apps. We still need manual testing to see whether everything looks OK, but we can at least do link checking and flow-of-control checking.
- TODO - Seth, fill in hammer details here.
For example, any linkouts to AmiGO should be added to the test suite, with a check that they return valid HTML containing the relevant information.
Automated testing forms the first line of defense. Manual testing is still required, particularly for end-user applications.
Manual and automated testing can be intertwined. For example, a link checker can pre-load a batch of URLs into a single web page, making it much easier for testers to look them over.
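A sketch of that pre-loading step, generating a single review page from a list of linkout URLs (the URLs here are hypothetical):

```python
import html

# Hypothetical AmiGO linkouts to spot-check; real URLs would come from
# the gene-association files or from the link-checking pass.
urls = [
    "http://amigo.example.org/term_details?term=GO:0008150",
    "http://amigo.example.org/gp-details?gp=FB:FBgn0000490",
]

def review_page(urls):
    """Build one page of links so a manual tester can eyeball every target."""
    items = "\n".join(
        f'<li><a href="{html.escape(u)}">{html.escape(u)}</a></li>' for u in urls)
    return f"<html><body><ul>\n{items}\n</ul></body></html>"

page = review_page(urls)
print(page)
```

The automated pass filters out dead links first; the generated page is what the human tester actually reviews.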
A discussion of what to test for, and how, can be found at:
- AmiGO: manual tests
- OBO-Edit: manual tests
- TODO link
- All software should be accompanied by a bug/request tracker (see Software_and_Utilities#Trackers)
- TODO - write up. Feature freezes. Beta releases. Version numbers.
- Perl modules should have POD documentation; this will show up when released on CPAN, for example. See e.g. the Graph module.
- Java code should have Javadoc, and this should be auto-published.
Code should obviously be commented.
Scripts should show usage information when called with no arguments or with -h.
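A minimal sketch of that convention (the script name go-filter-assocs is hypothetical):

```python
# Hypothetical script name; the pattern applies to any GO script.
USAGE = "Usage: go-filter-assocs [-h] FILE...\nFilter gene-association files."

def main(argv):
    # No arguments, or -h anywhere: print usage and stop.
    if not argv or "-h" in argv:
        print(USAGE)
        # Asking for help is not an error; being called bare is.
        return 0 if "-h" in argv else 1
    for path in argv:
        print(f"processing {path} ...")
    return 0

status_noargs = main([])      # prints usage, non-zero exit status
status_help = main(["-h"])    # prints usage, zero exit status
```

A real script would pass sys.argv[1:] to main and hand the return value to sys.exit, so callers and cron jobs can distinguish help output from misuse.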