User Story
As a developer, I want feedback on the releasability of the system as soon as possible, so that I can fix problems earlier and more easily, and so that I have a record of its behaviour over time.
Description / Background
Split from:
Currently the nightly tests start at 3am, and there are two versions of the run. The nightly functional test suite excludes the most expensive tests, which work with large amounts of data. The performance test suite includes all the other system tests, plus those more expensive tests that take more time and work with larger amounts of data.
Currently the performance test suite runs three times a week, on Mondays, Wednesdays and Fridays. On every other day only the nightly functional test suite runs. When everything passes, the performance test suite runs from 3am until 4pm. If anything fails, polling waits for the operation to finish before the test is marked as failed, so the suite can take much longer.
We can improve on this quite quickly by grouping the long-running tests into chunks and running each chunk in its own process. We'd like to make as much improvement as we can by parallelising in chunks that can each run in a separate process, i.e. a separate invocation of Maven.
Acceptance Criteria
We need to parallelise the tests to some degree so that we don't run out of hours in the day. Ideally the results would be available by around 9am for the start of work, or soon after.
Technical Notes / Implementation Details
The majority of the functional tests run against a single, main instance of Sleeper, but are written in such a way that they could run in parallel. Some of the nightly functional tests run in their own instance, and each performance test runs in its own instance. For the tests that run in their own instance, it would be quite easy to run them separately in another process and parallelise them that way. For tests that run in the same instance, there is some shared state for deploying and tracking the instance. We would need to either parallelise that in-process, or deploy the instance in a separate process before we start the processes that run the tests.
For now we can do the easiest thing: take the few long-running tests that already run in their own instances, group them into larger chunks, and run several chunks simultaneously in different processes.
Defining chunks
Currently these different types of tests are grouped into test suites in the system-test-suite module: NightlyFunctionalSystemTestSuite, NightlyPerformanceSystemTestSuite and QuickSystemTestSuite, in the package sleeper.systemtest.suite.suites. These work with annotations on the individual tests, e.g. Expensive and Slow, in the package sleeper.systemtest.suite.testutil. We can adjust this to produce a test suite for each chunk.
We used the JUnit Platform Suite Engine for this:
https://docs.junit.org/current/user-guide/#junit-platform-suite-engine
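As a rough sketch, a chunk could be defined as another suite class. Everything here is an assumption for illustration: the class name, the tag, and how tests are assigned to the chunk (presumably via a @Tag-based annotation similar to the existing Expensive and Slow ones).

```java
package sleeper.systemtest.suite.suites;

import org.junit.platform.suite.api.IncludeTags;
import org.junit.platform.suite.api.SelectPackages;
import org.junit.platform.suite.api.Suite;

// Hypothetical chunk suite: runs only the tests tagged for this chunk,
// mirroring how the existing suites filter on Expensive/Slow tags.
@Suite
@SelectPackages("sleeper.systemtest.suite")
@IncludeTags("PerformanceChunk1")
public class PerformanceChunk1SystemTestSuite {
}
```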
Scripting changes
The current test suites are invoked in scripts/test/nightly/runTests.sh, which only works with NightlyPerformanceSystemTestSuite and NightlyFunctionalSystemTestSuite. We could adjust this script to run the chunks associated with the performance/functional tests in parallel, as sketched below.
Note that in the past scripts/test/nightly/runTests.sh also ran the quick test suite with the DynamoDB state store, which may be a useful example to follow.
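A minimal sketch of running chunks in parallel, assuming each chunk is a suite class that Maven can select. The module path, the -Dit.test property, the output directory and the chunk names are assumptions, not the real invocation from runTests.sh:

```bash
#!/usr/bin/env bash
set -eu
OUTPUT_DIR="${OUTPUT_DIR:-/tmp/sleeper-nightly-tests}"   # assumed location
mkdir -p "$OUTPUT_DIR"

run_chunk() {
  local suite=$1
  # Each chunk is a separate invocation of Maven, logging to its own file.
  mvn verify -pl system-test/system-test-suite \
    -Dit.test="$suite" &> "$OUTPUT_DIR/$suite.log"
}

for suite in PerformanceChunk1SystemTestSuite PerformanceChunk2SystemTestSuite; do
  run_chunk "$suite" &   # run each chunk in its own process
done
wait   # block until every chunk has finished
```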
Feedback on scripting changes
We'd like to be able to make changes to the runTests.sh script without re-running the whole test suite to know whether they worked. We could add an option that replaces the calls to Maven with some other output we can inspect to see what the script would have done.
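One possible shape for this, with the flag name and the Maven arguments as assumptions:

```bash
#!/usr/bin/env bash
# Sketch of a dry-run option for runTests.sh (the flag name is an assumption).
MAVEN=mvn
if [ "${1:-}" = "--dry-run" ]; then
  # Print each Maven command instead of executing it, so script changes
  # can be checked without running the whole test suite.
  MAVEN="echo mvn"
fi

$MAVEN verify -pl system-test/system-test-suite \
  -Dit.test=NightlyFunctionalSystemTestSuite
```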
Test output
The script scripts/test/nightly/runTests.sh creates a directory for the test output, with separate subdirectories under that for the different types of test suite. Currently there is only one per run, for either the functional or the performance test suite, but previously there was also one for the DynamoDB state store test output.
This currently waits until all Maven invocations finish before calling into Java with RecordNightlyTestOutput to upload the results to S3.
If we run the Maven invocations in parallel, we will need to either wait for all of them to finish, or have each separate process upload its results to S3 as it finishes. Uploading as each process finishes would cause a problem with the lambda that automatically stops the EC2 when the tests finish, as it stops the EC2 as soon as output is present in S3. It would also cause problems with parallel processes updating the same data in S3. We probably need to wait until the end to upload.
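Extending the earlier sketch, the script could collect the exit status of each chunk and only record the output once everything has finished. The variable names are assumptions, and the actual upload is whatever the script does today with RecordNightlyTestOutput:

```bash
# Launch every chunk, remember its process ID, then wait for all of them.
pids=()
for suite in "${CHUNKS[@]}"; do
  run_chunk "$suite" &
  pids+=("$!")
done

status=0
for pid in "${pids[@]}"; do
  wait "$pid" || status=1   # remember if any chunk failed
done

# A single upload at the end avoids parallel writes to the same S3 data,
# and the shutdown lambda only sees the output once all chunks are done.
# (Call into Java with RecordNightlyTestOutput here, as the script does now.)
exit "$status"
```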