A denormalised warehouse for multiple LIMS.
Tables in the warehouse are populated asynchronously. This project is a Railtie application that facilitates asynchronous database population through a message queue: it listens to the queue, consumes each message, converts it into a database model, and persists it in the corresponding table.
Take the example below.
{
"lims": "traction",
"aliquot": {
"id_lims": "LIMS123456",
"lims_uuid": "550e8400-e29b-41d4-a716-446655440000",
"aliquot_type": "DNA",
"source_type": "Blood",
"source_barcode": "SRC123456",
"sample_name": "SampleA",
"used_by_type": "Research",
"used_by_barcode": "USR123456",
"volume": 50,
"concentration": 200,
"last_updated": "2024-07-09T10:15:30Z",
"recorded_at": "2024-07-09T09:00:00Z",
"created_at": "2024-07-08T08:00:00Z",
"insert_size": 350
}
}
If the message above is consumed by unified_warehouse, it will create a new record in the aliquot table with the values set to the given attributes. If an exception occurs, it will dead-letter the message. For the message serialization logic, see lib/payload.rb.
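The flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation (for that, see lib/payload.rb). AliquotRecord, build_aliquot, consume, and the in-memory table and dead_letters arrays are all hypothetical names standing in for the real database model and queue machinery, and only a subset of the aliquot fields is shown.

```ruby
require 'json'
require 'time'

# Hypothetical stand-in for a row in the aliquot table (assumption:
# the real application uses a database model instead of a Struct).
AliquotRecord = Struct.new(
  :lims, :id_lims, :lims_uuid, :aliquot_type, :volume, :concentration, :recorded_at,
  keyword_init: true
)

# Hypothetical helper: convert a raw JSON message into a record.
# Raises if the JSON is invalid or a required key is missing.
def build_aliquot(raw_message)
  payload = JSON.parse(raw_message)
  aliquot = payload.fetch('aliquot')
  AliquotRecord.new(
    lims: payload.fetch('lims'),
    id_lims: aliquot.fetch('id_lims'),
    lims_uuid: aliquot.fetch('lims_uuid'),
    aliquot_type: aliquot.fetch('aliquot_type'),
    volume: aliquot.fetch('volume'),
    concentration: aliquot.fetch('concentration'),
    recorded_at: Time.iso8601(aliquot.fetch('recorded_at'))
  )
end

# Hypothetical consume step: store the record on success,
# set the message aside (dead-letter it) on any error.
def consume(raw_message, table:, dead_letters:)
  table << build_aliquot(raw_message)
  :stored
rescue StandardError => e
  dead_letters << { message: raw_message, error: e.class.name }
  :dead_lettered
end
```

Calling consume with the example message would append one record to table, while a malformed message would end up in dead_letters instead of raising.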
- MySQL (currently 8.0) is required and is usually installed with Homebrew:
brew install mysql@8.0
brew link mysql@8.0 --force
- Clone the git repository.
- Install the relevant Ruby version from .ruby-version (rbenv will read .ruby-version automatically):
rbenv install
- Run the setup process:
3.1. Set your MySQL root password in the file config/database.yml
3.2. Run bin/setup
NB: If you get an error while installing the mysql2 gem, try:
bundle config build.mysql2 --with-opt-dir=$(brew --prefix openssl)
If that doesn't work, try:
gem install mysql2 -- \
--with-mysql-lib=/Users/your_user/homebrew/Cellar/mysql/your_version/lib \
--with-mysql-dir=/Users/your_user/homebrew/Cellar/mysql/your_version \
--with-mysql-config=/Users/your_user/homebrew/Cellar/mysql/your_version/bin/mysql_config \
--with-mysql-include=/Users/your_user/homebrew/Cellar/mysql/your_version/include
and then try running bundle install again.
Before you can use the system in any capacity, you must first prepare the database.
This should be handled by bin/setup above, but if not, run:
bundle exec rake db:setup
This project provides the view cherrypicked_samples, which links data with the event warehouse. To create the view, run:
bundle exec rake db:views:schema:load
Ensure the test suite is running and passing:
bundle exec rspec
- Initialize the integration test setup for the events warehouse (see the Integration Tests setup section at https://github.com/sanger/event_warehouse/#integration-tests-setup).
- Reset the database:
bundle exec rake db:reset
- Create the dependent views:
bundle exec rake db:views:schema:load
These actions can also be performed automatically if you run the Docker container of the service and pass the environment variables:
RAILS_ENV="test"
INTEGRATION_TEST_SETUP="true"
- Run the integration tests:
bundle exec rspec --tag integration
Execute the worker to pick up messages in the queue and process them into the database:
bundle exec warren consumer start
The consumer will run in the foreground, logging to the console. You can stop it with Ctrl-C.
For more warren actions, either use bundle exec warren help or see the warren documentation.
The number of worker threads for the consumer can be configured in
warren_consumers.yml. This setting is applied to the channel and affects how
messages from the subscription on the queue are processed. Setting this value
to one uses a single thread and, therefore, a single writer to the database. If
only a few tables are written sequentially, a single worker has the advantage
of avoiding lock contention. If not configured, the default value is 3.
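To illustrate the trade-off, here is a small, hypothetical Ruby sketch of a worker pool draining a queue. It does not use Warren or RabbitMQ: process_all and worker_count are illustrative names only, the Mutex merely stands in for a shared resource that writers contend for, and the default of 3 simply mirrors the default stated above.

```ruby
# Hypothetical illustration only; Warren manages its own worker threads.
# Messages are assumed to be non-nil values.
def process_all(messages, worker_count: 3)
  queue = Queue.new
  messages.each { |message| queue << message }
  queue.close # pop returns nil once the closed queue is empty

  written = []
  lock = Mutex.new # stands in for a contended resource such as a table lock

  workers = Array.new(worker_count) do
    Thread.new do
      while (message = queue.pop)
        # Only one worker can "write" at a time; the others wait here.
        lock.synchronize { written << message }
      end
    end
  end

  workers.each(&:join)
  written
end
```

With worker_count: 1 the messages are written strictly in the order they were queued and nothing ever waits on the lock. With several workers every message is still written, but the order is no longer guaranteed and the workers may block one another.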
RabbitMQ is essential for this process, so if you haven't already, install it using:
brew install rabbitmq
brew services start rabbitmq
You can now view the instance running at http://localhost:15672/.
You may wish to start the warren consumers with:
bundle exec warren consumer start --path='config/warren_traction_service_dev.yml'
This will adjust the configuration options to be compatible with those suggested in the traction setup.
Also see managing custom configs
It is possible to run the consumers with a custom configuration, e.g.:
bundle exec warren consumer start --path='config/my_customized_config.local.yml'
The .gitignore file will automatically prevent these configurations from being committed.
If you receive an error about a missing output file under tmp/pid/, you may need to create this directory manually (for example, with mkdir -p tmp/pid).
Once the directory exists at the root of the repository, the error should go away.
COG-UK IDs: these IDs are given to positive samples imported through the Lighthouse-UI. This process should automatically record those IDs in both the sample table and the lighthouse_sample table. To migrate COG-UK IDs into the lighthouse_sample table manually via SQL, see this Confluence page.