Epic - Separate Donor/Specimen/Sample model from Analysis model #863
Labels: breaking-change, Epic, new-feature, refactor-analysis-data-model
Summary
Song should be usable for files that do not follow the Donor/Specimen/Sample model. The current Analysis data model requires these fields for every submitted analysis, forcing data that does not use them to fill them in only to satisfy the software. Additionally, when multiple analyses are submitted for a single donor, the donor information must be repeated for each analysis. This duplicates data and allows input errors to creep in from one analysis to another. Finally, Song's data model for Donors/Specimen/Samples is limited and cannot be customized to the data that different systems wish to collect.
Overture is moving towards separate services for tracking structured data; see Lectern and Lyric. These services can track any data model for Donors, the registration of their Specimen and Samples, and any other clinical or phenotypic data relevant to a study. This frees Song to focus on tracking Analysis metadata.
To connect Analysis data with related structured data, a system can include a field in its Dynamic Schema that provides an ID linking the analysis to the data tracked in Lyric. This mapping becomes fully customizable through the Dynamic Schema definition. With this change, Song will need a mechanism to check with an external service that the value provided in one of these fields is registered with an external ID/data service.
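As a rough illustration of the linking field described above, the sketch below pairs a JSON Schema style definition with a mapping from linked fields to lookup URLs. Everything named here is an assumption made for the example: the `donorId` property, the `LINKED_FIELDS` mapping, the `linked_values` helper, and the `lyric.example.org` URL are hypothetical and do not reflect Song's or Lyric's actual formats.

```python
# Hypothetical Dynamic Schema for one analysis type (JSON Schema style).
# "donorId" is an illustrative field name, not something Song defines.
ANALYSIS_SCHEMA = {
    "type": "object",
    "required": ["donorId"],
    "properties": {
        "donorId": {"type": "string"},
        "experiment": {"type": "object"},
    },
}

# Hypothetical mapping: payload fields whose values must be registered
# with an external service, and a URL template used to look each one up.
LINKED_FIELDS = {"donorId": "https://lyric.example.org/donors/{value}"}


def linked_values(payload: dict) -> dict:
    """Return {field: lookup URL} for each linked field present in the payload."""
    return {
        field: template.format(value=payload[field])
        for field, template in LINKED_FIELDS.items()
        if field in payload
    }
```

For a payload such as `{"donorId": "DO123"}`, `linked_values` yields the URL that a validation step could query before accepting the analysis. Keeping the mapping separate from the schema is only one possible design; it could equally live inside the schema definition.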
Song as ID Service
One feature of Song that is lost by this change is the use of Song to generate system-wide unique IDs for Donors, Specimen, and Samples. Song has previously had the option to work as an ID server, generating unique IDs for these entities the first time an analysis that referenced them was submitted. As these entities will no longer be a standard part of the Analysis model, there is no mechanism for identifying Donors/Specimen/Samples, so they will no longer be tracked in Song's database and Song will not generate Donor/Specimen/Sample IDs.
Most Song deployments do not require this feature. A separate ID service has been used to create system-wide IDs, either because a federated data management model runs multiple Song instances or because clinical data records are registered and tracked separately from the file data. This change will standardize that process for all future instances of Song.
Implementation Details
Updating the Data Model
To accomplish this change we need to remove these entities from several parts of the code base, and update some functionality of the Song Server.
StudyWithDonors class exists, it can be removed
idSearch in AnalysisController
Preserving Legacy Schemas
External Validation of Dynamic Schema Properties
In the current implementation, when Song was not used as an ID service, Song was able to fetch system-wide IDs for Donors/Specimen/Samples via an HTTP request to a configurable ID server. With these entities no longer being part of the coded data model, there is no longer a need to fetch IDs for these entities - Song will not be recording them.
However, the external ID check was used to ensure that entities had been registered with an ID system before accepting Analysis registration with Song. This is a feature we want to re-implement, but it will now need to be part of Dynamic Schemas.
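A minimal sketch of such a registration check, assuming the set of fields to validate comes from the Dynamic Schema configuration. The names `check_external_ids`, `UnregisteredEntityError`, and the `is_registered` callback are hypothetical; the callback stands in for the HTTP request to the external ID/data service, whose real contract is not specified here.

```python
from typing import Callable, Mapping, Sequence


class UnregisteredEntityError(ValueError):
    """Raised when a linked field holds an ID unknown to the external service."""


def check_external_ids(
    payload: Mapping[str, object],
    linked_fields: Sequence[str],
    is_registered: Callable[[str, str], bool],
) -> None:
    """Reject the payload if any linked field value is not registered.

    `is_registered(field, value)` is a placeholder for the call to the
    external service; injecting it keeps the check testable offline.
    """
    missing = [
        f"{field}={payload[field]}"
        for field in linked_fields
        if field in payload and not is_registered(field, str(payload[field]))
    ]
    if missing:
        raise UnregisteredEntityError("not registered: " + ", ".join(missing))


# Usage with an in-memory stand-in for the external service.
known = {("donorId", "DO1")}
check_external_ids({"donorId": "DO1"}, ["donorId"], lambda f, v: (f, v) in known)
```

Collecting every unregistered value before raising lets a submitter fix all problems in one pass, rather than discovering them one request at a time.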
Additional context
The intention of this change is to simplify data management with Song, with two cases in particular: