GS-Annotator is a variant annotation platform for research and clinical genomics workflows. Its architecture is optimized for speed, extensibility, and simple deployment.
The system processes standard VCF files and integrates annotations from curated gene models, clinical databases, population frequencies, and in silico prediction algorithms. Its flexible plugin framework and open API facilitate integration of custom annotation sources such as specialized variant scoring algorithms or domain-specific databases.
GS-Annotator’s backend uses AStorage, a specialized server built on RocksDB, a high-performance embedded key-value store with efficient read/write operations on large datasets. This design suits genomics, where queries must handle high variant counts with minimal latency. The system supports interactive analyses and batch processing of millions of variants while offloading compute-intensive tasks, which reduces local resource consumption.
The Java-based API server utilizes Vert.x, an event-driven toolkit that ensures responsive performance across concurrent requests. Users can access functionality through an intuitive command-line interface or a RESTful API (documented with OpenAPI), enabling seamless integration with bioinformatics pipelines and downstream applications.
Additional features include support for genome liftover between reference assemblies and deep integration with AnFiSA, a comprehensive variant curation platform. Together, these tools create an end-to-end solution from variant calling to clinical interpretation.
Clone the master branch and package the application as a JAR file:
git clone git@github.com:ForomePlatform/gs-annotator.git
cd gs-annotator
./mvnw clean package
The JAR file will be generated inside the target directory as annotator-1.0.0.jar.
- On the first run, the application creates a data folder named GSAnnotator in the user’s home directory unless another location is specified.
- The service runs on port 8000 unless another port is specified.
Note: These properties can be adjusted using a config.json file.
config.json example:
{
"dataDirectoryPath": "/home/user/ExampleAnnotator",
"serverPort": 8000,
"aStorageServerUrl": "http://localhost:8080/"
}
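When the configuration has to be produced by a script, for instance during deployment, the same file can be written from Python. This is only a sketch: `write_config` is a hypothetical helper and not part of GS-Annotator. The three keys are those of the example above, the port and data folder defaults are the documented ones, and the AStorage URL default is simply the value used in the example.

```python
import json
from pathlib import Path


def write_config(path, data_dir=None, server_port=8000,
                 astorage_url="http://localhost:8080/"):
    """Write a config.json with the three keys shown in the example above.

    Illustrative helper only; it is not shipped with GS-Annotator.
    """
    if data_dir is None:
        # Documented default: a GSAnnotator folder in the user's home directory.
        data_dir = Path.home() / "GSAnnotator"
    config = {
        "dataDirectoryPath": str(data_dir),
        "serverPort": server_port,
        "aStorageServerUrl": astorage_url,
    }
    Path(path).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    write_config("config.json", data_dir="/home/user/ExampleAnnotator")
```

The path of the written file is what gets passed as the optional config_json_path argument when the JAR is started.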
To start the application run:
cd target
java -jar annotator-1.0.0.jar [config_json_path]
Note: Annotator logs are written to the <dataDirectoryPath>/output_<currentTimeMillis>.log file. Some of the output is also printed in the terminal where the program is running.
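Because every run writes a new log file, it can be handy to locate the most recent one. The sketch below assumes that the `<currentTimeMillis>` part of the name is a plain integer of milliseconds since the epoch, so that the largest number marks the latest run. `latest_log` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Matches the documented naming scheme output_<currentTimeMillis>.log
LOG_NAME = re.compile(r"^output_(\d+)\.log$")


def latest_log(data_dir):
    """Return the newest output_<millis>.log in data_dir, or None if none exist."""
    candidates = []
    for entry in Path(data_dir).iterdir():
        match = LOG_NAME.match(entry.name)
        if match and entry.is_file():
            candidates.append((int(match.group(1)), entry))
    if not candidates:
        return None
    # Assumption: a larger timestamp in the file name means a more recent run.
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    default_dir = Path.home() / "GSAnnotator"
    if default_dir.is_dir():
        print(latest_log(default_dir))
```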
For the detailed API specification, open the OpenAPI UI at http://localhost:8000/api.
To annotate the variants in a VCF, you need three files: an input .vcf file, an annotation configuration .cfg file, and a PLINK sample information .fam file. Performing VCF liftover requires additional files.
Note: To use the annotation service, the AStorage service must be running at the configured AStorage server URL.
The configuration file supports these fields:
- "assembly": Final assembly version of the given VCF file (after VCF liftover, if performed).
- "performLiftover": Boolean field indicating whether liftover should be performed on the given VCF.
Note: If the "performLiftover" field is set to true, "chainFilePath" and "fastaFilePath" are mandatory. These files can be downloaded from the UCSC Genome Browser.
Note: VCF liftover and all the steps it requires are performed using tools from Picard.
- "chainFilePath": Local path to an appropriate chain file.
- "fastaFilePath": Local path to an appropriate FASTA file.
Note: VCF liftover requires an appropriate .dict file alongside the FASTA file. If no such file can be found, it is generated automatically in the same directory as the FASTA file.
Example .cfg file for a VCF that is originally in GRCh37 and should be lifted over to GRCh38:
{
"assembly": "GRCh38",
"performLiftover": true,
"chainFilePath": "/hg19ToHg38.over.chain",
"fastaFilePath": "/hg38.fa"
}
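The rule that liftover needs both a chain file and a FASTA file can be checked before a request is submitted. The following sketch covers only that stated rule. `check_annotation_cfg` is a hypothetical helper, and treating a missing "performLiftover" field as false is an assumption, since the default is not documented here.

```python
import json


def check_annotation_cfg(cfg):
    """Return a list of problems found in an annotation .cfg dictionary."""
    problems = []
    # Assumption: an absent "performLiftover" means no liftover is requested.
    perform_liftover = cfg.get("performLiftover", False)
    if not isinstance(perform_liftover, bool):
        problems.append('"performLiftover" must be true or false')
    elif perform_liftover:
        # Both paths are mandatory once liftover is requested.
        for key in ("chainFilePath", "fastaFilePath"):
            if not cfg.get(key):
                problems.append(
                    '"%s" is required when "performLiftover" is true' % key
                )
    return problems


if __name__ == "__main__":
    incomplete = json.loads('{"assembly": "GRCh38", "performLiftover": true}')
    for problem in check_annotation_cfg(incomplete):
        print(problem)
```

The check looks only at the presence of the fields; it does not verify that the files exist or that the chain file matches the target assembly.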
.fam file example:
1 bgm9001a1 bgm9001u2 bgm9001u1 1 2
1 bgm9001u1 0 0 2 1
1 bgm9001u2 0 0 1 1
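For readers unfamiliar with the format, the six columns of a PLINK .fam file are family ID, sample ID, father ID, mother ID, sex code (1 = male, 2 = female) and phenotype (1 = control, 2 = case), with 0 in a parent column meaning that the parent is not in the dataset. The sketch below reads the example above into records. `FamRecord` and `parse_fam` are hypothetical names, and how the annotator itself interprets the columns is not specified here.

```python
from dataclasses import dataclass
from typing import Optional

SEX = {"1": "male", "2": "female"}                 # any other code: unknown
PHENOTYPE = {"1": "unaffected", "2": "affected"}   # any other code: missing


@dataclass
class FamRecord:
    family_id: str
    sample_id: str
    father_id: Optional[str]
    mother_id: Optional[str]
    sex: str
    phenotype: str


def parse_fam(text):
    """Parse whitespace-separated PLINK .fam content into FamRecord objects."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 6:
            raise ValueError("expected 6 columns, got %d: %r" % (len(fields), line))
        family, sample, father, mother, sex, phenotype = fields
        records.append(FamRecord(
            family_id=family,
            sample_id=sample,
            father_id=None if father == "0" else father,
            mother_id=None if mother == "0" else mother,
            sex=SEX.get(sex, "unknown"),
            phenotype=PHENOTYPE.get(phenotype, "missing"),
        ))
    return records


if __name__ == "__main__":
    example = (
        "1 bgm9001a1 bgm9001u2 bgm9001u1 1 2\n"
        "1 bgm9001u1 0 0 2 1\n"
        "1 bgm9001u2 0 0 1 1\n"
    )
    for record in parse_fam(example):
        print(record)
```

Read this way, the example describes a trio: an affected male sample bgm9001a1 whose father is bgm9001u2 and whose mother is bgm9001u1, both unaffected.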
curl -X POST -OJ 'http://localhost:8000/annotation/anfisa' -H 'accept: application/jsonl' -H 'Content-Type: multipart/form-data' -F 'cfgFile=@<path to .cfg file>' -F 'famFile=@<path to .fam file>' -F 'vcfFile=@<path to .vcf file>'
API reference: Anfisa Annotation.
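The same request can be issued from a script. Below is a minimal sketch using only the Python standard library. The endpoint, the form field names (cfgFile, famFile, vcfFile) and the accept header are copied from the curl command above; `encode_multipart` and `annotate` are hypothetical helpers, the per-part content type is an assumption, and the code has not been verified against a running server. It reads each file fully into memory, which may be unsuitable for very large VCF files.

```python
import shutil
import urllib.request
import uuid
from pathlib import Path

# Endpoint taken from the curl command above.
ANNOTATE_URL = "http://localhost:8000/annotation/anfisa"


def encode_multipart(files):
    """Encode {field name: file path} as multipart/form-data.

    Returns (body bytes, Content-Type header value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for field, path in files.items():
        path = Path(path)
        header = (
            "--%s\r\n"
            'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
            # Assumption: a generic binary content type is accepted per part.
            "Content-Type: application/octet-stream\r\n\r\n"
        ) % (boundary, field, path.name)
        parts.append(header.encode("utf-8") + path.read_bytes() + b"\r\n")
    parts.append(("--%s--\r\n" % boundary).encode("utf-8"))
    return b"".join(parts), "multipart/form-data; boundary=%s" % boundary


def annotate(cfg_path, fam_path, vcf_path, out_path, url=ANNOTATE_URL):
    """POST the three input files and save the response body to out_path."""
    body, content_type = encode_multipart(
        {"cfgFile": cfg_path, "famFile": fam_path, "vcfFile": vcf_path}
    )
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "accept": "application/jsonl"},
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Unlike `curl -OJ`, which names the output after the server's response headers, this sketch writes to whatever `out_path` the caller chooses.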