v2.4.1 (Minor Update, 23.05.2025)
Implementation of HDBSCAN* clustering of alleles and proteoforms of features, using the Tribuo library (https://tribuo.org/): After the allele and proteoform sequences have been inferred per sample, they are now clustered with Tribuo's HDBSCAN* algorithm to make the data easier to interpret. Samples that fall into the same cluster for a feature can be considered similar even if they do not carry exactly the same set of variants for that feature. Clustering uses the L1 distance on binary vectors with one entry per available variant (position and alternative content) of the feature. In particular, this means that the clustering is not stable across different sets of variants.
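To illustrate the distance described above, here is a minimal sketch in plain Python. It is not the tool's actual code and does not call Tribuo. The sample names, the variants, and the helper functions `build_vocabulary`, `encode` and `l1_distance` are all hypothetical. Each variant is assumed to be a `(position, alternative content)` pair, and each sample becomes a binary vector over all variants observed for one feature.

```python
from itertools import combinations


def build_vocabulary(samples):
    """Collect every (position, alt) variant seen in any sample, sorted."""
    return sorted({v for variants in samples.values() for v in variants})


def encode(variants, vocabulary):
    """Binary vector: 1 if the sample carries the variant, else 0."""
    return [1 if v in variants else 0 for v in vocabulary]


def l1_distance(a, b):
    """L1 (Manhattan) distance; on binary vectors this counts differing variants."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical variants of a single feature, per sample.
samples = {
    "sample_A": {(12, "T"), (40, "G")},
    "sample_B": {(12, "T"), (40, "G"), (77, "A")},
    "sample_C": {(5, "C")},
}

vocabulary = build_vocabulary(samples)
vectors = {name: encode(variants, vocabulary) for name, variants in samples.items()}

# Pairwise distances, keyed by alphabetically ordered sample pairs.
distances = {
    (a, b): l1_distance(vectors[a], vectors[b])
    for a, b in combinations(sorted(vectors), 2)
}

for pair, dist in distances.items():
    print(pair, dist)
```

In this sketch, `sample_A` and `sample_B` differ by a single variant and would be natural candidates for the same cluster, while `sample_C` shares no variant with either. The length and layout of the vectors depend on which variants are present in the input, so the feature space itself changes whenever the set of variants changes.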
The clustering results are used to generate informative names for alleles and proteoforms; these names have been adapted for use in the different output formats.