Commit d5eac98

Fix Random.choice for Tensor on Python 3.11
In Python 3.11, the definition of `rng.choice(seq)` gained the check `if not seq`. This internally calls `bool(seq)`, which raises an error if `seq` is a Tensor with more than one element. Now, pick random elements by indexing into the tensor with `rng.randrange`. Closes #148
1 parent 47451e5 commit d5eac98
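
The failure mode described in the commit message can be illustrated without PyTorch. The sketch below uses a hypothetical stand-in class, `FakeTensor`, whose `__bool__` raises for anything other than a single element. That behaviour is an assumption modelled on the commit message, and neither `FakeTensor` nor `pick` is Vamb or PyTorch code. Because `rng.randrange(len(seq))` relies only on `len()` and indexing, it never evaluates the truthiness of the container.

```python
import random


class FakeTensor:
    """Hypothetical stand-in for a 1-D tensor (not part of Vamb or PyTorch)."""

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        return self._values[index]

    def __bool__(self):
        # Assumed behaviour: truthiness is only defined for a single element.
        if len(self._values) != 1:
            raise RuntimeError("truth value of a multi-element tensor is ambiguous")
        return bool(self._values[0])


def pick(rng, seq):
    """Pick a random element using only len() and indexing, never bool()."""
    return seq[rng.randrange(len(seq))]


rng = random.Random(0)
cluster = FakeTensor([10, 20, 30])

try:
    bool(cluster)  # the operation that an `if not seq` check performs
    bool_raised = False
except RuntimeError:
    bool_raised = True

sampled = pick(rng, cluster)
```

Here `bool_raised` ends up `True`, while `pick` still returns one of the three stored values.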

4 files changed, +23 -18 lines changed

CHANGELOG.md (+18 -13)

@@ -1,11 +1,16 @@
 # Changelog
 
-## Unreleased - v4.0.0-DEV
-Version 4 is a thorough rewrite of major parts of Vamb.
+## v4.0.1
+* Fix Random.choice for Tensor on Python 3.11. See issue #148
+
+## v4.0.0
+Version 4 is a thorough rewrite of major parts of Vamb that has taken more than a year.
+Vamb now ships with with an upgraded dual variational autoencoder (VAE) and
+adversatial autoencoder (AAE) model, usable in a CheckM based workflow.
 The code quality and test suite has gotten significant upgrades, making Vamb
 more stable and robust to bugs.
-Vamb version is slightly faster and produces slightly better bins than v3.
-The user interface has only gotten slight changes.
+Vamb version is slightly faster and produces better bins than v3.
+The user interface has gotten limited changes.
 
 ### Breaking changes
 * The official API of Vamb is now defined only in terms of its command-line
@@ -14,11 +19,8 @@ The user interface has only gotten slight changes.
 If you are using Vamb as a Python package, it means you should precisely
 specify the full version of Vamb used in order to ensure reproducibility.
 * Benchmark procedure has been changed, so benchmark results are incompatible
-with results from v3.
-In v3, a complete bin was defined as the total set of covered basepairs in any
-contig from the input assembly. In v4, it's defined as the genome of origin,
-from where contigs are sampled.
-This new procedure is more fair, more intuitive and easier to compute.
+with results from v3. Benchmarking is now considered an implementation detail,
+and is not stable across releases.
 * Vamb no longer outputs TNF, sequence names and sequence lengths as .npz files.
 Instead, it produces a `composition.npz` that contains all this information
 and more.
@@ -33,15 +35,18 @@ The user interface has only gotten slight changes.
 (though read the Notable changes section below).
 
 ### New features
+* Vamb now included an optional AAE model along the VAE model.
+Users may run the VAE model, where it behaves similarly to v3, or run the mixed
+VAE/AAE model, in which both models will be run on the same dataset.
+* The Snakemake workflow has been rehauled, and how defaults to using
+the VAE/AAE combined model, using CheckM to dereplicate.
 * Vamb is now more easily installed via pip: `pip install vamb`. We have fixed
 a bunch of issues that caused installation problems.
-* Added new flag: `--noencode`. With this flag, Vamb stops after producing the
-composition and depth outputs, and does not encode nor cluster.
-This can be used to produce the input data of Vamb to other clustering models.
 * By default, Vamb gzip compresses FASTA files written using the `--minfasta`
 flag.
 
-### Notable changes
+### Notable other changes
+* Using the combined VAE-AAE workflow, the user can get significantly better bins.
 * Vamb now uses `CoverM` internally to calculate abundances. This means it is
 significantly faster and more accurate than before.
 Thus, we no longer recommend users computing depths with MetaBAT2's JGI tool.

test/ci.py (+2 -2)

@@ -26,13 +26,13 @@ def changelog_version(path):
     with open(path) as file:
         next(file) # header
         textline = next(filter(None, map(str.strip, file)))
-    regex = re.compile(r"v([0-9]+)\.([0-9]+)\.([0-9]+)*(?:-([A-Za-z]+))")
+    regex = re.compile(r"## v([0-9]+)\.([0-9]+)\.([0-9]+)(-[0-9A-Za-z]+)?")
     m = regex.search(textline)
     if m is None:
         raise ValueError("Could not find version in first non-header line of CHANGELOG")
     g = m.groups()
     v_nums = (int(g[0]), int(g[1]), int(g[2]))
-    return v_nums if g[3] is None else (*v_nums, g[3])
+    return v_nums
 
 
 def readme_vamb_version(path):
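
The effect of the regex change in `changelog_version` can be checked in isolation. The sketch below copies both patterns from the diff and applies them to changelog headings taken from this commit. The helper name `parse_version` and the `OLD_REGEX`/`NEW_REGEX` constants are hypothetical and exist only for illustration. The old pattern requires a `-SUFFIX` after the version, so it cannot match a plain release heading such as `## v4.0.1`. The new pattern makes the suffix optional and anchors on the `## v` heading prefix.

```python
import re

# Both patterns are copied from the diff above.
OLD_REGEX = re.compile(r"v([0-9]+)\.([0-9]+)\.([0-9]+)*(?:-([A-Za-z]+))")
NEW_REGEX = re.compile(r"## v([0-9]+)\.([0-9]+)\.([0-9]+)(-[0-9A-Za-z]+)?")


def parse_version(regex, textline):
    """Return (major, minor, patch) as ints, or None if the line does not match.

    Hypothetical helper: unlike the function in test/ci.py it returns None
    instead of raising, so the results are easy to compare side by side.
    """
    m = regex.search(textline)
    if m is None:
        return None
    g = m.groups()
    return (int(g[0]), int(g[1]), int(g[2]))


old_on_release = parse_version(OLD_REGEX, "## v4.0.1")
new_on_release = parse_version(NEW_REGEX, "## v4.0.1")
old_on_dev = parse_version(OLD_REGEX, "## Unreleased - v4.0.0-DEV")
new_on_dev = parse_version(NEW_REGEX, "## v4.0.0-DEV")
```

With these inputs the old pattern fails on the release heading (`old_on_release` is `None`), while the new pattern parses both the plain and the suffixed form and discards the suffix.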

vamb/__init__.py (+1 -1)

@@ -19,7 +19,7 @@
 7) Split bins using vamb.vambtools
 """
 
-__version__ = (4, 0, 0)
+__version__ = (4, 0, 1)
 
 from . import vambtools
 from . import parsebam

vamb/cluster.py (+2 -2)

@@ -529,11 +529,11 @@ def _wander_medoid(
     )
 
     while len(cluster) - len(tried) > 0 and futile_attempts < max_attempts:
-        sampled_medoid = int(rng.choice(cluster).item())
+        sampled_medoid = int(cluster[rng.randrange(len(cluster))].item())
 
         # Prevent sampling same medoid multiple times.
         while sampled_medoid in tried:
-            sampled_medoid = int(rng.choice(cluster).item())
+            sampled_medoid = int(cluster[rng.randrange(len(cluster))].item())
 
         tried.add(sampled_medoid)
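
The sampling pattern in this hunk (draw an index with `randrange`, then redraw while the result has already been tried) can be sketched on its own. This is a simplified illustration, not the body of `_wander_medoid`: the helper `sample_untried` is hypothetical, a plain list of distinct integers stands in for the tensor, and the `.item()` call is dropped because list elements are already Python ints.

```python
import random


def sample_untried(rng, cluster, tried):
    """Return a random element of `cluster` not yet in `tried`, and record it.

    Hypothetical helper. It assumes at least one element of `cluster` is
    untried; otherwise the inner loop would never terminate. The code in the
    diff guards against that with `len(cluster) - len(tried) > 0`.
    """
    sampled = int(cluster[rng.randrange(len(cluster))])

    # Redraw until an element that has not been tried comes up.
    while sampled in tried:
        sampled = int(cluster[rng.randrange(len(cluster))])

    tried.add(sampled)
    return sampled


rng = random.Random(42)
cluster = [3, 8, 15, 23]  # a plain list of distinct values stands in for the tensor
tried = set()
order = [sample_untried(rng, cluster, tried) for _ in range(len(cluster))]
```

After as many calls as there are elements, every element has been returned exactly once, in a random order determined by the seed.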
