Closed
Labels
polymer-performance: Runtime of loading and/or parametrizing (bio)polymers
Description
Describe the bug
Subgraph isomorphism is central to using the toolkit on multi-molecule systems, but it is slow, especially for large and complicated systems.
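For readers unfamiliar with what grouping identical molecules involves, here is a minimal standard-library sketch. It is not the toolkit's implementation: the helper names `are_isomorphic` and `group_identical` are hypothetical, molecules are reduced to element lists plus bond index pairs, and only elements and connectivity are compared. It shows each molecule being tested against one representative per group with a naive backtracking isomorphism check, which is the kind of work that becomes expensive as molecules grow.

```python
from collections import Counter


def are_isomorphic(elems_a, bonds_a, elems_b, bonds_b):
    """Naive backtracking test for element-labelled graph isomorphism.

    Hypothetical helper for illustration only; worst-case cost is factorial
    in the number of atoms.
    """
    n = len(elems_a)
    # Cheap rejections before any search.
    if n != len(elems_b) or len(bonds_a) != len(bonds_b):
        return False
    if Counter(elems_a) != Counter(elems_b):
        return False

    adj_a = [set() for _ in range(n)]
    adj_b = [set() for _ in range(n)]
    for i, j in bonds_a:
        adj_a[i].add(j)
        adj_a[j].add(i)
    for i, j in bonds_b:
        adj_b[i].add(j)
        adj_b[j].add(i)

    mapping = {}  # atom index in A -> atom index in B
    used = set()  # atom indices in B that are already matched

    def extend(i):
        if i == n:
            return True
        for j in range(n):
            if j in used:
                continue
            if elems_a[i] != elems_b[j] or len(adj_a[i]) != len(adj_b[j]):
                continue
            # Bonds and non-bonds to every already-mapped atom must agree.
            if all((mapping[k] in adj_b[j]) == (k in adj_a[i]) for k in mapping):
                mapping[i] = j
                used.add(j)
                if extend(i + 1):
                    return True
                del mapping[i]
                used.discard(j)
        return False

    return extend(0)


def group_identical(molecules):
    """Map the index of each first-seen molecule to the indices matching it."""
    groups = {}
    for idx, (elems, bonds) in enumerate(molecules):
        for rep in groups:
            rep_elems, rep_bonds = molecules[rep]
            if are_isomorphic(elems, bonds, rep_elems, rep_bonds):
                groups[rep].append(idx)
                break
        else:
            groups[idx] = [idx]
    return groups


# Two waters with different atom orderings, plus hydrogen peroxide.
water_1 = (["O", "H", "H"], [(0, 1), (0, 2)])
peroxide = (["H", "O", "O", "H"], [(0, 1), (1, 2), (2, 3)])
water_2 = (["H", "O", "H"], [(0, 1), (1, 2)])
molecules = [water_1, peroxide, water_2]
groups = group_identical(molecules)
```

In this sketch the two waters land in the same group even though their atoms are ordered differently, while hydrogen peroxide is rejected by the atom-count check before any search starts.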
To Reproduce
With a topology of fairly modest size, and after refactoring to use a Rust re-implementation of networkx (#2033), this takes about 11 minutes:
In [1]: from openff.toolkit import Topology
In [2]: topology = Topology.from_json(open("topology.json").read())
In [3]: %timeit -o -r1 -n1 topology.identical_molecule_groups
11min 2s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
Out[3]: <TimeitResult : 11min 2s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)>
In [4]: !open .
In [5]: topology.n_atoms, topology.n_molecules
Out[5]: (4119, 10)
Using the current main branch, it takes at least twice that (24 minutes, still running):
Output
Computing environment (please complete the following information):
- Operating system
- Output of running conda list
Additional context
#1143 #1734 #353 #2008 openforcefield/openff-interchange#1156 etc.