Commit 266c017

Docs: Extend algorithms
1 parent 46e957c commit 266c017

8 files changed, +236 -365 lines changed

.vscode/settings.json

+3
@@ -18,6 +18,7 @@
 "cmake.sourceDirectory": "${workspaceRoot}",
 "cSpell.words": [
 "allowoverlap",
+"aminoacid",
 "aminoacids",
 "Apostolico",
 "Appleby",
@@ -32,6 +33,7 @@
 "Cawley",
 "cheminformatics",
 "cibuildwheel",
+"CONCAT",
 "copydoc",
 "cptr",
 "endregion",
@@ -103,6 +105,7 @@
 "substr",
 "SWAR",
 "Tanimoto",
+"thyrotropin",
 "TPFLAGS",
 "unigram",
 "usecases",


CONTRIBUTING.md

+1-22
@@ -107,8 +107,6 @@ cmake --build ./build_release --config Release # Which will produce the fol
 ./build_release/stringzilla_bench_container <path> # for STL containers with string keys
 ```
 
-
-
 You may want to download some datasets for benchmarks, like these:
 
 ```sh
@@ -259,30 +257,11 @@ Alternatively, on Linux, the official Swift Docker image can be used for builds
 sudo docker run --rm -v "$PWD:/workspace" -w /workspace swift:5.9 /bin/bash -cl "swift build -c release --static-swift-stdlib && swift test -c release --enable-test-discovery"
 ```
 
-## Roadmap
-
-The project is in its early stages of development.
-So outside of basic bug-fixes, several features are still missing, and can be implemented by you.
-Future development plans include:
-
-- [x] [Replace PyBind11 with CPython](https://github.com/ashvardanian/StringZilla/issues/35), [blog](https://ashvardanian.com/posts/pybind11-cpython-tutorial/.
-- [x] [Bindings for JavaScript](https://github.com/ashvardanian/StringZilla/issues/25).
-- [x] [Reverse-order operations](https://github.com/ashvardanian/StringZilla/issues/12).
-- [ ] [Faster string sorting algorithm](https://github.com/ashvardanian/StringZilla/issues/45).
-- [x] [Splitting with multiple separators at once](https://github.com/ashvardanian/StringZilla/issues/29).
-- [ ] Universal hashing solution.
-- [ ] Add `.pyi` interface for Python.
-- [x] Arm NEON backend.
-- [x] Bindings for Rust.
-- [x] Bindings for Swift.
-- [ ] Arm SVE backend.
-- [ ] Stateful automata-based search.
-
 ## General Performance Observations
 
 ### Unaligned Loads
 
-One common surface of attach for performance optimizations is minimizing unaligned loads.
+One common surface of attack for performance optimizations is minimizing unaligned loads.
 Such solutions are beautiful from the algorithmic perspective, but often lead to worse performance.
 It's often cheaper to issue two interleaving wide-register loads, than try minimizing those loads at the cost of juggling registers.

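As an illustration of the "Unaligned Loads" note in the hunk above, here is a minimal sketch of a substring search that issues two overlapping, possibly unaligned 8-byte loads per step instead of aligning the reads and shuffling registers. All names (`find_sketch`, `load_u64`, `zero_bytes`, `not_found_sketch`) are hypothetical and are not part of StringZilla's API; the SWAR filter is one possible way to combine the two loads, not necessarily the library's.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical names throughout; this is a sketch, not StringZilla's implementation.
constexpr std::size_t not_found_sketch = static_cast<std::size_t>(-1);

// Load 8 bytes from any address. Going through memcpy keeps the access
// well-defined for unaligned pointers.
static inline std::uint64_t load_u64(char const *p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Returns 0x80 in every byte lane of `x` that is zero, and 0x00 elsewhere.
static inline std::uint64_t zero_bytes(std::uint64_t x) {
    std::uint64_t const low7 = 0x7F7F7F7F7F7F7F7Full;
    return ~(((x & low7) + low7) | x | low7);
}

std::size_t find_sketch(char const *h, std::size_t h_len, char const *n, std::size_t n_len) {
    if (n_len == 0) return 0;
    if (h_len < n_len) return not_found_sketch;

    // Broadcast the first and last needle bytes into every lane.
    std::uint64_t const ones = 0x0101010101010101ull;
    std::uint64_t const first = ones * static_cast<unsigned char>(n[0]);
    std::uint64_t const last = ones * static_cast<unsigned char>(n[n_len - 1]);

    std::size_t i = 0;
    // Both loads below stay in bounds while i + n_len + 7 <= h_len.
    for (; i + n_len + 7 <= h_len; i += 8) {
        // Two overlapping loads; neither is required to be aligned.
        std::uint64_t const head = load_u64(h + i);
        std::uint64_t const tail = load_u64(h + i + n_len - 1);
        // A lane survives only if both the first and the last byte match there.
        if ((zero_bytes(head ^ first) & zero_bytes(tail ^ last)) == 0) continue;
        // Verify the candidates in this block with a full comparison.
        for (std::size_t k = 0; k < 8; ++k)
            if (std::memcmp(h + i + k, n, n_len) == 0) return i + k;
    }
    // Scalar tail for the last few positions.
    for (; i + n_len <= h_len; ++i)
        if (std::memcmp(h + i, n, n_len) == 0) return i;
    return not_found_sketch;
}
```

The mask is used only as a filter, so the sketch does not depend on byte order. An aligned variant would need extra shifts and merges to rebuild the `tail` word from two aligned reads, which is the register juggling the note warns about.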
README.md

+207-156
Large diffs are not rendered by default.

assets/cover-strinzilla.jpeg

399 KB
File renamed without changes.

assets/meme-stringzilla-v3.jpeg

92.3 KB

include/stringzilla/stringzilla.hpp

-19
@@ -24,25 +24,6 @@
 #define SZ_AVOID_STL (0) // true or false
 #endif
 
-/**
-* @brief When set to 1, the strings `+` will return an expression template rather than a temporary string.
-* This will improve performance, but may break some STL-specific code, so it's disabled by default.
-* TODO:
-*/
-#ifndef SZ_LAZY_CONCAT
-#define SZ_LAZY_CONCAT (0) // true or false
-#endif
-
-/**
-* @brief When set to 1, the library will change `substr` and several other member methods of `string`
-* to return a view of its slice, rather than a copy, if the lifetime of the object is guaranteed.
-* This will improve performance, but may break some STL-specific code, so it's disabled by default.
-* TODO:
-*/
-#ifndef SZ_PREFER_VIEWS
-#define SZ_PREFER_VIEWS (0) // true or false
-#endif
-
 /* We need to detect the version of the C++ language we are compiled with.
 * This will affect recent features like `operator<=>` and tests against STL.
 */

scripts/bench_similarity.ipynb

+25-168
Large diffs are not rendered by default.
