Raku package with implementations of the K-Dimensional Tree (K-D Tree) algorithm.
Remark: This package should not be confused with "Algorithm::KdTree", [ITp1], which provides Raku bindings to a C-implementation of the K-D Tree algorithm. (A primary motivation for making this package, "Algorithm::KDimensionalTree", was to have a pure-Raku implementation.)
Remark: The versions "ver<0.1.0+>" with the API "api<1>" of this package are better utilized via the package "Math::Nearest".
Compared to the simple scanning algorithm this implementation of the K-D Tree algorithm is often 2÷15 times faster for randomly generated low dimensional data. (Say. less than 6D.) It can be 200+ times faster for "real life" data, like, Geo-locations of USA cities.
The implementation is tested for correctness against Mathematica's Nearest.
(See the resource files.)
- Finds both top-k Nearest Neighbors (NNs).
- See the method 
k-nearest. 
 - See the method 
 - Finds NNs within a ball with a given radius.
- See the method 
nearest-within-ball. 
 - See the method 
 - Can return indexes and distances for the found NNs.
- The result shape is controlled with the adverb 
:values. 
 - The result shape is controlled with the adverb 
 - Works with arrays of numbers, arrays of arrays of numbers, and arrays of pairs.
 - Utilizes distance functions from "Math::DistanceFunctions".
- Which can be specified by their string names, like, "bray-curtis" or "cosine-distance".
 
 - Allows custom distance functions to be used.
 
From Zef ecosystem:
zef install Algorithm::KDimensionalTree
From GitHub:
zef install https://github.com/antononcube/Raku-Algorithm-KDimensionalTree.git
use Algorithm::KDimensionalTree;
use Data::TypeSystem;
use Text::Plot;# (Any)
Make a random set of points:
my @points = ([(^100).rand, (^100).rand] xx 30).unique;
deduce-type(@points);# Vector(Vector(Atom((Numeric)), 2), 30)
my $kdTree = Algorithm::KDimensionalTree.new(@points);
say $kdTree;# Algorithm::KDimensionalTree(points => 30, distance-function => &euclidean-distance)
Use as a search point one from the points set:
my @searchPoint = |@points.head;# [89.27640360563157 68.36845738910054]
Find 6 nearest neighbors:
my @res = $kdTree.k-nearest(@searchPoint, 6);
.say for @res;# [89.27640360563157 68.36845738910054]
# [91.96352893754062 73.56174634865347]
# [80.0160216888336 69.80400458389097]
# [99.2564424012451 76.5797786385005]
# [79.5081497932748 52.564329376808196]
# [76.82139591710131 47.8616016873568]
Plot the points, the found nearest neighbors, and the search point:
my @point-char =  <* ⏺ ▲>;
say <data nns search> Z=> @point-char;
say text-list-plot(
[@points, @res, [@searchPoint,]],
:@point-char,
x-limit => (0, 100),
y-limit => (0, 100),
width => 60,
height => 20);# (data => * nns => ⏺ search => ▲)
# ++----------+-----------+----------+-----------+----------++       
# +                                                          + 100.00
# |             *                                            |       
# |           *                          *                   |       
# +                                                          +  80.00
# |                                                    ⏺    ⏺|       
# |               *                    *         ⏺    ▲      |       
# |                     *           *                        |       
# +                                                          +  60.00
# |      *      *                   *           ⏺            |       
# |                                            ⏺            *|       
# +    *                                 **       *          +  40.00
# |                                                          |       
# |                            *       *           *         |       
# |                   *                                      |       
# +              *                                           +  20.00
# |               *                                          |       
# | *                              *           *             |       
# +                                                          +   0.00
# ++----------+-----------+----------+-----------+----------++       
#  0.00       20.00       40.00      60.00       80.00      100.00
-  TODO Implementation
-  DONE Using distance functions from an "universal" package
- E.g. "Math::DistanceFunctions"
 
 - DONE Using distance functions other than Euclidean distance
 -  DONE Returning properties
- DONE Points
 - DONE Indexes
 - DONE Distances
 - DONE Labels
 - DONE Combinations of those
 - This is implemented by should be removed.
- There is another package -- "Math::Nearest" -- that handles all nearest neighbors finders.
 - Version "0.1.0 with "api<1>" is without the 
.nearestmethod. 
 
 -  DONE Having an umbrella function 
nearest- Instead of creating a KDTree object etc.
 - This might require making a functor 
nearest-function - This is better done in a different package
- See "Math::Nearest"
 
 
 -  TODO Make the nearest methods work with strings
- For example, using Hamming distance over a collection of words.
 - Requires using the distance function as a comparator for the splitting hyperplanes.
 - This means, any objects can be used as long as they provide a distance function.
 
 
 -  DONE Using distance functions from an "universal" package
 -  DONE Extensive correctness tests
- Derived with Mathematica / WL (see the resources)
 
 -  TODO Documentation
- DONE Basic usage examples with text plots
 -  TODO More extensive documentation with a Jupyter notebook
- Using "JavaScript::D3".
 
 - TODO Corresponding blog post
 - MAYBE Corresponding video
 
 
[AAp1] Anton Antonov, Math::DistanceFunctions Raku package, (2024), GitHub/antononcube.
[AAp2] Anton Antonov, Math::Nearest Raku package, (2024), GitHub/antononcube.
[AAp3] Anton Antonov, Data::TypeSystem Raku package, (2023), GitHub/antononcube.
[AAp4] Anton Antonov, Text::Plot Raku package, (2022), GitHub/antononcube.
[ITp1] Itsuki Toyota, Algorithm::KdTree Raku package, (2016-2024), GitHub/titsuki.