Skip to content

Conversation

@SuchitraSwain
Copy link

issue: #942

Description:
This enhancement optimizes the Kademlia DHT implementation by replacing the current O(n²) peer lookup algorithm with an efficient O(n log k) heap-based approach, implementing memory-efficient peer selection, and improving error handling with adaptive delays. The optimization significantly improves scalability and reduces resource consumption in large peer-to-peer networks.

Performance Improvements:

  • Algorithm Complexity: O(n²) → O(n log k) using heap-based peer selection
  • Memory Usage: O(n) → O(k) where k is desired peer count
  • Error Handling: Fixed 10ms delays → Adaptive delays (1ms-100ms based on error type)
  • Benchmark Results: Up to 61.9% improvement and 2.7x speedup for large peer sets

Key Changes:

  1. Heap-Based Peer Selection (libp2p/kad_dht/utils.py):

    • Added find_closest_peers_heap() with O(n log k) complexity
    • Added find_closest_peers_streaming() for memory efficiency
    • Maintains identical results to original implementation
  2. Optimized Routing Table (libp2p/kad_dht/routing_table.py):

    • Updated find_local_closest_peers() to use heap-based approach
    • Reduced memory usage for large peer sets
  3. Enhanced Peer Routing (libp2p/kad_dht/peer_routing.py):

    • Added early termination conditions for convergence detection
    • Implemented distance-based early stopping
    • Optimized set operations for queried peer tracking
  4. Adaptive Error Handling (libp2p/tools/adaptive_delays.py):

    • New AdaptiveDelayStrategy class with intelligent error classification
    • Exponential backoff with jitter to prevent thundering herd
    • Updated yamux implementation to use adaptive delays
  5. Comprehensive Testing:

    • Performance validation tests (tests/core/kad_dht/test_performance_optimizations.py)
    • Benchmarking tools (benchmarks/dht_performance_benchmark.py)
    • Demonstrated significant performance gains through benchmarks

Production Impact:

  • Discovery Time: Faster peer discovery in large networks
  • Resource Usage: Lower CPU and memory consumption per node
  • Network Growth: Better support for enterprise-level peer counts (1000+ peers)
  • Error Recovery: Faster and more intelligent retry mechanisms

Backward Compatibility:

  • No changes to public APIs
  • Identical results to original implementation
  • Existing code continues to work without modification
  • Performance improvements are transparent to users

Files Modified:

  • libp2p/kad_dht/utils.py - Heap-based peer selection functions
  • libp2p/kad_dht/routing_table.py - Optimized local peer lookup
  • libp2p/kad_dht/peer_routing.py - Enhanced network lookup with early termination
  • libp2p/tools/adaptive_delays.py - New adaptive delay strategy
  • libp2p/stream_muxer/yamux/yamux.py - Updated error handling
  • libp2p/tools/init.py - Export new utilities
  • tests/core/kad_dht/test_performance_optimizations.py - Performance validation
  • benchmarks/dht_performance_benchmark.py - Benchmarking tools
  • PERFORMANCE_OPTIMIZATION_SUMMARY.md - Comprehensive documentation

@SuchitraSwain
Copy link
Author

Please review this PR @seetadev

@seetadev
Copy link
Contributor

seetadev commented Oct 8, 2025

@SuchitraSwain : Wish to share that @sumanjeet0012 will be sharing key pointers here. We did realize that there are some gaps which need to be addressed. Will ask him to share his feedback here.

@sumanjeet0012
Copy link
Contributor

@SuchitraSwain Kindly rebase the PR on latest main branch and fix the CI CD issues.

@SuchitraSwain SuchitraSwain force-pushed the feature/dht-performance-optimizations branch from e8b14e4 to b6b6f50 Compare October 9, 2025 16:20
@SuchitraSwain
Copy link
Author

@seetadev @sumanjeet0012 Please check

@sumanjeet0012
Copy link
Contributor

@SuchitraSwain The PR still contains commit of flood publishing PR,
Please do rebase the PR on latest main branch.

Description:
This enhancement optimizes the Kademlia DHT implementation by replacing the current O(n²) peer lookup algorithm with an efficient O(n log k) heap-based approach, implementing memory-efficient peer selection, and improving error handling with adaptive delays. The optimization significantly improves scalability and reduces resource consumption in large peer-to-peer networks.

Performance Improvements:
- Algorithm Complexity: O(n²) → O(n log k) using heap-based peer selection
- Memory Usage: O(n) → O(k) where k is desired peer count
- Error Handling: Fixed 10ms delays → Adaptive delays (1ms-100ms based on error type)
- Benchmark Results: Up to 61.9% improvement and 2.7x speedup for large peer sets

Key Changes:
1. Heap-Based Peer Selection (libp2p/kad_dht/utils.py):
   - Added find_closest_peers_heap() with O(n log k) complexity
   - Added find_closest_peers_streaming() for memory efficiency
   - Maintains identical results to original implementation

2. Optimized Routing Table (libp2p/kad_dht/routing_table.py):
   - Updated find_local_closest_peers() to use heap-based approach
   - Reduced memory usage for large peer sets

3. Enhanced Peer Routing (libp2p/kad_dht/peer_routing.py):
   - Added early termination conditions for convergence detection
   - Implemented distance-based early stopping
   - Optimized set operations for queried peer tracking

4. Adaptive Error Handling (libp2p/tools/adaptive_delays.py):
   - New AdaptiveDelayStrategy class with intelligent error classification
   - Exponential backoff with jitter to prevent thundering herd
   - Updated yamux implementation to use adaptive delays

5. Comprehensive Testing:
   - Performance validation tests (tests/core/kad_dht/test_performance_optimizations.py)
   - Benchmarking tools (benchmarks/dht_performance_benchmark.py)
   - Demonstrated significant performance gains through benchmarks

Production Impact:
- Discovery Time: Faster peer discovery in large networks
- Resource Usage: Lower CPU and memory consumption per node
- Network Growth: Better support for enterprise-level peer counts (1000+ peers)
- Error Recovery: Faster and more intelligent retry mechanisms

Backward Compatibility:
- No changes to public APIs
- Identical results to original implementation
- Existing code continues to work without modification
- Performance improvements are transparent to users

Files Modified:
- libp2p/kad_dht/utils.py - Heap-based peer selection functions
- libp2p/kad_dht/routing_table.py - Optimized local peer lookup
- libp2p/kad_dht/peer_routing.py - Enhanced network lookup with early termination
- libp2p/tools/adaptive_delays.py - New adaptive delay strategy
- libp2p/stream_muxer/yamux/yamux.py - Updated error handling
- libp2p/tools/__init__.py - Export new utilities
- tests/core/kad_dht/test_performance_optimizations.py - Performance validation
- benchmarks/dht_performance_benchmark.py - Benchmarking tools
- PERFORMANCE_OPTIMIZATION_SUMMARY.md - Comprehensive documentation

This addresses all performance bottlenecks identified in issue libp2p#942 and provides a solid foundation for scaling libp2p networks to enterprise-level peer counts while maintaining reliability and correctness.
@SuchitraSwain SuchitraSwain force-pushed the feature/dht-performance-optimizations branch from 322a4ee to 374b90f Compare October 9, 2025 17:00
@SuchitraSwain
Copy link
Author

@sumanjeet0012 Check now please

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants