Winter-Soren
Contributor

What was wrong?

Currently, py-libp2p only provides an in-memory peerstore implementation that loses all peer data (addresses, keys, metadata, protocols) when the process restarts. This limits the resilience and performance of py-libp2p nodes, especially for long-running applications that need to maintain peer information across restarts.

Issue #945

How was it fixed?

Implemented a datastore-agnostic persistent peer storage system following the same architectural pattern as go-libp2p's pstoreds package. The solution provides:

Summary of approach

Core Components:

  1. Datastore Abstraction Layer (libp2p/peer/datastore/): Pluggable interface supporting multiple backends
  2. PersistentPeerStore (libp2p/peer/persistent_peerstore.py): Full async implementation with memory caching
  3. Factory Functions (libp2p/peer/persistent_peerstore_factory.py): Convenient creation methods
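To illustrate how the first two components fit together, here is a minimal sketch of a key-value datastore interface with a cache-fronted address book on top. Everything in it is an assumption made for illustration: the `Datastore`, `MemoryDatastore`, and `CachedAddrBook` names, the method signatures, and the key layout are hypothetical and are not the PR's `IDatastore` or `PersistentPeerStore` API. It is also synchronous for brevity, whereas the PR describes an async implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class Datastore(ABC):
    """Hypothetical minimal key-value interface (not the PR's IDatastore)."""

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent."""

    @abstractmethod
    def put(self, key: bytes, value: bytes) -> None:
        """Store value under key, replacing any previous value."""

    @abstractmethod
    def delete(self, key: bytes) -> None:
        """Remove key if present."""


class MemoryDatastore(Datastore):
    """Dict-backed backend, useful for tests."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)


class CachedAddrBook:
    """Toy address book: reads hit a memory cache, writes go through to the backend."""

    def __init__(self, store: Datastore) -> None:
        self._store = store
        self._cache: Dict[str, List[str]] = {}

    @staticmethod
    def _key(peer_id: str) -> bytes:
        # Hypothetical key layout, chosen only for this sketch.
        return f"/peers/addrs/{peer_id}".encode()

    def addrs(self, peer_id: str) -> List[str]:
        # Populate the cache from the backend on first access.
        if peer_id not in self._cache:
            raw = self._store.get(self._key(peer_id))
            self._cache[peer_id] = raw.decode().split("\n") if raw else []
        return list(self._cache[peer_id])

    def add_addr(self, peer_id: str, addr: str) -> None:
        current = self.addrs(peer_id)
        if addr not in current:
            current.append(addr)
            self._cache[peer_id] = current
            self._store.put(self._key(peer_id), "\n".join(current).encode())
```

Because the address book only talks to the abstract interface, a second `CachedAddrBook` built on the same backend sees the addresses written by the first, which is the behaviour a restart would rely on.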

Supported Backends:

  1. SQLite: Simple file-based storage (default)
  2. LevelDB: High-performance key-value storage
  3. RocksDB: Advanced features with compression and bloom filters
  4. In-Memory: For testing and development
  5. Custom: User-defined implementations via IDatastore interface
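As a rough picture of what a file-based backend involves, the sketch below stores key-value pairs in SQLite using only the standard library. It is not the PR's SQLite datastore: the `SQLiteKV` class, its table schema, and the `roundtrip_across_restart` helper are hypothetical names introduced here, and a real backend would have to match whatever methods `IDatastore` actually declares.

```python
import os
import sqlite3
import tempfile
from typing import List, Optional


class SQLiteKV:
    """Hypothetical SQLite-backed key-value store (illustrative only)."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv "
            "(key BLOB PRIMARY KEY, value BLOB NOT NULL)"
        )
        self._conn.commit()

    def put(self, key: bytes, value: bytes) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key: bytes) -> Optional[bytes]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else bytes(row[0])

    def delete(self, key: bytes) -> None:
        with self._conn:
            self._conn.execute("DELETE FROM kv WHERE key = ?", (key,))

    def keys(self, prefix: bytes = b"") -> List[bytes]:
        # Filter in Python to sidestep LIKE-escaping rules for binary keys.
        rows = self._conn.execute("SELECT key FROM kv ORDER BY key").fetchall()
        return [bytes(k) for (k,) in rows if bytes(k).startswith(prefix)]

    def close(self) -> None:
        self._conn.close()


def roundtrip_across_restart() -> Optional[bytes]:
    """Write with one connection, then read with a fresh one (simulated restart)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "peers.db")
        first = SQLiteKV(path)
        first.put(b"/peers/addrs/peerA", b"/ip4/127.0.0.1/tcp/4001")
        first.close()
        second = SQLiteKV(path)
        try:
            return second.get(b"/peers/addrs/peerA")
        finally:
            second.close()
```

Closing and reopening the database file stands in for a process restart: the value written through the first connection is still there for the second.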

To-Do

  • Clean up commit history
  • Add or update documentation related to these changes
  • Add entry to the release notes


@seetadev
Contributor

@Winter-Soren : Great effort. Reviewing the PR.

Re-ran the CI/CD pipeline today after updating the branch.

Could you please resolve the CI/CD issues?

@seetadev
Contributor

@Winter-Soren: Hi Soham. Thank you for resolving the CI/CD issues.

One CI/CD test is still failing.

We investigated it and documented it at #949.

@yashksaini-coder, @sumanjeet0012 and @acul71 are fixing it. We will soon have all the CI/CD tests passing.

@Winter-Soren: I will re-run the CI/CD pipeline once the issue is fixed, today or in a couple of days.

@yashksaini-coder
Contributor

yashksaini-coder commented Sep 23, 2025

Can I get access to this so I can check it and work on it, @seetadev @Winter-Soren?

@Winter-Soren
Contributor Author

> Can I get access to this so I can check it and work on it, @seetadev @Winter-Soren?

Hey @yashksaini-coder,
You can add my branch Winter-Soren:feat/945-persistent-storage-for-peerstore as a remote and contribute to it!
If you have any questions, feel free to ask, happy to help :))

Winter-Soren marked this pull request as ready for review on September 25, 2025, 15:37
@seetadev
Contributor

seetadev commented Oct 6, 2025

@Winter-Soren : Updated the branch and re-ran the CI/CD pipeline. There seems to be one test issue. Looking forward to collaborating with you and resolving it soon.

@seetadev
Contributor

seetadev commented Oct 6, 2025

@Winter-Soren : The CI/CD issue got resolved. Wonderful.

@pacrob: Hi Paul. We would appreciate your feedback on this PR.

On the same note, I will ask @Winter-Soren and @yashksaini-coder to include a comprehensive test suite and to add a newsfragment.

@pacrob
Member

pacrob commented Oct 12, 2025

I added some questions. In this case, I do think example usage would be important. Tests and a newsfragment are also needed here.

@seetadev
Contributor

Thanks @pacrob for the detailed review and helpful comments 🙏

@Winter-Soren: please take a look at these points and update the PR soon. The switch to Trio (instead of asyncio) is important for consistency with the rest of the library, so let's refactor accordingly. Also, make sure the PeerData fields (last_identified, ttl, and latmap) are persisted and restored correctly so we don't lose data integrity.

Adding a couple of example usages, along with tests and a newsfragment, will round this out nicely and make it easier for others to build on your work.

Once you've made the updates, tag me and @pacrob for a quick re-review. This is very close to merge-ready! 🚀

@Winter-Soren
Contributor Author

Hi @pacrob, thank you for the review. I’ve addressed your points:

1. Trio consistency

  • Replaced asyncio.Lock() with trio.Lock() in all datastore implementations (LevelDB, SQLite, RocksDB, Memory)
  • Updated examples to use trio.run() instead of asyncio.run()
  • Ensures consistency with the py-libp2p codebase

2. trio.from_thread usage

  • Removed excessive trio.from_thread.run_sync() calls
  • Synchronous methods operate on in-memory caches for immediate responses
  • Persistence happens via explicit async calls (await peerstore._load_peer_data())

3. PeerData fields persistence

  • Persisting last_identified, ttl, and latmap
  • Added _get_additional_key() to serialize/deserialize these fields
  • Updated _save_peer_data() and _load_peer_data() to handle all PeerData fields
  • Added _get_additional_key(peer_id) to cleanup operations
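One way to picture the extra-field persistence in point 3 is the sketch below, which serializes the three fields as JSON under a dedicated per-peer key and removes that key during cleanup. The key layout, the JSON encoding, the field types, and the `ExtraFields`, `additional_key`, `save_extra`, `load_extra`, and `delete_extra` names are assumptions for illustration; the PR's `_get_additional_key()` may build keys and encode values differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class ExtraFields:
    """The three PeerData fields named in the review; types are assumed."""

    last_identified: int = 0
    ttl: int = 0
    latmap: float = 0.0


def additional_key(peer_id: str) -> bytes:
    # Hypothetical key layout; the PR's actual scheme is not shown in this thread.
    return f"/peers/additional/{peer_id}".encode()


def save_extra(store: Dict[bytes, bytes], peer_id: str, extra: ExtraFields) -> None:
    """Serialize the extra fields under the peer's additional key."""
    store[additional_key(peer_id)] = json.dumps(asdict(extra)).encode()


def load_extra(store: Dict[bytes, bytes], peer_id: str) -> ExtraFields:
    """Restore the extra fields, falling back to defaults for unknown peers."""
    raw = store.get(additional_key(peer_id))
    if raw is None:
        return ExtraFields()
    return ExtraFields(**json.loads(raw.decode()))


def delete_extra(store: Dict[bytes, bytes], peer_id: str) -> None:
    """Cleanup must also drop the additional key, or stale data lingers."""
    store.pop(additional_key(peer_id), None)
```

Here a plain `dict` plays the role of the datastore. A round trip through `save_extra` and `load_extra` should return an equal `ExtraFields` value, and `delete_extra` should leave no entry behind for the peer.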

4. Examples, tests, and documentation

  • Added examples/persistent_peerstore/persistent_peerstore.py showing persistent peerstore usage
  • Added newsfragments/946.feature.rst describing the feature

@seetadev
Contributor

@Winter-Soren : Great to hear, Soham. Appreciate your efforts.

@pacrob and I will review the changes and share feedback soon. Could you also add a discussion page for this PR with more details on the implementation, an example, and a short screencast?
