Added benchmarking for pinecone #17

var77 · 2023-09-29T13:27:16Z

Added benchmarking for Pinecone.
Currently this only benchmarks index creation latency (creating the index and upserting data in batches).
core/utils/pinecone_async_index.py is just a wrapper for the Pinecone Index class, so async requests will be supported when querying the index. Reference
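
For readers who want a picture of what "index creation latency" covers here, a minimal sketch follows. It is not the code from this PR: `time_index_creation`, `create_index`, and `upsert_batch` are hypothetical names, and the two callables stand in for whatever the Pinecone client wrapper actually does.

```python
import time


def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def time_index_creation(create_index, upsert_batch, vectors, batch_size=100):
    """Return the seconds spent creating the index and upserting every batch.

    `create_index` and `upsert_batch` are hypothetical callables supplied by
    the caller; they are placeholders for the real client calls.
    """
    start = time.perf_counter()
    create_index()
    for batch in batched(vectors, batch_size):
        upsert_batch(batch)
    return time.perf_counter() - start


# Stub run that records the calls instead of contacting a service.
calls = []
elapsed = time_index_creation(
    create_index=lambda: calls.append("create"),
    upsert_batch=lambda batch: calls.append(len(batch)),
    vectors=[[0.0, 1.0]] * 250,
    batch_size=100,
)
```

With the stubs above, `calls` ends up as one create step followed by batches of 100, 100, and 50 vectors, and `elapsed` covers all of them together.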

dqii · 2023-09-29T17:49:35Z

.env

+SSH_PORT_LANTERN=2222
+PINECONE_API_KEY='YOUR_PINECONE_API_KEY_HERE'


Coming from a JS background, so I'm following some conventions from there. What do you think about:

putting this in .env.local

putting .env.local in .gitignore

adding a check that these two variables are defined when testing Pinecone

I'm worried about accidentally committing these variables.

Yes, I think adding .env to .gitignore and adding a .env.example may be an option. Also, checking these variables before using them will help provide a better user-facing error message.

dqii · 2023-09-29T17:50:39Z

core/utils/cloud_provider.py

+
+def get_cloud_provider(provider_name):
+    if provider_name == Cloud.PINECONE:
+        return Pinecone(os.environ['PINECONE_API_KEY'], os.environ['PINECONE_ENV'])


We could do the "variables exist" check here
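
One possible shape for that check, as a sketch only: `require_env` and `MissingEnvError` are hypothetical names, not taken from this PR, and the variable names in the usage note are the ones read in the diff above.

```python
import os


class MissingEnvError(RuntimeError):
    """Raised when one or more required environment variables are unset."""


def require_env(*names, environ=None):
    """Return the values of `names`, or raise listing every missing one.

    `environ` defaults to os.environ; passing a dict makes the check easy
    to test. Empty strings are treated as missing.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise MissingEnvError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return [environ[name] for name in names]
```

In `get_cloud_provider`, the Pinecone branch could then call `require_env('PINECONE_API_KEY', 'PINECONE_ENV')` and pass the two returned values on, so a missing variable produces a readable message instead of a bare `KeyError`.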

dqii · 2023-09-29T17:52:45Z

core/utils/constants.py

@@ -33,6 +36,7 @@ class Extension(Enum):
    Extension.PGVECTOR_HNSW: {'m': 32, 'ef_construction': 128, 'ef': 10},
    Extension.LANTERN: {'m': 32, 'ef_construction': 128, 'ef': 10},
    Extension.NEON: {'m': 32, 'ef_construction': 128, 'ef': 10},
+    Cloud.PINECONE: { 'name': '', 'metric': 'cosine', 'pods': 1, 'replicas': 1, 'pod_type': 'p2' },


I think all the other indices use L2 by default, and SIFT uses the L2 metric for ground truth.

There is a euclidean distance option, which seems to be the L2 norm, but I think our index uses L2 squared. Will it work as expected?
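
On the L2 versus L2 squared question: the square root is monotonic, so both metrics order neighbors the same way, and only the reported distance values differ. The sketch below illustrates this on random data with the standard library only; it makes no claim about which form Pinecone's euclidean metric returns, and the helper names are made up for the example.

```python
import math
import random


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2(a, b):
    """Euclidean (L2) distance, i.e. the square root of l2_squared."""
    return math.sqrt(l2_squared(a, b))


def rank(query, vectors, dist):
    """Indices of `vectors` sorted from nearest to farthest under `dist`."""
    return sorted(range(len(vectors)), key=lambda i: dist(query, vectors[i]))


random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(100)]
query = [random.random() for _ in range(8)]

# Nearest-neighbor order is the same under either metric.
same_order = rank(query, vectors, l2) == rank(query, vectors, l2_squared)
```

So recall against L2 ground truth should not depend on which of the two is used; anything that compares raw distance values would need to account for the difference.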

Added benchmarking for pinecone

95fa217

dqii reviewed Sep 29, 2023

Add env var check function, change distance to l2

27f3e03
