Skip to content

Closes #112, adds VectorOptionsAttribute, #125

Draft
a-random-steve wants to merge 3 commits intodatastax:mainfrom
a-random-steve:mr/112
Draft

Closes #112, adds VectorOptionsAttribute, #125
a-random-steve wants to merge 3 commits intodatastax:mainfrom
a-random-steve:mr/112

Conversation

@a-random-steve
Copy link
Copy Markdown
Contributor

@a-random-steve a-random-steve commented Apr 2, 2026

Also moves LexicalOptionsAttribute to class level (not property)

Fixes #112

Comment on lines +38 to +44
public string SourceModel { get; set; }

/// <summary>The name of the embedding service provider.</summary>
public string Provider { get; set; }

/// <summary>The model name to use for embedding generation.</summary>
public string ModelName { get; set; }
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are ModelName and SourceModel redundant? e.g. looking at python, I only see model_name: https://docs.datastax.com/en/astra-api-docs/_attachments/python-client/astrapy/info.html#astrapy.info.CollectionVectorOptions

Copy link
Copy Markdown
Collaborator

@toptobes toptobes Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Collaborator

@sl-at-ibm sl-at-ibm Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution, these two should not be at the same level -- actually, assuming the shared goal of isomorphism with the Data API payloads, modelName should be within a "service" subobject together with provider (and authentication and/or parameters when they apply).

Here is a real payload for a createCollection with all the relevant stuff, for reference:

{
  "createCollection": {
    "name": "c",
    "options": {
      "vector": {
        "dimension": 1024,
        "metric": "dot_product",
        "service": {
          "provider": "nvidia",
          "modelName": "nvidia/nv-embedqa-e5-v5"
        },
        "sourceModel": "bert"
      }
    }
  }
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sl-at-ibm Agreed, they are at different levels.

public class VectorizeOptionsAttribute : Attribute
{
/// <summary>The number of dimensions for the vector.</summary>
public int Dimension { get; set; } = -1;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for Dimension and Metric to be here? (as opposed to being only 1-level-higher, in VectorOptionsAttribute where they belong)
On first sight, these should be removed from this class (and some of the definition-filling work in CollectionDefinition slightly revised accordingly).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sl-at-ibm sine the VectorOptionsAttribute accepts SourceModel, which Vectorize doesn't, I don't see a benefit in having VectorizeOptionsAttribute subclass VectorOptionsAttribute. Also, please note that these are just possible helper attributes for consistency, the user can specify the collection creation details manually.

@toptobes toptobes marked this pull request as draft April 8, 2026 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

For Collections, add VectorOptionsAttribute and VectorizeOptionsAttribute

5 participants