DOCSP-40227: memory serialization #588

Merged
merged 4 commits into from
Apr 15, 2025
12 changes: 10 additions & 2 deletions source/fundamentals/atlas-vector-search.txt
@@ -38,6 +38,8 @@ To learn more about {+vector-search+}, see the :atlas:`{+vector-search+}
</atlas-vector-search/vector-search-overview/>` guide in the MongoDB Atlas
documentation.

.. _csharp-supported-vector-types:

Supported Vector Embedding Types
--------------------------------

@@ -50,7 +52,6 @@ search and retrieval.
The {+driver-short+} supports vector embeddings of several types. The following
sections describe the supported vector embedding types.


.. _csharp-vector-array-representation:

Array Representations
@@ -71,6 +72,12 @@ The following example shows a class with properties of the preceding types:
:start-after: start-bson-arrays
:end-before: end-bson-arrays

.. tip::

To learn more about using the ``Memory`` and ``ReadOnlyMemory``
types, see the :ref:`csharp-array-serialization` section of the
Serialization guide.

.. _csharp-binary-vector-representation:

Binary Vector Representations
@@ -190,4 +197,5 @@ guide, see the following API Documentation:
- `BinaryVectorFloat32 <{+new-api-root+}/MongoDB.Bson/MongoDB.Bson.BinaryVectorPackedBit.html>`__
- `ToQueryVector() <{+new-api-root+}/MongoDB.Driver/MongoDB.Driver.BinaryVectorDriverExtensions.ToQueryVector.html>`__
- `VectorSearch() <{+new-api-root+}/MongoDB.Driver/MongoDB.Driver.AggregateFluentBase-1.VectorSearch.html>`__
- `Aggregate() <{+new-api-root+}/MongoDB.Driver/MongoDB.Driver.IMongoCollectionExtensions.Aggregate.html>`__
- `Aggregate()
<{+new-api-root+}/MongoDB.Driver/MongoDB.Driver.IMongoCollectionExtensions.Aggregate.html>`__
65 changes: 65 additions & 0 deletions source/fundamentals/serialization.txt
@@ -200,6 +200,71 @@ specified conventions, then passing it to the
var camelCaseConvention = new ConventionPack { new CamelCaseElementNameConvention() };
ConventionRegistry.Register("CamelCaseConvention", camelCaseConvention, t => true);

.. _csharp-array-serialization:

Improve Array Serialization Performance
---------------------------------------

You can improve your application's performance by representing
arrays of primitives as `Memory<T> <https://learn.microsoft.com/en-us/dotnet/api/system.memory-1?view=net-8.0>`__
and `ReadOnlyMemory<T> <https://learn.microsoft.com/en-us/dotnet/api/system.readonlymemory-1?view=net-8.0>`__
structs instead of types such as standard {+language+} arrays or
``BsonArray``. The driver implements fast serialization and
deserialization paths for ``Memory<T>`` and ``ReadOnlyMemory<T>``, which
increase speed and reduce memory usage.
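
As a rough illustration of the types named above, the following sketch uses only the .NET base class library, not the driver. It shows that ``Memory<T>`` wraps an existing array without copying it and that a slice is a view over the same storage. The ``MemoryBasics`` class and its method names are hypothetical and exist only for this example.

```csharp
using System;

public static class MemoryBasics
{
    // Wraps an existing array in a Memory<int> without copying its elements.
    public static Memory<int> Wrap(int[] values) => new Memory<int>(values);

    // Sums the elements through a span, the usual way to read a Memory<T>.
    public static int Sum(ReadOnlyMemory<int> values)
    {
        int total = 0;
        foreach (int value in values.Span)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        int[] backing = { 1, 2, 3, 4, 5 };
        Memory<int> x = Wrap(backing);

        // Memory<int> converts implicitly to ReadOnlyMemory<int>.
        Console.WriteLine(Sum(x)); // 15

        // A slice is a view over the same array, not a copy.
        Memory<int> tail = x.Slice(3);
        tail.Span[0] = 40;
        Console.WriteLine(backing[3]); // 40
    }
}
```

Because the wrapper shares storage with the array, changes made through ``Span`` are visible in the original array.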

.. note::

Truncation and overflow checks are not supported for ``Memory<T>`` or
``ReadOnlyMemory<T>``, but these checks are implemented for standard
arrays.

You can effect these performance improvements by storing the following
primitive types in ``Memory<T>`` or ``ReadOnlyMemory<T>`` objects:

Collaborator:

It should be "structs" instead of "objects".


- ``bool``
- ``sbyte``
- ``byte``
- ``char``
- ``short``
- ``ushort``
- ``int``
- ``uint``
- ``long``
- ``ulong``
- ``float``
- ``double``
- ``decimal``

The following example defines a ``Line`` POCO that contains array fields
modeled by ``Memory`` and ``ReadOnlyMemory`` structs:

.. literalinclude:: /includes/fundamentals/code-examples/MemorySerialization.cs
:start-after: start-line-class
:end-before: end-line-class
:language: csharp
:dedent:

The following document shows how a sample ``Line`` object is
represented in MongoDB:

.. code-block:: json

   {
     "_id": ...,
     "X": [ 1, 2, 3, 4, 5 ],
     "Y": [ 1, 1.409999966621399, 1.7300000190734863, 2, 2.240000009536743 ]
   }
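
One way to check this mapping without a running server is to round-trip an object through a ``BsonDocument``. The following is a minimal sketch, assuming a driver version that includes the ``Memory<T>`` serialization support described in this PR (2.26 or later). The ``LineRoundTrip`` helper is hypothetical, and ``Line`` is repeated here only to keep the example self-contained.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;

public class Line
{
    public ObjectId Id { get; set; }
    public Memory<int> X { get; set; }
    public ReadOnlyMemory<float> Y { get; set; }
}

// Hypothetical helper, not part of the driver.
public static class LineRoundTrip
{
    // Serializes the object to a BsonDocument, then deserializes it again.
    public static Line RoundTrip(Line line)
    {
        BsonDocument document = line.ToBsonDocument();
        return BsonSerializer.Deserialize<Line>(document);
    }

    public static void Main()
    {
        var line = new Line
        {
            Id = ObjectId.GenerateNewId(),
            X = new Memory<int>(new[] { 1, 2, 3, 4, 5 }),
            Y = new ReadOnlyMemory<float>(new[] { 1f, 1.41f, 1.73f, 2f, 2.24f })
        };

        // Print the document form, then confirm the arrays survive the round trip.
        Console.WriteLine(line.ToJson());
        Line copy = RoundTrip(line);
        Console.WriteLine(copy.X.Length);
        Console.WriteLine(copy.Y.Span[1]);
    }
}
```

The ``float`` values are stored as BSON doubles, which is why ``1.41f`` appears as ``1.409999966621399`` in the stored document.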

.. tip:: Model Vectors

:ref:`csharp-atlas-vector-search` involves creating and querying
large numerical arrays. If your application uses
{+vector-search+}, you might benefit from the performance
improvements from using ``Memory`` and ``ReadOnlyMemory`` to model
vector data. To learn more, see :ref:`csharp-supported-vector-types`

Collaborator:

It's probably worth mentioning the more significant optimization of using Binary Vector for vector search as well.
Alternatively, mention that this suggested optimization applies to the array representation of embeddings.

Collaborator Author:

I changed the wording here, but the linked page (Atlas Vector Search) has a lot more info on the types and optimizations that are specific to vectors.

in the {+vector-search+} guide.
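
To make the tip concrete, the following sketch models an embedding as ``ReadOnlyMemory<float>`` and passes a query vector of the same type to ``VectorSearch()``. Everything specific here is an assumption for illustration: the ``Movie`` class, the ``PlotEmbedding`` property, the ``vector_index`` index name, and the candidate and limit values. It also assumes an Atlas cluster with a matching vector search index, so treat it as a starting point rather than a tested recipe.

```csharp
using System;
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical model class; the names are illustrative only.
public class Movie
{
    public ObjectId Id { get; set; }
    public string Title { get; set; }

    // Array representation of the embedding, stored as a BSON array.
    public ReadOnlyMemory<float> PlotEmbedding { get; set; }
}

public static class VectorQuerySketch
{
    // Returns up to 10 movies whose embeddings are closest to the query vector.
    public static List<Movie> FindSimilar(
        IMongoCollection<Movie> collection,
        ReadOnlyMemory<float> queryEmbedding)
    {
        var options = new VectorSearchOptions<Movie>
        {
            IndexName = "vector_index", // assumed index name
            NumberOfCandidates = 150    // assumed candidate count
        };

        return collection.Aggregate()
            .VectorSearch(m => m.PlotEmbedding, new QueryVector(queryEmbedding), 10, options)
            .ToList();
    }
}
```

This applies to the array representation of embeddings. The binary vector types described in the {+vector-search+} guide are a separate representation.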

Additional Information
----------------------

38 changes: 38 additions & 0 deletions source/includes/fundamentals/code-examples/MemorySerialization.cs
@@ -0,0 +1,38 @@
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Conventions;
using MongoDB.Driver;

public class Program
{

Collaborator:

Remove empty line.

    public static void Main(string[] args)
    {
        // Replace with your connection string
        const string uri = "<connection string>";

        var mongoClient = new MongoClient(uri);
        var database = mongoClient.GetDatabase("db");
        var _collection = database.GetCollection<Line>("lines");

        var line = new Line
        {
            X = new Memory<int>(new[] { 1, 2, 3, 4, 5 }),
            Y = new ReadOnlyMemory<float>(new[] { 1f, 1.41f, 1.73f, 2f, 2.24f })
        };

        // Insert the sample document so that the query below can return it
        _collection.InsertOne(line);

        var filter = Builders<Line>.Filter.Empty;

        var result = _collection.Find(filter).FirstOrDefault().ToJson();
        Console.WriteLine(result);
    }

}

// start-line-class
public class Line
{
    public ObjectId Id { get; set; }
    public Memory<int> X { get; set; }
    public ReadOnlyMemory<float> Y { get; set; }
}
// end-line-class
3 changes: 2 additions & 1 deletion source/whats-new.txt
@@ -392,7 +392,8 @@ The 2.26 driver release includes the following new features:
- Enabled use of native ``crypto`` in ``libmongocrypt`` bindings.

- Added support for serialization of ``Memory`` and ``ReadOnlyMemory``
structs.
structs. To learn more about implementing these types, see the
:ref:`csharp-array-serialization` section of the Serialization guide.

- Added support for the GCP Identity Provider when using the
``MONGODB-OIDC`` authentication mechanism. To learn more, see