INTPYTHON-355 Add transaction support #313

Draft
wants to merge 16 commits into
base: main

aclark4life
Collaborator

@aclark4life aclark4life commented Jun 4, 2025

todo:

  • transaction support in migrations
  • investigate remaining test failures
  • determine why there's a large speed difference between tests on Atlas and on GitHub Actions w/ replica set
  • test on sharded cluster?

@aclark4life aclark4life changed the title from "Check for replica set or sharded cluster" to "INTPYTHON-355 Check for replica set or sharded cluster" Jun 5, 2025
@@ -97,6 +95,22 @@ class DatabaseFeatures(BaseDatabaseFeatures):
        "expressions.tests.ExpressionOperatorTests.test_lefthand_transformed_field_bitwise_or",
    }

    @cached_property
    def supports_transactions(self):
Member

What's the purpose of this property? Is it just for the test suite or do apps rely on it?

Collaborator Author

Not just for testing.

In this case, rather than hard-coding supports_transactions as True or False, we set it dynamically based on the type of MongoDB deployment we are connected to.

Then DatabaseFeatures is passed to DatabaseWrapper IIRC and the runtime behavior of Django changes accordingly.

Based on this: https://github.com/django/django/blob/main/django/db/backends/base/features.py#L419-L431

Collaborator

It depends on what we want to implement. If we want to dynamically enable transactions based on whether or not the database supports it, we could use it internally. Otherwise, we'll have to introduce a separate setting the user sets to indicate whether or not they want to use transactions, and raise an error if it's not supported.

It's also used by Django in a couple places. For example, to determine whether or not to use transactions to speed up tests.
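
The two options described above (follow whatever the server supports, or let the user opt in and raise if the server cannot comply) could be sketched roughly as follows. This is only an illustration: `should_use_transactions`, `user_setting`, and `TransactionsNotSupported` are hypothetical names and are not part of this PR or of Django.

```python
class TransactionsNotSupported(Exception):
    """Hypothetical error for an explicit opt-in the server cannot honor."""


def should_use_transactions(server_supports, user_setting=None):
    """Decide whether the backend should use transactions.

    server_supports: result of the server-side check (bool).
    user_setting: None to follow the server (the dynamic option),
                  True to require transactions, False to disable them.
    """
    if user_setting is None:
        # Dynamic option: enable transactions only when the server supports them.
        return server_supports
    if user_setting and not server_supports:
        # Explicit opt-in on a deployment that cannot run transactions.
        raise TransactionsNotSupported(
            "Transactions were requested but the server does not support them."
        )
    return user_setting
```

With a separate setting, a misconfiguration fails loudly instead of silently running without transactions, at the cost of one more thing for users to configure.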

Member

The issue is that there are certain storage engines where a replica set or sharded cluster does not actually support transactions so this check isn't 100% accurate. That said, those should be rare so I'm fine with the current "hello" based approach. We can always revisit later if it causes an issue.

Collaborator Author

Fixed in c8f8881 (assuming that checking for the WiredTiger storage engine is correct; alternatively, we could check that it is not MMAPv1).

@aclark4life aclark4life changed the title from "INTPYTHON-355 Check for replica set or sharded cluster" to "INTPYTHON-355 Add transaction support" Jun 6, 2025
@aclark4life aclark4life marked this pull request as ready for review June 6, 2025 23:37
@aclark4life

This comment was marked as resolved.

@aclark4life aclark4life requested a review from timgraham June 6, 2025 23:39
@timgraham timgraham force-pushed the INTPYTHON-355 branch 2 times, most recently from e68f8c9 to 4a8209a Compare June 7, 2025 14:37
@timgraham timgraham force-pushed the INTPYTHON-355 branch 3 times, most recently from 8205175 to 1c784eb Compare June 8, 2025 02:08
@@ -140,6 +142,10 @@ def _isnull_operator(a, b):
    ops_class = DatabaseOperations
    validation_class = DatabaseValidation

    def __init__(self, settings_dict, alias=DEFAULT_DB_ALIAS):
        super().__init__(settings_dict, alias=alias)
        self.session = None
Member

What about thread safety here?

For example:

# Thread 1
with atomic():
    ....
# Thread 2
with atomic():
    ....

Won't the session property be mistakenly shared between threads?

Collaborator Author

Member

Can you explain?

Collaborator Author

I mean if the session property were to be mistakenly shared between threads that would be a Django bug … there's nothing we can do within the framework they provide (_commit, _rollback, etc) to manage threads as far as I know.

Collaborator

In Django, database connections are thread-local, so the use of atomic() in two separate threads will use two separate connections (and thus two separate sessions). It's no different from any other state stored on DatabaseWrapper.

[My understanding of thread locals, etc. isn't 100% rock solid, particularly since pymongo operates a bit differently than the PEP 249-conforming drivers for Django's other database backends, but I think what I've stated is correct. Let me know if you have a particular concern.]

Member

Thanks. Since each thread gets its own instance of our DatabaseWrapper, I agree this is not an issue. I have a feeling this is neither the first nor the last time I will ask this question, so bear with me; Django uses a different model than I'm used to.
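
The thread-local behavior discussed in this thread can be illustrated with a small standard-library sketch. It does not use Django's actual connection handler: `FakeWrapper` and `get_connection` are hypothetical stand-ins that only mimic the idea of one wrapper object (and therefore one `session` attribute) per thread.

```python
import threading


class FakeWrapper:
    """Stand-in for a DatabaseWrapper that stores per-connection session state."""

    def __init__(self):
        self.session = None


# Each thread sees its own attributes on this object.
_local = threading.local()


def get_connection():
    # Lazily create one wrapper per thread on first access.
    if not hasattr(_local, "wrapper"):
        _local.wrapper = FakeWrapper()
    return _local.wrapper


def worker(name, results):
    conn = get_connection()
    conn.session = f"session-for-{name}"
    # Keep a reference so the wrappers can be compared after the threads exit.
    results[name] = conn


results = {}
threads = [threading.Thread(target=worker, args=(n, results)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, `results["t1"]` and `results["t2"]` are distinct objects, each holding the session value set by its own thread, which is the property the answer above relies on.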

@aclark4life aclark4life requested a review from Jibola June 9, 2025 17:44
@timgraham timgraham marked this pull request as draft June 9, 2025 19:28
@aclark4life
Collaborator Author

investigate remaining test failures

Referring to the tests defined in _django_test_expected_failures_transactions, or to something else?

        is_sharded_cluster = hello_response.get("msg") == "isdbgrid"
        if is_replica_set or is_sharded_cluster:
            engine = client.command("serverStatus").get("storageEngine", {})
            return engine.get("name") == "wiredTiger"
Collaborator Author

I find this less readable than the logical grouping, but not enough to protest more than adding this comment!

Collaborator

I nested it to avoid the call to command("serverStatus") when unnecessary, since Shane mentioned it's expensive.
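
For reference, the nested form can be written as a standalone function that makes the lazy serverStatus call easy to see. This is a sketch, not the code in the PR: the `run_command` parameter is a hypothetical injection point (for example, a bound `client.command` method), and the `is_replica_set` test via the `setName` field of the hello response is an assumption, since the hunk above does not show how that variable is computed.

```python
def supports_transactions(hello_response, run_command):
    """Return True if the deployment appears able to run transactions.

    hello_response: dict returned by the "hello" command.
    run_command: callable that takes a command name and returns its response
                 dict (hypothetical parameter, used here to keep the sketch
                 independent of a live server).
    """
    # Assumption: replica set members report their set name in the hello response.
    is_replica_set = "setName" in hello_response
    # A mongos router in front of a sharded cluster reports msg == "isdbgrid".
    is_sharded_cluster = hello_response.get("msg") == "isdbgrid"
    if is_replica_set or is_sharded_cluster:
        # Only pay for serverStatus when the topology could support transactions.
        engine = run_command("serverStatus").get("storageEngine", {})
        return engine.get("name") == "wiredTiger"
    return False


# Usage with a fake command runner that records which commands were issued.
calls = []


def fake_command(name):
    calls.append(name)
    return {"storageEngine": {"name": "wiredTiger"}}


standalone = supports_transactions({"isWritablePrimary": True}, fake_command)
replica = supports_transactions({"setName": "rs0"}, fake_command)
```

In this usage, the standalone check returns without issuing any command, and only the replica set check triggers a single serverStatus call.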

3 participants