
Conversation

@taddes (Collaborator) commented Nov 13, 2025

Description

Introduces the logic to lock the database for reads and writes in the Postgres implementation.

Will merge in and rebase on the changes from #1896.

Issue(s)

Closes STOR-355.

@taddes taddes self-assigned this Nov 13, 2025
@taddes taddes marked this pull request as ready for review November 14, 2025 01:37
@taddes taddes changed the title from "feat: postgres read and write locks" to "[WIP] feat: postgres read and write locks" on Nov 14, 2025
```diff
 | `fxa_kid` | `TEXT` | Key identifier; part of the sync crypto context. PK (part 2) |
 | `collection_id` | `INTEGER` | Maps to a named collection. PK (part 3) |
-| `modified` | `TIMESTAMP` | Last modification time (server-assigned, updated on writes) |
+| `modified` | `BIGINT` | Last modification time (server-assigned, updated on writes) |
```
Member

Is this too bothersome to support as a TIMESTAMP?

Collaborator Author

Not necessarily; I may opt for a raw query over the ORM methods, as we've discussed, given those can be a pain with the chained methods. Do we prefer this to be a timestamp? I know it's a TIMESTAMP in Spanner and a BIGINT in MySQL.

@taddes (Collaborator Author) commented Nov 18, 2025

For now, @pjenvey, I say let's keep it consistent as an integer, given the integrated testing for MySQL and Postgres, much like in Tokenserver. We have a ticket on the books to standardize all datetime ops on chrono, which also covers moving any ops that should use proper datetime types off of large integers. That way we can update MySQL and Postgres at the same time. Thoughts?

Collaborator

> given the integrated testing for MySQL and Postgres

Sorry I don't have enough context to understand this. Can you link to a test as an example? Thanks!

Collaborator Author

A good example is the work you did for the Postgres tests that ran in CI for Tokenserver. The reason they were general purpose and could work for both MySQL and Postgres was that the interfaces were quite similar. My thought here is that we try and keep true-ish to MySQL and as we near the end of integrating the tests, look at making adjustments if need be. Changing a data type from an i64 to a timestamp, for instance, results in pretty different logic to check it. This way we're not duplicating things before we need to.

Collaborator

I think I understand your explanation, @taddes, thanks. But I don't agree with, or maybe simply don't understand, why the existing tests should stop us from using the correct data type(s).

> My thought here is that we try and keep true-ish to MySQL and as we near the end of integrating the tests, look at making adjustments if need be.

How do we decide if adjustments are needed?

Member

The tests are against the `Db`/`DbPool` interfaces, whereas the column type here is an internal impl detail. By the time our `Box<dyn Db>` returns e.g. a `results::GetBso`, it would have converted whatever internal representation of `modified` (whether BIGINT or TIMESTAMP) to a `SyncTimestamp`.
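To make that boundary concrete, here is a minimal sketch (the local `SyncTimestamp`/`GetBso` definitions and constructor names are stand-ins for illustration, not the project's actual types) of how either column representation can be normalized before it leaves `Box<dyn Db>`:

```rust
// Sketch: stand-in types to show the conversion point; the real SyncTimestamp
// and results::GetBso live in the syncstorage crates.
use chrono::NaiveDateTime;

/// Milliseconds-since-epoch value exposed through the Db/DbPool interfaces.
pub struct SyncTimestamp(i64);

impl SyncTimestamp {
    /// From a BIGINT column storing epoch milliseconds (MySQL-style).
    pub fn from_millis(ms: i64) -> Self {
        SyncTimestamp(ms)
    }

    /// From a TIMESTAMP column read back as chrono::NaiveDateTime (Postgres-style).
    pub fn from_datetime(dt: NaiveDateTime) -> Self {
        SyncTimestamp(dt.and_utc().timestamp_millis())
    }
}

/// Whatever the backend stores, callers of `Box<dyn Db>` only ever see this.
pub struct GetBso {
    pub modified: SyncTimestamp,
    pub payload: String,
}
```

Either backend's row type converts into the same result type inside its Db impl, so tests written against the interface never observe the column type.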

```diff
 -- user_collections table
 CREATE TABLE user_collections (
-    fxa_uid UUID NOT NULL,
+    fxa_uid BIGINT NOT NULL,
```
Collaborator

I don't understand this change. Should the column also be renamed? IIRC, the last time the team chatted about this type, we wanted it to be TEXT.

Member

Agreed, but I'm killing fxa_uid/kid shortly, so it doesn't matter.

Collaborator Author

It'll be in another PR when we change the whole field. Also, this PR is still WIP.

```rust
///
/// In theory it would be possible to use serializable transactions rather
/// than explicit locking, but our ops team have expressed concerns about
/// the efficiency of that approach at scale.
```
Collaborator

Cool. Are there written notes on that somewhere?

Collaborator Author

I'll look around regarding the ops team notes; the first portion is what I took from Diesel's internal docs on these lock methods.
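For context, a rough sketch of the Diesel lock methods being referred to; the table and column names below are made up for illustration, and the real code may go through diesel-async and/or raw SQL instead:

```rust
// Sketch: illustrative schema and queries, not the actual syncstorage-postgres code.
use diesel::prelude::*;

diesel::table! {
    user_collections (fxa_uid, collection_id) {
        fxa_uid -> Text,
        collection_id -> Integer,
        modified -> BigInt,
    }
}

/// lock_for_write-style query: row locks are held until the transaction ends.
fn lock_for_write(conn: &mut PgConnection, uid: &str, coll: i32) -> QueryResult<Vec<i64>> {
    use user_collections::dsl;
    dsl::user_collections
        .select(dsl::modified)
        .filter(dsl::fxa_uid.eq(uid))
        .filter(dsl::collection_id.eq(coll))
        .for_update() // emits `SELECT ... FOR UPDATE`
        .load(conn)
}

/// lock_for_read-style query: a shared lock that doesn't block other readers.
fn lock_for_read(conn: &mut PgConnection, uid: &str, coll: i32) -> QueryResult<Vec<i64>> {
    use user_collections::dsl;
    dsl::user_collections
        .select(dsl::modified)
        .filter(dsl::fxa_uid.eq(uid))
        .filter(dsl::collection_id.eq(coll))
        .for_share() // emits `SELECT ... FOR SHARE`
        .load(conn)
}
```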

@taddes taddes changed the title from "[WIP] feat: postgres read and write locks" to "feat: postgres read and write locks" on Nov 18, 2025
@taddes taddes requested a review from pjenvey November 18, 2025 17:42
@taddes taddes force-pushed the feat/postgres-read-write-locks-STOR-335 branch from d3b7ac4 to e90a84c on November 18, 2025 22:45
```rust
self.session
    .coll_locks
    .insert((user_id as u32, collection_id), CollectionLock::Read);
Ok(())
```
Collaborator

This function is identical to the one in `syncstorage-mysql/src/db/db_impl.rs`. Do we need two copies of this function?

(The MySQL copy also has two questions in the comments I'm very interested in.)

Collaborator Author

Yes! I am curious about those comments as well. It's likely we can move said function into a shared crate at some point if both of them use it.

Member

There are some small opportunities to share more common code between the different backends, but unfortunately we can't share general diesel ORM usage (utilizing diesel-async) between different backends easily (if at all), at least at the moment.

Basically because diesel's very strict: its base API (and its more internal associated types/trait bounds that come along with it) is typed to specific db backends. This has the benefit of enforcing specific backend dialect API usage at compile time, but with the downside of making it difficult to write code against its generic traits like diesel::Connection -- it just wasn't designed for such usage.

However, plain diesel added a kind of layer on top of its core called MultiConnection that enables this kind of usage, but unfortunately it's not yet supported by diesel-async.

The original plan was to prefer raw SQL queries for most cases anyway, moving away from the MySQL ORM-styled code, like tokenserver does. Part of building Postgres "from scratch" is that we're able to do a sort of audit of the non-Spanner impls, as the MySQL version was never fully finalized (granted, self-hosters seem to have had success with it).

I'm not opposed to the ORM usage though, if it's easy enough to use -- in fact I intended raw SQL for most cases, but not necessarily all -- and I definitely prefer the ORM for at least one case: the get_bsos/_ids call, where we dynamically build a query, is much easier/cleaner/less bug-ridden with an ORM building it for us vs manually constructing it.
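For the dynamic-query case, a small sketch of what the ORM buys us (the `bsos` schema below is invented for illustration, not the real table): filters are attached only when the corresponding request parameter is present, via `into_boxed()`.

```rust
// Sketch: an invented bsos table to show conditional query building with diesel.
use diesel::prelude::*;

diesel::table! {
    bsos (id) {
        id -> Text,
        collection_id -> Integer,
        modified -> BigInt,
        sortindex -> Nullable<Integer>,
    }
}

/// A get_bsos/_ids-style query where `newer_than` and `limit` are optional.
fn get_bso_ids(
    conn: &mut PgConnection,
    collection_id: i32,
    newer_than: Option<i64>,
    limit: Option<i64>,
) -> QueryResult<Vec<String>> {
    use bsos::dsl;
    // Boxing erases the concrete query type so branches can add filters freely.
    let mut query = dsl::bsos
        .select(dsl::id)
        .filter(dsl::collection_id.eq(collection_id))
        .into_boxed();
    if let Some(ts) = newer_than {
        query = query.filter(dsl::modified.gt(ts));
    }
    if let Some(n) = limit {
        query = query.limit(n);
    }
    query.load(conn)
}
```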

@pjenvey (Member) commented Nov 20, 2025

The second question I've covered here: #1891 (comment)

The first question sounds like more of a TODO/bug: we shouldn't be casting a u64 to a u32 here (or in MySQL)! Let's change the type in the coll_ maps to match legacy_id's. We could even put the entire UserIdentifier in there instead, as long as we're sure it eq/hashes correctly for such usage.
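A rough sketch of the keyed-by-`UserIdentifier` idea (the field names on `UserIdentifier` here are assumptions for illustration, not the real type):

```rust
// Sketch: an assumed UserIdentifier shape, just to show the derive-based eq/hash.
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct UserIdentifier {
    legacy_id: u64,
    fxa_uid: String,
    fxa_kid: String,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CollectionLock {
    Read,
    Write,
}

#[derive(Default)]
struct SessionState {
    // Keyed on the full identifier: no lossy `user_id as u32` cast required.
    coll_locks: HashMap<(UserIdentifier, i32), CollectionLock>,
}

fn main() {
    let mut session = SessionState::default();
    let user = UserIdentifier {
        legacy_id: 42,
        fxa_uid: "uid".into(),
        fxa_kid: "kid".into(),
    };
    session
        .coll_locks
        .insert((user.clone(), 7), CollectionLock::Write);
    assert_eq!(
        session.coll_locks.get(&(user, 7)),
        Some(&CollectionLock::Write)
    );
}
```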

```rust
self.session
    .coll_locks
    .insert((user_id as u32, collection_id), CollectionLock::Write);
Ok(())
```
Collaborator

Same with this function. It's functionally identical to the MySQL one. (There's a to_string here and a to_owned there for the error message so it's not syntactically 100% identical.)

Collaborator Author

We can certainly swap it, since those to_string and to_owned do much the same thing.

Collaborator

(I'm pretty sure I responded but I guess the packets leaked into the ocean or something.)

The to_string vs to_owned was just an aside. My point was the Postgres and MySQL versions of the fn are essentially identical.

@chenba (Collaborator) commented Nov 19, 2025

Related to this PR but more of a general question. When are entries in those process-local hashmaps of locks updated? I see inserts but not removals. I feel like I'm missing something very fundamental on db level locks vs what's in those hashmaps.

@chenba (Collaborator) commented Nov 19, 2025

> When are entries in those process-local hashmaps of locks

Oh, the hashmaps are per-request, it looks like. I think I understand now.

@pjenvey (Member) commented Nov 20, 2025

> Related to this PR but more of a general question. When are entries in those process-local hashmaps of locks updated? I see inserts but not removals. I feel like I'm missing something very fundamental on db level locks vs what's in those hashmaps.

These were lazily carried over from the Python version: they're never removed, and only really used as a sanity check in case the lock methods are called more than once (which never happens).

This could probably be improved/simplified -- we could remove them upon commit/rollback, but if we're doing that we might consider making commit/rollback consume self (if our usage of those methods allows that).
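A minimal sketch of the consume-self idea (types here are illustrative, not the actual Postgres Db types): once `commit`/`rollback` take `self` by value, the per-request lock map is dropped with the session, and any later use of the Db handle is a compile error.

```rust
// Sketch: invented types showing commit/rollback consuming the Db handle.
use std::collections::HashMap;

#[derive(Clone, Copy)]
enum CollectionLock {
    Read,
    Write,
}

#[derive(Default)]
struct Session {
    coll_locks: HashMap<(u64, i32), CollectionLock>,
}

#[derive(Debug)]
struct DbError;

struct PgDb {
    session: Session,
}

impl PgDb {
    /// Consuming `self` drops the session (and its coll_locks) here, so no
    /// explicit removal is needed and post-commit use won't compile.
    fn commit(self) -> Result<(), DbError> {
        // ... issue COMMIT on the underlying connection ...
        Ok(())
    }

    fn rollback(self) -> Result<(), DbError> {
        // ... issue ROLLBACK ...
        Ok(())
    }
}
```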

@taddes taddes force-pushed the feat/postgres-read-write-locks-STOR-335 branch from e90a84c to 211641f on November 20, 2025 22:10
@taddes taddes force-pushed the feat/postgres-read-write-locks-STOR-335 branch from 211641f to 0e21ca5 on November 20, 2025 23:19