Skip to content

Improve database management documentation #593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 12 commits into
base: main
Choose a base branch
from
Draft

Improve database management documentation #593

wants to merge 12 commits into from

Conversation

withinfocus
Copy link
Contributor

@withinfocus withinfocus commented May 21, 2025

🎟️ Tracking

https://bitwarden.slack.com/archives/C065AUDER62/p1747770311208779

📔 Objective

Improves documentation given a first pass at #165 as well as some corrections started at #592. We can organize and separate EF and MSSQL better as well as include some additional tips and gotchas regularly-asked in chat. Additionally, "manual" migrations are not currently in place and may never be desired at this point so remove that content.

⏰ Reminders before review

  • Contributor guidelines followed
  • All formatters and local linters executed and passed
  • Written new unit and / or integration tests where applicable
  • Protected functional changes with optionality (feature flags)
  • Used internationalization (i18n) for all UI strings
  • CI builds passed
  • Communicated to DevOps any deployment requirements
  • Updated any necessary documentation (Confluence, contributing docs) or informed the documentation
    team

🦮 Reviewer guidelines

  • 👍 (:+1:) or similar for great changes
  • 📝 (:memo:) or ℹ️ (:information_source:) for notes or general info
  • ❓ (:question:) for questions
  • 🤔 (:thinking:) or 💭 (:thought_balloon:) for more open inquiry that's not quite a confirmed
    issue and could potentially benefit from discussion
  • 🎨 (:art:) for suggestions / improvements
  • ❌ (:x:) or ⚠️ (:warning:) for more significant problems or concerns needing attention
  • 🌱 (:seedling:) or ♻️ (:recycle:) for future improvements or indications of technical debt
  • ⛏ (:pick:) for minor or nitpick changes

Copy link

cloudflare-workers-and-pages bot commented May 21, 2025

Deploying contributing-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: abf861d
Status: ✅  Deploy successful!
Preview URL: https://f942c397.contributing-docs.pages.dev
Branch Preview URL: https://db-docs.contributing-docs.pages.dev

View logs

Copy link

github-actions bot commented May 21, 2025

Logo
Checkmarx One – Scan Summary & Detailsd385e846-8dd4-4649-b600-20123b4ca53f

Great job, no security vulnerabilities found in this Pull Request

@withinfocus withinfocus marked this pull request as ready for review May 21, 2025 22:07
@withinfocus withinfocus requested a review from a team as a code owner May 21, 2025 22:07
before falling back to the old location
- continued updating of the old data columns since in case of a rollback no data should be lost

### Non-backwards compatible migration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I think we should use affirmative language here. The 'non-backwards compatible migration' is talking about a 'breaking change'.

Suggested change
### Non-backwards compatible migration
### Breaking change migration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, but let's make sure to update everywhere, not just this heading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support the current term. The point of this documentation is that your change can't go backwards. It may be "breaking" to some, but the intention of the documentation is about direction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use affirmative language to describe that. The EDD page uses "finalization migration" as its term for this phase.

Suggested change
### Non-backwards compatible migration
### Finalization migration


## SQL database project

The separate database definitions in `src/Sql/.../dbo` serve as a "master" reference for the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ No need for scare quotes here.

Suggested change
The separate database definitions in `src/Sql/.../dbo` serve as a "master" reference for the
The separate database definitions in `src/Sql/.../dbo` serve as the source of truth for the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an analogy to a master copy of something, like an album recording. This content just moved its position by the way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know. "The source of truth" is both more inclusive than "master reference", and more accurate as a description.

required. With the use of beta testing, partial roll-outs, [feature flags](../feature-flags.md),
etc. the often-chosen path is to spread a change across several major releases with a calculated
future state that can perform a "cleanup" migration that is backwards-compatible but still
represents an overall-_incompatible_ change beyond the boundaries of what we need for individual
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Our communication guidelines indicate we should avoid italics and prefer bold because international fonts don't render italics well.

Suggested change
represents an overall-_incompatible_ change beyond the boundaries of what we need for individual
represents an overall-**incompatible** change beyond the boundaries of what we need for individual

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link me to that? I honestly don't agree and there's a difference between bold and italic in intention. We use italics all the time on this site.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 59 to 65
Please take care to ensure:

- any existing stored procedure accepts the same input parameters and that new parameters have
nullable defaults
- when a column is renamed the existing stored procedures first check (coalesce) the new location
before falling back to the old location
- continued updating of the old data columns since in case of a rollback no data should be lost
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 Love that you're talking about specific concerns to keep aware of!

Comment on lines 47 to 51
required. With the use of beta testing, partial roll-outs, [feature flags](../feature-flags.md),
etc. the often-chosen path is to spread a change across several major releases with a calculated
future state that can perform a "cleanup" migration that is backwards-compatible but still
represents an overall-_incompatible_ change beyond the boundaries of what we need for individual
release safety.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow what you're going for with this sentence. If a migration is performed that leaves the DB in an incompatible state, it is by definition a finalization migration for the initial transitional migration. If the DB is always backwards compatible with the N-1th state, you can't build up migrations such that it's incompatible with the N-Mth.

Are you saying that our process is a cleanup so long after transitions that they are effectively unrelated and the finalization paradigm isn't apt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not that they are unrelated -- they definitely are -- but, yes, the finalization need in this process is unnecessary. This is what we are actually doing 99% of the time here.

before falling back to the old location
- continued updating of the old data columns since in case of a rollback no data should be lost

### Non-backwards compatible migration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, but let's make sure to update everywhere, not just this heading

Comment on lines +78 to +79
compatible with the changes to `DbScripts`. In order to achieve this we only keep a single
migration, which executes all backwards incompatible schema changes.
Copy link
Member

@eliykat eliykat May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that DbScripts_finalization only ever contains 1 file that everyone appends to? I'm not necessarily opposed to the requirement but I don't recall seeing it before.

Is the finalization script added at the same time as the backwards-compatible change? So there is only 1 finalization script at a time, it is always moved to DbScripts after the next rc cut date? What if you need a longer timeframe on your finalization?

(not being overly critical - just trying to understand this process given we've never really done it)

Copy link
Member

@Hinton Hinton May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It's always been mentioned.

You break up a pure DB non backwards compatible change into two steps. An example would be the column rename example in https://contributing.bitwarden.com/contributing/database-migrations/edd#migrations-1.

Initial migration would:

  • Add a new column
  • Add logic that keeps old and new in sync.

Transition migration would:

  • Migrate all data from old to new column.

Finalization migration (run at next deploy):

  • Removes the old now unused column.
  • Updates sprocs to remove the fallback logic.

Copy link
Member

@Hinton Hinton May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do note we technically never have to remove the old column but not doing so adds database debt since we have

  • Unnecessary data in our database.
  • More complex sprocs.

This is why we'd like to automate it since relying on human elements here is dangerous since I could easily be OOO for a release or two or forget about cleaning it up.

Copy link
Member

@audreyality audreyality May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finalization migration (run at next deploy):

This actually needs to be run 2 deploys away from the initial migration to maintain our self-host server guarantees. We can apply it earlier on our infrastructure.

Do note we technically never have to remove the old column...

Unless doing so would cross the the 8kb maximum record length or 1,024 column limit. I'd expect something will cross the 8kb maximum record length if we never remove columns. It'd be much harder to hit the column limit.

Copy link
Member

@eliykat eliykat May 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could've been clearer in my question - my previous understanding was that you might have several scripts in DbScripts_finalization as different teams add different finalization scripts. But this way, you have 1 finalization script at a time, which all teams append to. It's been a while since I read about this process so just trying to get my head around it again.

That said, it is unusual that multiple teams would append to the same file so this might be worth stating more explicitly, e.g.

Add your migration to the file in DbScripts/finalization, or create the file if it doesn't exist. Name it YYYY-0M-FinalizationMigration.sql [what should this date be?]. There should only be 1 finalization file at a time.

Rather than "Write a new migration, put it here, name it this" but then saying "we only keep a single migration".

Or if I've misunderstood please let me know.

Copy link
Member

@Hinton Hinton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We absolutely do need dbo_finalization for any automation to work. dbo must at all time accurately represent the state that running all DbScripts results in. The workflow relies on an automation running which moves content from _finalization back just having DbScripts will result in the automation failing due to inconsitencies between dbo and DbScripts.

I've created #594 which rebases this on ps/migrations to make it more clear how they differ.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎨 One of these files should note how our database version policy intersects with our self-hosted server support policy. Linking from these docs to the help site improves discoverability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it though? My concern is that changes the scope of all this -- everything here is about a single release, forward or backward.

Copy link
Member

@audreyality audreyality May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original scope of this document addressed the support policy by ensuring finalization can run at any future point after release. If you remove finalization, then you must talk about how we guarantee our self-hosted server maintains its N-2 release compatibility.

Either way, the pages should at minimum be cross-linked.

@withinfocus withinfocus marked this pull request as draft May 22, 2025 12:48
@withinfocus
Copy link
Contributor Author

I addressed a few comments on the smaller things and left responses to the rest. I am gonna move over to server and finish the finalization work because there's been a strong reaction to keeping it (and not documenting how things are today, albeit incomplete to reintroduce later), and if that works out can restore the couple sentences about it as well as further bolster the "why".

@withinfocus
Copy link
Contributor Author

Relates to bitwarden/server#5855 where I'm making the remaining EDD changes.

@withinfocus
Copy link
Contributor Author

I have no more changes planned for this and open comments are either needing responses or something I am not currently comfortable changing further.

audreyality added a commit that referenced this pull request May 23, 2025
* confirm passwords match between .env and secrets.json
* add tip to test database connectivity
* move confirmation steps into tabs
* align headers with #593; add link to migrations page
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants