1
1
import Tabs from "@theme/Tabs";
2
2
import TabItem from "@theme/TabItem";
3
3
4
- # Evolutionary Database Design
4
+ # Evolutionary database design
5
5
6
- At Bitwarden we follow
7
- [ Evolutionary Database Design (EDD)] ( https://en.wikipedia.org/wiki/Evolutionary_database_design ) .
8
- EDD describes a process where the database schema is continuously updated while still ensuring
9
- compatibility with older releases by using database transition phases.
6
+ At Bitwarden we follow [ Evolutionary Database Design (EDD)] [ edd-wiki ] . EDD describes a process where
7
+ the database schema is continuously updated while still ensuring compatibility with older releases
8
+ by defining database transition phases.
10
9
11
- In short the Database Schema for the Bitwarden Server ** must** support the previous release of the
12
- server. The database migrations will be performed before the code deployment, and in the event of a
13
- release rollback the database schema will ** not** be updated.
10
+ Bitwarden also needs to support:
11
+
12
+ - **Zero-downtime deployments**: Multiple versions of the application will be
13
+ running concurrently during the deployment window.
14
+ - **Code rollback**: A release with critical defects can be rolled back to the previous
15
+ version.
16
+
17
+ To fulfill these additional requirements the database schema ** must** support the previous release
18
+ of the server.
14
19
15
20
<bitwarden >
16
21
@@ -24,26 +29,76 @@ For background on this decision please see the [Evolutionary Database Design RFD
24
29
25
30
## Design
26
31
27
- ### Nullable
32
+ Database changes fall into two categories: destructive and non-destructive changes
33
+ \[ [ 1] ( ./edd#further-reading ) \] . A destructive change prevents existing functionality from working as
34
+ expected without an accompanying code change. A non-destructive change is the opposite: a database
35
+ change that does not require a code change to allow the application to continue working as
36
+ expected.
37
+
38
+ ### Non-destructive changes
39
+
40
+ Many database changes can be designed in a backwards compatible manner by using a mix of nullable
41
+ fields and default values in the database tables, views, and stored procedures. This ensures that
42
+ the stored procedures can be called without the new columns and allows them to run with both the old
43
+ and new code.
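
As a minimal sketch of a non-destructive change, the Python snippet below uses an in-memory SQLite database as a stand-in for the real server database. The `Customer` table and `FirstName` column come from the example further down this page; the `FName` column and both insert helpers are hypothetical. It only illustrates that adding a nullable column leaves the old write path working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Id INTEGER PRIMARY KEY, FName TEXT NOT NULL)")


def insert_old(conn, fname):
    # Release X code path: it does not know about the new column.
    conn.execute("INSERT INTO Customer (FName) VALUES (?)", (fname,))


def insert_new(conn, fname, first_name):
    # Release X+1 code path: it writes both the old and the new column.
    conn.execute(
        "INSERT INTO Customer (FName, FirstName) VALUES (?, ?)",
        (fname, first_name),
    )


# Non-destructive change: the new column is nullable, so old inserts still succeed.
conn.execute("ALTER TABLE Customer ADD COLUMN FirstName TEXT")

insert_old(conn, "Ada")
insert_new(conn, "Grace", "Grace")
rows = conn.execute("SELECT FName, FirstName FROM Customer ORDER BY Id").fetchall()
```

Had the column been declared `NOT NULL` without a default, `insert_old` would fail, which is what makes that variant a destructive change.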
44
+
45
+ ### Destructive changes
46
+
47
+ Any change that cannot be done in a non-destructive manner is a destructive change. This can be as
48
+ simple as adding a non-nullable column where the value needs to be computed from existing fields, or
49
+ renaming an existing column. To handle destructive changes it's necessary to break them up into
50
+ three phases: _ Start_ , _ Transition_ , and _ End_ as shown in the diagram below.
51
+
52
+ <figure >
53
+
54
+ ![ Refactoring Stages] ( ./transitions.png )
55
+
56
+ <figcaption >Refactoring Phases</figcaption >
57
+
58
+ </figure >
59
+
60
+ It's worth noting that the _ Refactoring Phases_ are usually rolling, and the _ End phase_ of one
61
+ refactor is the _ Transition phase_ of another. The table below details which application releases
62
+ need to be supported during each database phase.
28
63
29
- Database tables, views and stored procedures should almost always use either nullable fields or have
30
- a default value. Since this will allow stored procedures to omit columns, which is a requirement
31
- when running both old and new code.
64
+ | Database Phase | Release X | Release X+1 | Release X+2 |
65
+ | -------------- | --------- | ----------- | ----------- |
66
+ | Start | ✅ | ❌ | ❌ |
67
+ | Transition | ✅ | ✅ | ❌ |
68
+ | End | ❌ | ✅ | ✅ |
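
The table above can be read as a lookup from database phase to supported releases. The sketch below encodes it that way in Python; the `Phase` enum, the `SUPPORTED` mapping, and both helper functions are illustrative names, not part of the Bitwarden codebase.

```python
from enum import Enum


class Phase(Enum):
    START = "Start"
    TRANSITION = "Transition"
    END = "End"


# Release offsets relative to Release X: 0 = X, 1 = X+1, 2 = X+2.
SUPPORTED = {
    Phase.START: {0},
    Phase.TRANSITION: {0, 1},
    Phase.END: {1, 2},
}


def is_supported(phase, release_offset):
    """Return True if the given release can run against the schema in this phase."""
    return release_offset in SUPPORTED[phase]


def can_roll_back_to_x(phase):
    """Rolling back to Release X is only safe while the schema still supports it."""
    return is_supported(phase, 0)
```

Under this reading, a rollback to Release X is possible in the Start and Transition phases but not after the End phase has been reached.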
32
69
33
- ### EDD Process
70
+ ### Migrations
34
71
35
- The EDD breaks up each database migration into three phases. _ Start_ , _ Transition_ and _ End_ .
72
+ The three migrations shown in the diagram above are the _Initial migration_, _Transition
73
+ migration_ and _ Finalization migration_ .
36
74
37
- ![ Refactoring Stages] ( ./stages_refactoring.jpg )
38
- [ https://www.martinfowler.com/articles/evodb.html#TransitionPhase ] ( https://www.martinfowler.com/articles/evodb.html#TransitionPhase )
75
+ #### Initial migration
39
76
40
- This necessitates two different database migrations. The first migration adds new content and is
41
- backwards compatible with the existing code. The second migration removes content and is not
42
- backwards compatible with that same code prior to the first migration.
77
+ The initial migration runs before the code deployment, and its purpose is to add support for
78
+ _ Release X+1_ without breaking support of _ Release X_ . The migration should execute quickly and not
79
+ contain any costly operations to ensure zero downtime.
80
+
81
+ #### Transition migration
82
+
83
+ The transition migration runs sometime during the transition phase. It provides an optional data
84
+ migration for operations that are too slow, put too much load on the database, or are otherwise
85
+ unsuitable for the _Initial migration_.
86
+
87
+ - Compatible with _ Release X_ ** and** _ Release X+1_ application.
88
+ - Only data population migrations may be run at this time, if they are needed.
89
+ - Must be run as a background task during the Transition phase.
90
+ - Operation is batched or otherwise optimized to ensure the database stays responsive.
91
+ - Schema changes are NOT to be run during this phase.
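
The batching requirement above could look roughly like the following. This is a sketch only: it uses Python with an in-memory SQLite database rather than the production database, the `FName` source column and the `backfill_in_batches` helper are hypothetical, and the batch size is arbitrary.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=2):
    """Copy FName into FirstName a few rows at a time; return the number of batches run."""
    batches = 0
    while True:
        cur = conn.execute(
            """
            UPDATE Customer SET FirstName = FName
            WHERE Id IN (
                SELECT Id FROM Customer
                WHERE FirstName IS NULL AND FName IS NOT NULL
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # commit per batch so no single transaction holds locks for long
        if cur.rowcount == 0:
            return batches
        batches += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Id INTEGER PRIMARY KEY, FName TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO Customer (FName) VALUES (?)",
    [(name,) for name in ["a", "b", "c", "d", "e"]],
)

first_run = backfill_in_batches(conn)
second_run = backfill_in_batches(conn)  # rerunning finds nothing left to do
remaining = conn.execute(
    "SELECT COUNT(*) FROM Customer WHERE FirstName IS NULL"
).fetchone()[0]
```

Because the update only touches rows where the new column is still empty, the migration can be rerun safely, which matters if it has to be repeated after a rollback.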
92
+
93
+ #### Finalization migration
94
+
95
+ The finalization migration removes the temporary measures that were needed to retain backwards
96
+ compatibility with _ Release X_ , and the database schema henceforth only supports _ Release X+1_ .
97
+ These migrations are run as part of the deployment of _ Release X+2_ .
43
98
44
99
### Example
45
100
46
- Let’ s look at an example, the rename column refactor is shown in the image below.
101
+ Let's look at an example: the rename column refactor is shown in the image below.
47
102
48
103
![ Rename Column Refactor] ( ./rename-column.gif )
49
104
@@ -73,7 +128,7 @@ actions.
73
128
:::
74
129
75
130
<Tabs >
76
- <TabItem value = " first" label = " First Migration" default >
131
+ <TabItem value = " first" label = " Initial Migration" default >
77
132
78
133
``` sql
79
134
-- Add Column
120
175
```
121
176
122
177
</TabItem >
123
- <TabItem value = " data" label = " Data Migration" >
178
+ <TabItem value = " data" label = " Transition Migration" >
124
179
125
180
``` sql
126
181
UPDATE [dbo].Customer SET
@@ -129,7 +184,7 @@ WHERE FirstName IS NULL
129
184
```
130
185
131
186
</TabItem >
132
- <TabItem value = " second" label = " Second Migration" >
187
+ <TabItem value = " second" label = " Finalization Migration" >
133
188
134
189
``` sql
135
190
-- Remove Column
@@ -173,65 +228,96 @@ END
173
228
</TabItem >
174
229
</Tabs >
175
230
176
- ## Workflow
231
+ ## Deployment orchestration
232
+
233
+ There are some important constraints to the implementation of the process:
234
+
235
+ - Bitwarden Production environments are required to be on at all times
236
+ - Self-host instances must support the same database change process; however, they do not have the
237
+ same always-on application constraint
238
+ - Minimization of manual steps in the process
239
+
240
+ The process to support all of these constraints is a complex one. Below is an image of a state
241
+ machine that will hopefully help visualize the process and what it supports. It assumes that all
242
+ database changes follow the standards that are laid out in [ Migrations] ( ./ ) .
243
+
244
+ ---
245
+
246
+ ![ Bitwarden EDD State Machine] ( ./edd_state_machine.jpg ) \[ Open Image in a new tab for better
247
+ viewing\]
248
+
249
+ ---
177
250
178
- The Bitwarden specific workflow for writing migrations are described below.
251
+ ### Online environments
179
252
180
- ### Developer
253
+ Schema migrations and data migrations are just migrations. The underlying implementation issue is
254
+ orchestrating the runtime constraints on the migration. Eventually, all migrations will end up in
255
+ ` DbScripts ` . However, to orchestrate the running of _ Transition_ and associated _ Finalization_
256
+ migrations, they are kept outside of ` DbScripts ` until the correct timing.
181
257
182
- The development flow is described in [ Migrations] ( ./ ) .
258
+ In environments with always-on applications, _ Transition_ scripts must be run after the new code has
259
+ been rolled out. To execute a full deploy, all new migrations in ` DbScripts ` are run, the new code
260
+ is rolled out, and then all _ Transition_ migrations in the ` DbScripts_transition ` directory are run
261
+ as soon as all of the new code services are online. In the case of a critical failure after the new
262
+ code is rolled out, a Rollback would be conducted (see Rollbacks below). _ Finalization_ migrations
263
+ will not be run until the start of the next deploy when they are moved into ` DbScripts ` .
183
264
184
- ### Devops
265
+ After this deploy, to prep for the next release, all migrations in ` DbScripts_transition ` are moved
266
+ to ` DbScripts ` and then all migrations in ` DbScripts_finalization ` are moved to ` DbScripts ` ,
267
+ conserving their execution order for a clean install. For the current branching strategy, PRs will
268
+ be opened against `master` when `rc` is cut to prep for this release. This PR automation will also
269
+ handle renaming the migration file and updating any reference of ` [dbo_finalization] ` to ` [dbo] ` .
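
To make the promotion step above more concrete, here is a rough Python sketch of what such automation might do. It is not the actual PR automation: the `promote_scripts` helper and the script names are made up, and the file renaming mentioned above is left out because the naming scheme is not described here.

```python
import tempfile
from pathlib import Path


def promote_scripts(repo_root):
    """Move Transition scripts, then Finalization scripts, into DbScripts in name order."""
    target = repo_root / "DbScripts"
    promoted = []
    for source_name in ("DbScripts_transition", "DbScripts_finalization"):
        for script in sorted((repo_root / source_name).glob("*.sql")):
            # Rewrite schema references so the script targets the regular schema.
            sql = script.read_text().replace("[dbo_finalization]", "[dbo]")
            (target / script.name).write_text(sql)
            script.unlink()
            promoted.append(script.name)
    return promoted


# Demo on a throwaway directory tree with hypothetical script names and contents.
root = Path(tempfile.mkdtemp())
for name in ("DbScripts", "DbScripts_transition", "DbScripts_finalization"):
    (root / name).mkdir()
(root / "DbScripts_transition" / "01_Backfill.sql").write_text("-- backfill data")
(root / "DbScripts_finalization" / "02_DropColumn.sql").write_text(
    "ALTER TABLE [dbo_finalization].[Customer] DROP COLUMN FName;"
)

promoted = promote_scripts(root)
final_sql = (root / "DbScripts" / "02_DropColumn.sql").read_text()
```

Processing the transition directory before the finalization directory keeps the order in which the scripts would run on a clean install.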
185
270
186
- #### On ` rc ` cut
271
+ The next deploy will pick up the newly added migrations in ` DbScripts ` and set the previously
272
+ repeatable _ Transition_ migrations to no longer be repeatable, execute the _ Finalization_
273
+ migrations, and then execute any new migrations associated with the code changes that are about to
274
+ go out.
187
275
188
- Create a PR moving the future scripts.
276
+ The state of migrations in the different directories at any one time is saved and versioned in
277
+ the Migrator Utility which supports the phased migration process in both types of environments.
189
278
190
- - ` DbScripts_future ` to ` DbScripts ` , prefix the script with the current date, but retain the
191
- existing date.
192
- - ` dbo_future ` to ` dbo ` .
193
- <bitwarden >
194
- <li >
195
- Create a ticket in Jira with a ` Due Date ` of the release date to ensure future migrations are
196
- merged in and ready to be executed. Set the ticket that created the future migration as a
197
- blocker.
198
- </li >
199
- </bitwarden >
279
+ ### Offline environments
200
280
201
- #### After server release
281
+ The process for offline environments is similar to the always-on ones. However, since they do not
282
+ have the constraint of always being on, the _ Initial_ and _ Transition_ migrations will be run one
283
+ after the other:
202
284
203
- 1 . Run whatever data migration scripts might be needed. (This might need to be batched and executed
204
- until all the data has been migrated)
205
- 2 . After having the server run for a while execute the future migration script to clean up the
206
- database.
285
+ - Stop the Bitwarden stack as done today
286
+ - Start the database
287
+ - Run all new migrations in ` DbScripts ` (both _ Finalization_ migrations from the last deploy and any
288
+ _ Initial_ migrations from the deploy currently going out)
289
+ - Run all _ Transition_ migrations
290
+ - Restart the Bitwarden stack.
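
The difference between the two environment types comes down to step ordering. The Python sketch below lists the steps for each case side by side; the `deploy` function and the step labels are illustrative only and do not correspond to real commands.

```python
def deploy(always_on):
    """Return the ordered deploy steps for an always-on or an offline environment."""
    steps = []
    if not always_on:
        steps.append("stop stack")
        steps.append("start database")
    # Finalization migrations from the last deploy plus new Initial migrations.
    steps.append("run DbScripts")
    if always_on:
        # Old and new code overlap here, so Transition scripts wait for the rollout.
        steps.append("roll out new code")
    steps.append("run Transition migrations")
    if not always_on:
        steps.append("restart stack")
    return steps


online_steps = deploy(always_on=True)
offline_steps = deploy(always_on=False)
```

In the offline case nothing runs between the two migration steps, which is why they can be executed back to back.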
207
291
208
292
## Rollbacks
209
293
210
294
In the event the server release failed and needs to be rolled back, it should be as simple as just
211
295
re-deploying the previous version again. The database will ** stay** in the transition phase until a
212
- hotfix can be released, and the server can be updated.
296
+ patch can be released, and the server can be updated. Once a patch is ready to go out, it is
297
+ deployed and the _Transition_ migrations are rerun to verify that the DB is in the state that it is
298
+ required to be in.
213
299
214
- The goal is to resolve the issue quickly and re-deploy the fixed code to minimize the time the
215
- database stays in the transition phase. Should a feature need to be completely pulled, a new
216
- migration needs to be written to undo the database changes and the future migration will also need
217
- to be updated to work with the database changes. This is generally not recommended since pending
218
- migrations (for other releases) will need to be revisited.
300
+ Should a feature need to be completely pulled, a new migration needs to be written to undo the
301
+ database changes and the future migration will also need to be updated to work with the database
302
+ changes. This is generally not recommended since pending migrations (for other releases) will need
303
+ to be revisited.
219
304
220
305
## Testing
221
306
222
307
Prior to merging a PR please ensure that the database changes run well on the currently released
223
308
version. We currently do not have an automated test suite for this and it’s up to the developers to
224
309
ensure their database changes run correctly against the currently released version.
225
310
226
- ## Further Reading
311
+ ## Further reading
227
312
228
- - [ Evolutionary Database Design] ( https://martinfowler.com/articles/evodb.html ) (Particularly
229
- [ All database changes are database refactorings] ( https://martinfowler.com/articles/evodb.html#AllDatabaseChangesAreMigrations ) )
230
- - [ The Agile Data (AD) Method] ( http://agiledata.org/ ) (Particularly
231
- [ Catalog of Database Refactorings] ( http://agiledata.org/essays/databaseRefactoringCatalog.html ) )
232
- - [ Refactoring Databases: Evolutionary Database] ( https://databaserefactoring.com/ )
233
- - Refactoring Databases: Evolutionary Database Design (Addison-Wesley Signature Series (Fowler))
234
- ISBN-10: 0321774515
313
+ 1 . [ Evolutionary Database Design] ( https://martinfowler.com/articles/evodb.html ) (Particularly
314
+ [ All database changes are database refactorings] ( https://martinfowler.com/articles/evodb.html#AllDatabaseChangesAreMigrations ) )
315
+ 2 . [ The Agile Data (AD) Method] ( http://agiledata.org/ ) (Particularly
316
+ [ Catalog of Database Refactorings] ( http://agiledata.org/essays/databaseRefactoringCatalog.html ) )
317
+ 3 . [ Refactoring Databases: Evolutionary Database] ( https://databaserefactoring.com/ )
318
+ 4 . Refactoring Databases: Evolutionary Database Design (Addison-Wesley Signature Series (Fowler))
319
+ ISBN-10: 0321774515
235
320
321
+ [ edd-wiki ] : https://en.wikipedia.org/wiki/Evolutionary_database_design
236
322
[ edd-rfd] :
237
323
https://bitwarden.atlassian.net/wiki/spaces/PIQ/pages/177701412/Adopt+Evolutionary+database+design