You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Towards MaterializeInc/database-issues#9191
Today, we have no good way to split the work of compaction into smaller
parts. This presents an issue as datasets and clusters continue to grow
in size. If a compaction takes a significant amount of time there is a
risk that the process running the compaction might not live long enough
(for whatever reason: failure, shutdown, schedule, etc).
This PR aims to improve the situation when dealing with compacting many
shorter runs. We already split the work up into "chunks" based on the
size of the runs but we don't write the work back out into state until
all chunks are complete. This is suboptimal. Imagine a big amount of
compaction is chugging along, 99 of the 100 batches of work are done,
but before the last one can finish the cluster shuts down. All that work
is wasted.
This PR "checkpoints" it's work into state after each chunk is done.
That way in the example above, only the partially finished 100th chunk
is lost. (Incremental work within chunks will be the subject of future
work).
There is a tradeoff here though, it means writing to state more often,
this risks putting CRDB under additional load. We currently seem to
execute 650-750 writes per second to each of our CRDB nodes in us-east-1
on average. There is significant potential risk here. In us-east-1, on
the order of 200 chunks per second are queued up. That means that if
each chunk completes immediately and concurrently, we significantly push
the QPS of our crdb cluster (I think our cluster can handle it based on
resource usage I'm seeing but setting that aside...) I don't think that
every chunk across every environment is going to complete immediately
and concurrently so I think the likely impact on the QPS is likely to be
lower than 200/s. That said we don't have a sense of _per chunk_ timing
so it's harder to estimate specifically. An anecdotal test in staging
didn't reveal any undue load.
If this remains a concern, some form of backpressure could be
implemented to batch applies.
<!--
Describe the contents of the PR briefly but completely.
If you write detailed commit messages, it is acceptable to copy/paste
them
here, or write "see commit messages for details." If there is only one
commit
in the PR, GitHub will have already added its commit message above.
-->
### Motivation
<!--
Which of the following best describes the motivation behind this PR?
* This PR fixes a recognized bug.
[Ensure issue is linked somewhere.]
* This PR adds a known-desirable feature.
[Ensure issue is linked somewhere.]
* This PR fixes a previously unreported bug.
[Describe the bug in detail, as if you were filing a bug report.]
* This PR adds a feature that has not yet been specified.
[Write a brief specification for the feature, including justification
for its inclusion in Materialize, as if you were writing the original
feature specification.]
* This PR refactors existing code.
[Describe what was wrong with the existing code, if it is not obvious.]
-->
### Tips for reviewer
<!--
Leave some tips for your reviewer, like:
* The diff is much smaller if viewed with whitespace hidden.
* [Some function/module/file] deserves extra attention.
* [Some function/module/file] is pure code movement and only needs a
skim.
Delete this section if no tips.
-->
### Checklist
- [ ] This PR has adequate test coverage / QA involvement has been duly
considered. ([trigger-ci for additional test/nightly
runs](https://trigger-ci.dev.materialize.com/))
- [ ] This PR has an associated up-to-date [design
doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md),
is a design doc
([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)),
or is sufficiently small to not require a design.
<!-- Reference the design in the description. -->
- [ ] If this PR evolves [an existing `$T ⇔ Proto$T`
mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md)
(possibly in a backwards-incompatible way), then it is tagged with a
`T-proto` label.
- [ ] If this PR will require changes to cloud orchestration or tests,
there is a companion cloud PR to account for those changes that is
tagged with the release-blocker label
([example](MaterializeInc/cloud#5021)).
<!-- Ask in #team-cloud on Slack if you need help preparing the cloud
PR. -->
- [ ] If this PR includes major [user-facing behavior
changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note),
I have pinged the relevant PM to schedule a changelog post.
0 commit comments