cache coherency issues leading to a slowdown #8503

Closed
ThomasWaldmann opened this issue Oct 31, 2024 · 5 comments · Fixed by #8541
@ThomasWaldmann
Member

From #8451 (comment):

borg 2.0.0b12

There's an issue with the ChunkIndex cache if a connection breaks down:

  • borg2 has recently started using a ChunkIndex cache, stored in repository/cache/chunks (with a checksum of it in chunks_hash).
  • if borg2 create does not complete, it does not update that cache.
  • in that case, there is still a valid cache, but it represents the repository contents as of the end of the last successful backup and does not know about the chunks transmitted during the interrupted backup run.
  • workaround: if one deletes that cache, borg rebuilds it by listing all objects in the repository (slow, but without much traffic), and it then represents all chunks currently present in the repo.
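
To illustrate why a stale cache costs time rather than correctness, here is a toy model. It is only a sketch and not borg's code: the repository and the index are plain Python sets, `upload_count` and `rebuild_index` are hypothetical helper names, and it assumes that a chunk missing from the index is simply transmitted again.

```python
# Toy model only: all names are hypothetical, none of this is borg's API.

def upload_count(chunk_ids, cached_index, repo):
    """Back up chunk_ids and return how many chunks had to be transmitted."""
    sent = 0
    for cid in chunk_ids:
        if cid not in cached_index:  # the index does not know this chunk
            repo.add(cid)            # storing it again is harmless, just slow
            cached_index.add(cid)
            sent += 1
    return sent


def rebuild_index(repo):
    """The workaround: rebuild the index by listing every repository object."""
    return set(repo)


repo = {"a", "b"}        # repository after the last completed backup
stale_cache = set(repo)  # cache written at the end of that backup
repo |= {"c", "d"}       # chunks sent by an interrupted run, cache not updated

files = ["a", "b", "c", "d", "e"]
sent_stale = upload_count(files, set(stale_cache), set(repo))
sent_fresh = upload_count(files, rebuild_index(repo), set(repo))
print(sent_stale, sent_fresh)
```

With the stale cache, the chunks from the interrupted run are sent a second time; with the rebuilt index only the genuinely new chunk is sent.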
@ThomasWaldmann ThomasWaldmann added this to the 2.0.0b14 milestone Oct 31, 2024
@ThomasWaldmann
Member Author

ThomasWaldmann commented Nov 1, 2024

There can be similar issues if multiple borg create runs execute in parallel: the last one to update the cache wins, and knowledge about existing chunks (added by the other borg create runs) might be lost, leading to a future slowdown.

It never leads to corruption though, because only borg compact removes chunks (and it uses an exclusive lock while working).
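
A minimal sketch of that last-writer-wins effect, again as a toy model with sets standing in for the repository and the single cached index (all variable names are made up for illustration):

```python
# Toy model only: sets stand in for the repository and the cached chunk index.
repo = {"a"}
cache = set(repo)         # single cached index stored in the repository

run1_index = set(cache)   # both runs load the same snapshot of the cache
run2_index = set(cache)

# Each run stores one new chunk and records it in its own in-memory index.
for index, new_chunk in ((run1_index, "b"), (run2_index, "c")):
    repo.add(new_chunk)
    index.add(new_chunk)

cache = set(run1_index)   # run 1 finishes and writes the cache
cache = set(run2_index)   # run 2 finishes later and overwrites it

forgotten = repo - cache  # chunks that exist in the repo but not in the cache
print(forgotten)
```

The chunk added by the first run is still in the repository, so nothing is corrupted; a later run that needs it would just not know it is already there.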

Without the cache, all of this would be much simpler. It is a pity that listing all repo objects takes so long that we need a cache.

@ThomasWaldmann ThomasWaldmann changed the title from "cache coherency issue when connection breaks down" to "cache coherency issues leading to slowdown" Nov 1, 2024
@ThomasWaldmann ThomasWaldmann changed the title from "cache coherency issues leading to slowdown" to "cache coherency issues leading to a slowdown" Nov 1, 2024
@ThomasWaldmann
Member Author

ThomasWaldmann commented Nov 5, 2024

Some ideas about how to solve this:

  • after loading the main chunks cache from the repository, all chunks.* objects are merged into the in-memory ChunkIndex, the result is written back to chunks, and then those same chunks.* objects are removed.
  • before new chunks are created, existing chunks are marked with F_CLEAN in the in-memory ChunkIndex. after that, new index entries will be dirty, because they do not have the clean flag.
  • borg create should periodically write new/dirty chunk index entries to repository/cache/chunks.<RANDOM_OR_HASH> and afterwards mark them with F_CLEAN in memory.
  • borg compact builds a new chunk index from scratch, must remove all old cached chunk indexes, and must write an up-to-date chunks.
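
The clean/dirty idea from the list above could look roughly like this. It is a sketch under stated assumptions, not the borg implementation: `ToyChunkIndex`, its methods, the flag value, the JSON serialization and the dict standing in for repository/cache are all hypothetical.

```python
import json
import uuid

F_CLEAN = 1  # hypothetical bit value; only the name comes from the discussion


class ToyChunkIndex:
    """In-memory index mapping chunk id -> flags (toy model, not borg's ChunkIndex)."""

    def __init__(self):
        self.entries = {}

    def merge_cached(self, chunk_ids):
        # Entries loaded from the cache are already persisted, so mark them clean.
        for cid in chunk_ids:
            self.entries[cid] = F_CLEAN

    def add(self, cid):
        # New entries start without F_CLEAN, i.e. dirty; known entries are untouched.
        self.entries.setdefault(cid, 0)

    def flush_dirty(self, cache_dir):
        """Write the dirty entries to a new chunks.<RANDOM> object, then mark them clean."""
        dirty = sorted(cid for cid, flags in self.entries.items() if not flags & F_CLEAN)
        if not dirty:
            return None
        name = "chunks." + uuid.uuid4().hex
        cache_dir[name] = json.dumps(dirty).encode()
        for cid in dirty:
            self.entries[cid] |= F_CLEAN
        return name


cache_dir = {}                # stands in for repository/cache
index = ToyChunkIndex()
index.merge_cached(["a", "b"])
index.add("c")                # new chunk: dirty
index.add("a")                # already known: stays clean
name = index.flush_dirty(cache_dir)
print(name, cache_dir[name])
```

Only the entry for the new chunk ends up in the partial index object, and a second flush with no new chunks writes nothing.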

@ThomasWaldmann
Member Author

ThomasWaldmann commented Nov 7, 2024

Even easier, compared to the previous post:

  • give up the distinction between a main chunks cache and chunks.* caches, and just always store chunk index data as chunks.*.
  • if .* means .HASH, we do not need the extra chunks_hash object anymore.
  • just merge all cached chunk indexes together when building one from the cache.
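
A rough sketch of the chunks.HASH idea: the object name carries the checksum of its own content, so no separate chunks_hash object is needed, and loading is just a merge of every object that verifies. The helper names, the dict used as cache storage, the JSON encoding and the choice of SHA-256 are assumptions made for illustration, not what borg does.

```python
import hashlib
import json


def write_index(cache, chunk_ids):
    """Store a partial index under a name derived from the hash of its content."""
    data = json.dumps(sorted(chunk_ids)).encode()
    name = "chunks." + hashlib.sha256(data).hexdigest()
    cache[name] = data
    return name


def load_merged(cache):
    """Merge every chunks.<HASH> object whose content matches the hash in its name."""
    merged = set()
    for name, data in cache.items():
        prefix, _, digest = name.partition(".")
        if prefix != "chunks":
            continue  # not a chunk index object
        if hashlib.sha256(data).hexdigest() != digest:
            continue  # name and content disagree: treat as corrupt and skip
        merged.update(json.loads(data))
    return merged


cache = {}                            # stands in for repository/cache
write_index(cache, {"a", "b"})        # written by one borg create run
write_index(cache, {"b", "c"})        # written by another, possibly in parallel
cache["chunks.deadbeef"] = b'["zzz"]' # damaged object: hash does not match
print(sorted(load_merged(cache)))
```

Because every writer adds its own object instead of overwriting a shared one, parallel writers no longer erase each other's knowledge in this model.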

ThomasWaldmann added a commit to ThomasWaldmann/borg that referenced this issue Nov 8, 2024
- doesn't need a separate file for the hash
- we can later write multiple partial chunkindexes to the cache

also:

add upgrade code that renames the cache from previous borg versions.
ThomasWaldmann added a commit to ThomasWaldmann/borg that referenced this issue Nov 9, 2024
- doesn't need a separate file for the hash
- we can later write multiple partial chunkindexes to the cache

also:

add upgrade code that renames the cache from previous borg versions.
ThomasWaldmann added a commit that referenced this issue Nov 11, 2024
chunk index cache: use cache/chunks.<HASH>, see #8503
@ThomasWaldmann
Member Author

#8531 solves the mentioned issues when running multiple borg create in parallel.

@ThomasWaldmann
Member Author

#8541 saves the new chunk index entries to repo/cache/chunks.* every 10 minutes, so progress won't be lost if the connection breaks down or borg is ctrl-c'ed.

note: this refers only to the chunk index, so borg will "know" what chunks are in the repo.

the files cache is currently only saved at the end, so that can still be a problem.
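
For illustration, a periodic save like the 10-minute one for the chunk index could be structured as below. This is only a sketch: `PeriodicSaver` and `maybe_save` are hypothetical names, the save callback is a stand-in, and the injectable clock exists just to make the timing testable.

```python
import time


class PeriodicSaver:
    """Call save() at most once per interval; meant to be polled from the main loop."""

    def __init__(self, save, interval=600.0, clock=time.monotonic):
        self.save = save
        self.interval = interval  # seconds; 600 s matches the 10 minutes mentioned above
        self.clock = clock
        self.last_saved = clock()

    def maybe_save(self):
        now = self.clock()
        if now - self.last_saved >= self.interval:
            self.save()
            self.last_saved = now
            return True
        return False


# Usage with a fake clock, so the example does not need to wait 10 minutes.
fake_now = [0.0]
saved_at = []
saver = PeriodicSaver(lambda: saved_at.append(fake_now[0]), clock=lambda: fake_now[0])

results = []
for t in (599.0, 600.0, 700.0, 1200.0):
    fake_now[0] = t
    results.append(saver.maybe_save())
print(results, saved_at)
```

In this model an interruption loses at most one interval's worth of new index entries.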
