I just had a revelation:
$ time b3sum dreamshaper_8\ \(1\).safetensors dreamshaper_8.safetensors
771c807db56dbfc33feda5638d920f6c507db971da44772ee44a08dc38c3b437 dreamshaper_8 (1).safetensors
771c807db56dbfc33feda5638d920f6c507db971da44772ee44a08dc38c3b437 dreamshaper_8.safetensors
real 0m0.172s
user 0m2.193s
sys 0m0.423s
$ time cmp dreamshaper_8\ \(1\).safetensors dreamshaper_8.safetensors
real 0m0.596s
user 0m0.183s
sys 0m0.411s
$ time diff dreamshaper_8\ \(1\).safetensors dreamshaper_8.safetensors
real 0m0.509s
user 0m0.079s
sys 0m0.428s
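As you can see, even though the b3sum method has an additional cost (calculating a hash), it is way faster overall because it leverages parallelism: about 0.17 s of wall-clock time versus 0.5 to 0.6 s for cmp and diff, while using over 2 s of CPU time spread across cores.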
Wouldn't it be a good improvement to bring parallelism to some of the tools like diff and cmp?
Maybe with a new (not-standardized) option?
Maybe by default because why not?
I guess diff has a special code path once it is sure that it's just a binary file, right? So in that code path it wouldn't be much of a problem to parallelize it.
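To make the idea concrete, here is a rough sketch of what a chunked, parallel byte comparison could look like. It is written in Python purely for illustration and is not how diff or cmp are implemented; the names `files_equal_parallel`, `_chunk_equal`, and `CHUNK_SIZE`, as well as the chunk size and worker count, are all made up for this example. It only answers "identical or not", which is the binary-file case.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 8 * 1024 * 1024  # hypothetical chunk size, not tuned


def _chunk_equal(path_a, path_b, offset, length):
    """Compare one byte range of both files; each call opens its own handles."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        fa.seek(offset)
        fb.seek(offset)
        return fa.read(length) == fb.read(length)


def files_equal_parallel(path_a, path_b, workers=4, chunk_size=CHUNK_SIZE):
    """Return True if both files contain exactly the same bytes."""
    size = os.path.getsize(path_a)
    if size != os.path.getsize(path_b):
        return False  # files of different sizes can never be identical

    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every chunk is an independent task, so chunks can be checked concurrently.
        results = pool.map(
            lambda off: _chunk_equal(path_a, path_b, off, chunk_size), offsets
        )
        # all() stops consuming at the first mismatch; chunks that are
        # already queued still run before the pool shuts down.
        return all(results)
```

This only shows the chunking and makes no claim about speed. Finding the position of the first differing byte, as cmp reports it, would need the mismatching chunk with the lowest offset plus a scan inside that chunk.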
This whole topic can even be pushed further when comparing directories... parallel diffing of files.
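A similarly rough sketch for the directory case, again in Python and again only an illustration under assumptions: the function names `relative_files` and `differing_files` and the worker count are hypothetical, and the per-file check simply uses the standard library's `filecmp.cmp` with `shallow=False` (a byte-by-byte comparison). Each pair of files is an independent job, so the pairs can be handed to a pool of workers.

```python
import filecmp
import os
from concurrent.futures import ThreadPoolExecutor


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def differing_files(dir_a, dir_b, workers=4):
    """Return sorted relative paths that differ in content or exist on one side only."""
    files_a = relative_files(dir_a)
    files_b = relative_files(dir_b)
    common = sorted(files_a & files_b)

    def same(rel):
        # shallow=False forces a content comparison instead of trusting stat() data.
        return filecmp.cmp(
            os.path.join(dir_a, rel), os.path.join(dir_b, rel), shallow=False
        )

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per file pair; results come back in the same order as `common`.
        verdicts = list(pool.map(same, common))

    changed = [rel for rel, ok in zip(common, verdicts) if not ok]
    only_one_side = list(files_a ^ files_b)
    return sorted(changed + only_one_side)
```

Calling `differing_files("old_tree", "new_tree")` (placeholder directory names) would return the list of files worth a closer look, which could then be diffed individually.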
Come on it's 2025! :)