Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accurate mtimes for directories, overwrite symlinks, preserve ownership #217

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

moschroe
Copy link
Contributor

Rough draft on how this could be done.

Accurate mtime for directories

Copies the behaviour of tar where each directory will be queued to have its timestamps modified after all files have been extracted. Otherwise, any directory with any files contained will be modified the instant those files are extracted.

Overwrite symlinks

Minor inconsistency, files were overwritten, symlinks were not. Now, this behaviour is consistent and configurable. (Maybe make the default true?)

Preserve ownership

On unix platforms, perform chmod on all extracted entries. For any uid different from the user of the process, this will only work when running as uid 0 or using fakeroot (if linked dynamically).

@moschroe moschroe marked this pull request as ready for review November 23, 2019 16:28
@alexcrichton
Copy link
Owner

Thanks for the PR!

This looks somewhat invasive though, so can you provide some more rationale for these changes? Why do we want to match the tar CLI, for example? Is there a use case for preserving ownership permissions?

@moschroe
Copy link
Contributor Author

moschroe commented Dec 2, 2019

Thanks for the PR!

You're welcome. I opened this on a whim. I just thought maybe someone else might benefit from my efforts and completely forgot to create an issue or check whether or not PRs were even welcome.

This looks somewhat invasive though, so can you provide some more rationale for these changes? Why do we want to match the tar CLI, for example? Is there a use case for preserving ownership permissions?

It matches behaviour, not the CLI. And frankly, the behaviour is now what I would have expected in the first place. I feel an archiving tool, especially at the layer of a general purpose library, should never modify data on its own.

This might be a matter of bias, as I have experience with forensic work, but to me, getting things out of the box I put them into must never change those things. And differing metadata might matter, eg. when using rsync. Sometimes an archive has the purpose of a time capsule, and especially for testing purposes, exact replication can be required.

If a consumer of this library does not care about fidelity (or deliberately decides not to leak timestamps and ownership), that would be the right context to make this decision.

I currently work on a project that aims for producing a single, statically linked binary and an external tar command would be cumbersome to integrate into the workflow when "a few changes" could make that unnecessary.

PS: Unfortunately, I have no access to a windows system to run tests myself. I was unable to deduce from reading the code where the failure originates.

@akhramov
Copy link

akhramov commented Apr 2, 2020

Would be nice to have an ability to overwrite symlinks. Not that it should be enabled by default, but it seems like some clients out wild expect symlinks to be overwritten (e.g. camallo/dkregistry-rs/issues/149).

@steveej
Copy link

steveej commented Jun 9, 2020

Would be nice to have an ability to overwrite symlinks.

I agree. @moschroe thank you for implementing this! Would you be wiling to split that change to a separate PR? I'm hopeful that will increase the chances of merging this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants