Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Vendoring dependencies #642

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

RFC: Vendoring dependencies #642

wants to merge 1 commit into from

Conversation

winterqt
Copy link

@winterqt winterqt commented Oct 8, 2022

Copy link
Contributor

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vendoring is a resoundingly bad practice that npm should not in any way encourage or ease - there's a reason checking node_modules into version control is similarly discouraged.

With a lockfile, and an internal registry, there should be no issues with reproducibility I'm aware of. Can you elaborate on what I might be missing here?


To my knowledge, there are two alternatives:
1. Ship `node_modules`. This would work a good percent of the time, assuming `npm ci --ignore-scripts` is used, and all packages are installed across all operating systems.
2. Generate a cache, and point npm to that. Given that the cacache format hasn't changed in a number of years, nor has what npm stored in cache entry metadata, I'd say this is a pretty safe bet. However, I feel uneasy about relying on it, as it could break at any time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the third option is the best practice for a long time - use a lockfile, and an internal registry, and any resulting differences between systems of node_modules are intentional and necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that there are differences between different node_modules, but having to run an internal registry certainly begs for a more lightweight solution, IMHO. (Why use an internal registry over the public one, assuming all of your dependencies are there, if you have a lockfile?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because deploying from the internet is hugely unsafe and exposes you to DNS/man-in-the-middle attacks?

Any discussion about reproducibility rests on first having no uncontrolled sources of content.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this the case with a known-good lockfile, which contains hashes for all dependencies, though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair, but you still don't want the contents denied to you entirely - that breaks your build too.

@winterqt
Copy link
Author

winterqt commented Oct 8, 2022

Can you elaborate on what I might be missing here?

Sure.

The motivation for this RFC is my effort to rework how we package npm projects in Nixpkgs -- which presents certain challenges surrounding reproducibility.

The current solution used is to generate large files (warning: 6 MB file) containing dependency trees.

I've proposed an alternate solution which does what I've described as the second alternative -- generating a cacache using a small, purpose-built fetcher, which only relies on a lockfile. (I've done it this over e.g. shelling out to npm to further ensure reproducibility.)

This has proven to work very well for this use case, which is great. But as I said, relying on this is a worse idea than an official solution, since the (historically very low) chances of breakage are there.

We currently use similar solutions for Go, Cargo, and Yarn (v1), utilizing the aforementioned vendoring solutions for the first two, while a home-grown(-ish?) solution for Yarn.

(I'm guessing this comes off as selfish if I'm only doing it for this reason, but I also believe that Go and Cargo must have implemented this for a good reason.)

@winterqt
Copy link
Author

winterqt commented Oct 8, 2022

Vendoring is a resoundingly bad practice that npm should not in any way encourage or ease - there's a reason checking node_modules into version control is similarly discouraged.

Why do you(/others, I'm assuming) believe this? Assuming the vendored dependencies include any and all dependencies that are required for the project, no matter the OS etc, it provides more reproducibility than checking in node_modules.

@ljharb
Copy link
Contributor

ljharb commented Oct 8, 2022

Reproducibility isn't remotely more important than a manageable git repository or a navigable/comprehensible git diff/log.

@winterqt
Copy link
Author

winterqt commented Oct 8, 2022

Sure, but alternate uses exist, which is evident by Cargo producing a tarball as opposed to a folder.

@wraithgar
Copy link
Member

Is this functionally any different than including all of your dependencies as bundled and running npm pack?

@winterqt
Copy link
Author

I don't believe so, except for the fact that npm pack depends on the current state of the node_modules folder. Assuming npm ci --ignore-scripts wasn't used for installation, this could introduce impurities depending on what the install/prepare scripts for the dependencies do (e.g. grabbing a prebuilt version of a native dependency).

I think the two serve different purposes, though, and one doesn't require modifying existing projects.

@wraithgar
Copy link
Member

npm pack mirrors as exactly as possible building the tgz file that gets attached during npm publish.

If there is a potential for having things bundled that are missing, this should be solved in the pack/publish process itself so that npm pack becomes as reliable as you need it to be. I don't know what other problem this RFC would be solving, so let's solve that problem itself (if indeed it is determined there is one)

@winterqt
Copy link
Author

npm pack mirrors as exactly as possible building the tgz file that gets attached during npm publish.

Right, but it still depends on the local state of the node_modules folder, no?

@wraithgar
Copy link
Member

Right, but it still depends on the local state of the node_modules folder, no?

Yep. And if that's a problem, then that's the problem to fix.

@winterqt
Copy link
Author

Looks like I made a pretty dumb factual error here -- both Cargo and Go use folders.

I'll push a revision fixing that, as well as also mentioning npm pack as an alternative.

@winterqt
Copy link
Author

Fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants