feat: Implement ecosystem enumeration in Go#5140
Hidden character warning
feat: Implement ecosystem enumeration in Go#5140michaelkedar wants to merge 7 commits intogoogle:masterfrom
Conversation
|
|
||
| func (e packagistEcosystem) GetVersions(_ string) ([]string, error) { | ||
| panic("not yet implemented") | ||
| func packagistAPIURL(pkg string) string { |
There was a problem hiding this comment.
that reminds me, I think ideally we should (as a follow up PR) have this logic mirror the linter to account for alternative repos i.e. Drupal
(assuming I've correctly understood that this is the right place to be doing that)
|
|
||
| // GetVersions enumerates GHC versions. | ||
| // Different components of GHC are part of the same software release, so we ignore the package name. | ||
| func (e ghcEcosystem) GetVersions(_ string) ([]string, error) { |
There was a problem hiding this comment.
Probably explain here that GHC ecosystem is just essentially one package.
There was a problem hiding this comment.
According to the schema, GHC is
The Haskell compiler ecosystem. The
namefield is the name of a component of the GHC compiler ecosystem (e.g., compiler, GHCI, RTS).
So it's not just one package, but each of the packages are released together with the new compiler release, which I guess is what the comment was already saying (which was transplanted from Python)
| "testing" | ||
| ) | ||
|
|
||
| func TestGHC_GetVersions(t *testing.T) { |
There was a problem hiding this comment.
All these tests should have t.Parallel(), since they are all in the ecosystem package, we don't want to test sequentially.
There was a problem hiding this comment.
The tests all call setupHTTPClientForTest which messes with the global http client to use the go-vcr client.
We can't run them in parallel because they'd all interfere with each other.
I don't really want to have to pass in a http client to GetVersions
There was a problem hiding this comment.
do we know how long this test is taking right now? Another option is storing the httpClient (or just a httpClientOverride) in each ecosystem struct.
There was a problem hiding this comment.
On the latest Cloud Build run on this PR it took 21.617s, which I think is tolerable, but if you don't like it I can try adding a per-struct client
There was a problem hiding this comment.
I did some factoring to add a Provider to all the ecosystem implementations to allow the http client to be set separately. Running the tests in parallel now took 11.195s.
PTAL
There was a problem hiding this comment.
hmm that still feels slow no? Especially with everything cassetted it should be almost instant...
There was a problem hiding this comment.
idk - it does have to load megabytes of test data
go/osv/ecosystem/util.go
Outdated
| } | ||
|
|
||
| // getVersionsDepsDev enumerates versions for a package using the deps.dev API. | ||
| func getVersionsDepsDev(e Ecosystem, depsDevSystem string, pkg string) ([]string, error) { |
There was a problem hiding this comment.
Should we be using the depsdev package for grpc requests?
Also replace depsDevSystem with the depsdev enum maybe
There was a problem hiding this comment.
Using the REST API lets us use go-vcr for tests, which I think I'd prefer.
| proto_major: 1 | ||
| proto_minor: 1 | ||
| content_length: 215 | ||
| body: "\uFEFF\x3C\x3F\x78\x6D\x6C\x20\x76\x65\x72\x73\x69\x6F\x6E\x3D\"\x31\x2E\x30\"\x20\x65\x6E\x63\x6F\x64\x69\x6E\x67\x3D\"\x75\x74\x66\x2D\x38\"\x3F\x3E\x3C\x45\x72\x72\x6F\x72\x3E\x3C\x43\x6F\x64\x65\x3E\x42\x6C\x6F\x62\x4E\x6F\x74\x46\x6F\x75\x6E\x64\x3C\x2F\x43\x6F\x64\x65\x3E\x3C\x4D\x65\x73\x73\x61\x67\x65\x3E\x54\x68\x65\x20\x73\x70\x65\x63\x69\x66\x69\x65\x64\x20\x62\x6C\x6F\x62\x20\x64\x6F\x65\x73\x20\x6E\x6F\x74\x20\x65\x78\x69\x73\x74\x2E\n\x52\x65\x71\x75\x65\x73\x74\x49\x64\x3A\x36\x31\x38\x62\x33\x30\x36\x35\x2D\x39\x30\x31\x65\x2D\x30\x30\x33\x66\x2D\x31\x32\x39\x38\x2D\x62\x64\x37\x35\x35\x34\x30\x30\x30\x30\x30\x30\n\x54\x69\x6D\x65\x3A\x32\x30\x32\x36\x2D\x30\x33\x2D\x32\x37\x54\x30\x33\x3A\x31\x39\x3A\x32\x30\x2E\x34\x30\x38\x31\x38\x36\x38\x5A\x3C\x2F\x4D\x65\x73\x73\x61\x67\x65\x3E\x3C\x2F\x45\x72\x72\x6F\x72\x3E" |
There was a problem hiding this comment.
Hmm bit odd this is completely binary encoded when most of these are ascii characters
There was a problem hiding this comment.
hallucinated conversion:
<?xml version="1.0" encoding="utf-8"?>
<Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.
RequestId:618b3065-901e-003f-1298-bd7554000000
Time:2026-03-27T03:19:20.4081868Z</Message></Error>
| "testing" | ||
| ) | ||
|
|
||
| func TestNuGet_GetVersions(t *testing.T) { |
There was a problem hiding this comment.
I'm guessing this tests both paging logic (page.Items and page.[No]Items)? The cassette is so big it seems impossible to see.
There was a problem hiding this comment.
tbh I'm not actually sure - these test cases were copied from Python
There was a problem hiding this comment.
oh 😅
The python one is a similar size so that's extra nice
Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com>
| "testing" | ||
| ) | ||
|
|
||
| func TestGHC_GetVersions(t *testing.T) { |
There was a problem hiding this comment.
hmm that still feels slow no? Especially with everything cassetted it should be almost instant...
|
|
||
| // Get returns an ecosystem for the given ecosystem name. | ||
| // If the ecosystem is not found, it returns nil, false. | ||
| func (p *Provider) Get(ecosystem string) (Ecosystem, bool) { |
There was a problem hiding this comment.
Hmm this feels very... circular.
You use the provider to get the ecosystem which uses the provider to call fetch.
Few ideas, not sure about these:
- The functions in the ecosystems map seems very ugly
- Maybe we can use an interface for setting the providers on construction? (this might be too much work for not really much gain)
- An option for defaulting to using the default http client would be to change the type of Provider from a pointer to just the struct
Provider(For both in these signatures and each ecosystem field). And in the fetchJSON function we can do a check to see if client pointer exists, if not use the default http client.
There was a problem hiding this comment.
Hmm this feels very... circular.
You use the provider to get the ecosystem which uses the provider to call fetch.
It's more like the Provider is the environment/context that the ecosystems exist in, so the ecosystems retain a handle on the environment that produced them to make requests within that environment.
The functions in the ecosystems map seems very ugly
Agreed, but setting up the interface adds more boilerplate that's not worth it IMO.
I would love it if I could have a generic function to do it, but go generics won't let you set fields on a generic struct.
An option for defaulting to using the default http client...
I feel like having the DefaultProvider is sufficient for this.
Translated all the ecosystem queries for version enumeration into go