Skip to content
Open
22 changes: 22 additions & 0 deletions api/gather.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
package api

import (
"github.com/chromium/hstspreload/chromium/preloadlist"
)

// gatherLists returns a list of domains filtered by Policy types "bulk-18-weeks" and "bulk-1-year"
func gatherLists(list preloadlist.PreloadList) ([]string, []string) {
Copy link
Collaborator

@lgarron lgarron Jun 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that this might be useful, but as implemented this is vague and ambiguous in several ways:

  • The return types are not annotated. It's very easy to mis-use them. I would suggest using a map or a struct.
  • The function signature (and comment) do not indicate that this drops entries.

Also:

  • This function is not capitalized, so it's not exported. But it's also not used internally. So it's not doing anything.
  • Since the strings are constants, it would be good to define them as capitalized constants, ideally as an anum.
  • Any reason to return just the domain strings rather than the entries?
  • Any reason to implement this in the hstspreload.org package rather than hstspreload?

I also don't see a stated motivation for this PR. If the goal is to be able to get subsets of the list, would it make just as much sense to provide filters for individual states rather than gathering some subsets at once?

Copy link
Contributor Author

@tKabiawu tKabiawu Jun 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Lucas,

Thanks so much for your comments they were very helpful. I'm currently a step intern working under @nharper and @carlosjoan91, and the project we are working on revolves around automating processes concerning the hsts preload list. I have attempted to make changes to the code that adheres to some of your suggestions by providing functions that filter for individual states rather than gathering some subsets at once, returning entries instead of strings (thereby annotating the return type), capitalizing the function so that it is exportable, and defining constants as enums. Initially, I chose to return just the domain strings rather than the entries because I thought we only needed the domain name to check whether the domain is preloadable with the PreloadableDomain() function. However, I forgot that to change the status of a domain, you also need the value of the includeSubDomains field, which led me to choose to return the entire entry. Moreover, my understanding as the reason this is implemented in the hstspreload.org package instead of the hstspreload is because we plan to use a task scheduling tool that will call the endpoint at regular intervals.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, I was missing the context! Happy to see work on this project, but I'm glad to leave you in @nharper and @carlosjoan91's capable hands for an actual review. 😄

var domains18weeks []string
var domains1year []string

for _, entry := range list.Entries {
if entry.Policy == "bulk-18-weeks" {
domains18weeks = append(domains18weeks, entry.Name)
}
if entry.Policy == "bulk-1-year" {
domains1year = append(domains1year, entry.Name)
}
}

return domains18weeks, domains1year
}
31 changes: 31 additions & 0 deletions api/gather_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
package api

import (
"reflect"
"testing"

"github.com/chromium/hstspreload/chromium/preloadlist"
)

func TestGatherLists(t *testing.T) {
TestPreloadList := preloadlist.PreloadList{Entries: []preloadlist.Entry{
{Name: "garron.net", Mode: "force-https", IncludeSubDomains: true, Policy: "bulk-18-weeks"},
{Name: "example.com", Mode: "force-https", IncludeSubDomains: false, Policy: "bulk-18-weeks"},
{Name: "gmail.com", Mode: "force-https", IncludeSubDomains: false, Policy: ""},
{Name: "google.com", Mode: "", IncludeSubDomains: false, Policy: "custom"},
{Name: "pinned.badssl.com", Mode: "", IncludeSubDomains: false, Policy: "bulk-1-year"}},
}

expected18weeks := []string{"garron.net", "example.com"}
expected1year := []string{"pinned.badssl.com"}

domains18weeks, domains1year := gatherLists(TestPreloadList)

if !reflect.DeepEqual(domains18weeks, expected18weeks) {
t.Errorf("bulk18week domains does not match expected")
}
if !reflect.DeepEqual(domains1year, expected1year) {
t.Errorf("bulk1year domains does not match")
}

}