Commit 9853105

Explainer on the anti-tracking problem for signed packages (#529)
This handwaves some of the network partitioning we'll probably need. Even the "uncredentialed" connection (https://fetch.spec.whatwg.org/#connections) might have included requests that identify the user, so signed packages might need to be retrieved on yet another connection.
1 parent 3f0145a commit 9853105

1 file changed

+257
-0
lines changed

explainers/anti-tracking.md

@@ -0,0 +1,257 @@
1+
# Anti-tracking in signed packages
2+
3+
<!-- TOC depthTo:3 -->
4+
5+
- [Blocking credentials](#blocking-credentials)
6+
- [Possible spellings](#possible-spellings)
7+
- [Tentatively-rejected spellings](#tentatively-rejected-spellings)
8+
- [Blocking normal link decoration](#blocking-normal-link-decoration)
9+
- [Blocking privileged link decoration](#blocking-privileged-link-decoration)
10+
- [Privileged server-side collaboration](#privileged-server-side-collaboration)
11+
- [Appendices](#appendices)
12+
- [Security concerns](#security-concerns)
13+
- [Behavior on an unexpected response](#behavior-on-an-unexpected-response)
14+
15+
<!-- /TOC -->
16+
17+
This explainer attempts to solve the problems identified in issues
18+
[#422](https://github.com/WICG/webpackage/issues/422) and
19+
[#423](https://github.com/WICG/webpackage/issues/423). Specifically, we're
20+
worried that a distributor might be able to use the fact that it can choose
21+
which exchange or package, signed by a publisher, to serve in response to a
22+
client's request, to transfer its notion of the user's identity to the
23+
publisher.
24+
25+
The approach this document takes is to prevent the distributor from accessing
26+
its notion of the user's identity. There are several routes by which a
27+
distributor could learn the user's identity:
28+
29+
1. The browser could send the distributor its top-level credentials. [Countermeasures](#blocking-credentials)
30+
1. The source of a link could use the same link-decoration tools as it might use
31+
in a cross-origin link, which other efforts are blocking. Note that the
32+
source of a link and the distributor of a package could be the same origin.
33+
[Countermeasures](#blocking-normal-link-decoration)
34+
1. The source of a link could decorate links to its distributor to send a user
35+
ID in ways an unrelated cross-origin target couldn't be expected to receive.
36+
[Countermeasures](#blocking-privileged-link-decoration)
37+
1. The source of a link could collaborate with its distributor on the server
38+
side to send a user ID in ways an unrelated cross-origin target couldn't be
39+
induced to collaborate in.
40+
[Discussion](#privileged-server-side-collaboration)
41+
42+
These are tackled individually in the following sections.
43+
44+
## Blocking credentials
45+
46+
The navigation to a signed package (or exchange) must be done without
47+
credentials. This is roughly [credentials
48+
mode](https://fetch.spec.whatwg.org/#concept-request-credentials-mode) ==
49+
`"omit"`, with changes to Fetch to make it work for navigations, but it may need
50+
additional changes to segregate the connection from other uncredentialed
51+
requests that might have included identifying information.
52+
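For comparison, script-initiated fetches can already request this mode today. The sketch below shows only that existing analogue: `fetch()` and its `credentials` option are real APIs, while `UNCREDENTIALED_INIT` and `fetchPackageBytes` are hypothetical names chosen for illustration. It does not cover navigations or the connection segregation mentioned above.

```typescript
// Hypothetical helper names; only `fetch()` and its `credentials` option are real APIs.
const UNCREDENTIALED_INIT = { credentials: "omit" } as const;

// Fetches a package without sending credentials and without storing any
// credentials the response tries to set.
// This is a script-initiated fetch, not a navigation.
async function fetchPackageBytes(packageUrl: string): Promise<ArrayBuffer> {
  const response = await fetch(packageUrl, UNCREDENTIALED_INIT);
  if (!response.ok) {
    throw new Error(`Unexpected status ${response.status} for ${packageUrl}`);
  }
  return response.arrayBuffer();
}
```
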
53+
There are [security concerns](#security-concerns) for allowing an attacker to
54+
load an arbitrary site without credentials, so if the target of an
55+
uncredentialed navigation isn't a signed package, the navigation should
56+
[fail](#behavior-on-an-unexpected-response).
57+
58+
There are lots of possible spellings for a navigation that omits credentials:
59+
60+
### Possible spellings
61+
62+
The first two options say generically that the referrer can choose whether to
63+
enter a link target with or without credentials. The rest attempt to avoid
64+
changing the semantics of top-level navigations while still getting package
65+
fetches to omit credentials.
66+
67+
<a id="credentials-omit"></a>
68+
69+
#### An attribute to set [credentials-mode](https://fetch.spec.whatwg.org/#concept-request-credentials-mode) to "omit"
70+
71+
```html
72+
<a href="https://target" credentials="omit">
73+
```
74+
75+
This seems like the most straightforward way to say what we currently want, but
76+
it doesn't automatically adapt if we later decide that package navigations
77+
should differ from other navigations in an additional way.
78+
79+
#### An attribute to set the [_init_ parameter to `fetch()`](https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/fetch#Parameters)
80+
81+
```html
82+
<a href="https://target" fetchoptions='{"credentials": "omit"}'>
83+
<a href="https://target" fetchoptions="credentials: omit">
84+
```
85+
86+
We'd add this to all fetch-causing elements, which would unify the current
87+
`crossorigin=""`, `referrerpolicy=""`, and `integrity=""` attributes, and be a
88+
better future extension point. The two hard parts here are
89+
90+
1. Getting people to swallow putting JSON in an attribute, or alternately coming
91+
up with a new microsyntax that has enough capabilities.
92+
2. Checking all of the options for possible security holes.
93+
94+
This is either more attractive than the [`credentials`
95+
attribute](#credentials-omit) because it solves a larger class of problems, or
96+
less attractive because it introduces more possible avenues of abuse.
97+
98+
#### An attribute that declares the target is a package
99+
100+
```html
101+
<a href="https://sxg" ispackage>
102+
```
103+
104+
Just tell the browser directly to expect a package. For now, this would do the
105+
same thing as [credentials="omit"](#credentials-omit) above.
106+
107+
#### Rely on prenavigate
108+
109+
```html
110+
<a href="https://signed_inner_url">
111+
```
112+
113+
If there has already been a
114+
[prenavigate](https://github.com/w3c/resource-hints/issues/82#issuecomment-529951528)
115+
that found a package holding the mentioned URL, it uses the content of that
116+
package for the navigation. If there hasn't been such a prenavigate, it acts
117+
like a normal link.
118+
119+
#### An attribute to announce the expected inner URL
120+
121+
```html
122+
<a href="https://package/path" publisherurl="https://signed/inner/url">
123+
```
124+
125+
An attribute declares the expected `start_url` of the package (or inner URL of a
126+
signed exchange). If these don't match, [the navigation
127+
fails](#behavior-on-an-unexpected-response).
128+
129+
<a id="fetchfrom-attribute"></a>
130+
131+
#### An attribute to declare a package that contains the link target
132+
133+
```html
134+
<a href="https://signed/inner/url" fetchfrom="https://package/path">
135+
```
136+
137+
This naturally works on browsers that haven't implemented packages. It also
138+
allows any browser to skip the package if it wants to make sure the publisher is
139+
notified/checked/etc. Again, if the package's content doesn't match the href,
140+
[the navigation fails](#behavior-on-an-unexpected-response).
141+
142+
#### An attribute to declare only a distributor origin
143+
144+
```html
145+
<a href="https://signed/inner/url" distributor="https://distributor.origin">
146+
```
147+
148+
This relies on us defining a single path within `distributor.origin` that serves
149+
a given signed URL, as [Blocking privileged link
150+
decoration](#blocking-privileged-link-decoration) suggests. It has similar
151+
properties to the [`fetchfrom`](#fetchfrom-attribute) option otherwise.
152+
153+
This is neither better nor worse than the [`fetchfrom`
154+
attribute](#fetchfrom-attribute) from a technical perspective: `fetchfrom` can
155+
reject paths that don't match the one required by [Blocking privileged link
156+
decoration](#blocking-privileged-link-decoration), and if we ever decide to
157+
relax the single-path restriction, the `distributor` attribute could begin to
158+
allow a full path.
159+
160+
### Tentatively-rejected spellings
161+
162+
#### Re-use the `crossorigin` attribute
163+
164+
```html
165+
<a href="https://target" crossorigin="no-credentials">
166+
```
167+
168+
This would have the same semantics as `<a href="https://target"
169+
credentials="omit">`, but it would also apply to same-origin navigations. It
170+
seems really confusing to use a "crossorigin" attribute to modify same-origin
171+
navigations.
172+
173+
#### Mark packages with a scheme
174+
175+
```html
176+
<a href="package://package">
177+
```
178+
179+
Basically encodes the `ispackage` flag into the scheme of the URL. The package
180+
is fetched by changing the scheme to `https`. We're [developing a more general
181+
package scheme](https://lists.w3.org/Archives/Public/uri/2019Nov/0000.html), but
182+
as it's useful for unsigned packages, which don't necessarily need the same
183+
restrictions as signed packages, it seems like the scheme won't do the right
184+
thing.
185+
186+
## Blocking normal link decoration
187+
188+
Same-origin links to packages should use the same restrictions as cross-origin
189+
links to non-packages.
190+
191+
## Blocking privileged link decoration
192+
193+
A link source run by the same entity as the distributor can encode a user ID in
194+
ways that might break an arbitrary publisher's serving code. So, we need to
195+
define a deterministic function from the inner URL to the acceptable URL within
196+
a distributor used to fetch that inner URL's package.
197+
198+
The two main options are
199+
200+
```url
201+
https://distributor.origin/.well-known/package/<hash(innerUrl)>
202+
https://distributor.origin/.well-known/package/innerAuthority/innerPath?innerQuery
203+
```
204+
205+
The hash-based path is a bit easier to compute, but leaving the inner URL mostly
206+
intact makes it easier for a human to guess where the link is going to go.
207+
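A minimal sketch of the two mappings, assuming TypeScript and the standard `URL` class. All function names are hypothetical. The explainer doesn't name a hash function or encoding, so the hash-based variant takes one as a parameter rather than picking one. The path-based variant drops any fragment on the inner URL, which is also an assumption.

```typescript
const WELL_KNOWN_PREFIX = "/.well-known/package/";

// Option 1: the path is a hash of the inner URL. The hash function is
// supplied by the caller because the explainer does not specify one.
function hashBasedPackageUrl(
  distributorOrigin: string,
  innerUrl: string,
  hash: (input: string) => string,
): string {
  const origin = new URL(distributorOrigin).origin;
  return `${origin}${WELL_KNOWN_PREFIX}${hash(new URL(innerUrl).href)}`;
}

// Option 2: the inner URL is left mostly intact inside the path.
function pathBasedPackageUrl(distributorOrigin: string, innerUrl: string): string {
  const origin = new URL(distributorOrigin).origin;
  const inner = new URL(innerUrl);
  // `host` is the authority without userinfo, `pathname` starts with "/",
  // and `search` is either "" or starts with "?".
  return `${origin}${WELL_KNOWN_PREFIX}${inner.host}${inner.pathname}${inner.search}`;
}

// A browser-side check: only the one deterministic URL is acceptable, so a
// link with an extra user-ID parameter on the outer URL is rejected.
function isAcceptablePackageUrl(
  candidate: string,
  distributorOrigin: string,
  innerUrl: string,
): boolean {
  return candidate === pathBasedPackageUrl(distributorOrigin, innerUrl);
}
```

Because the mapping is a pure function of the inner URL, the link source has no free bits in the outer URL in which to place an identifier.
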
208+
## Privileged server-side collaboration
209+
210+
The primary mechanism by which a distributor's server could collaborate with a
211+
link source's server is by deciding that if there's a click by user `A` at time
212+
`T` on a link to `https://distributor.example/package`, then a subsequent
213+
uncredentialed load of that package at time `T+ε` is probably user `A`. Servers
214+
can optionally refine this by limiting it to clicks and loads from the same IP
215+
address. This same mechanism allows cross-origin user ID transfer between
216+
arbitrary servers, and we don't know any ways to block it in that context, so we
217+
only need to make the package-assisted transfer at least as difficult as the
218+
cross-server transfer.
219+
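To make the threat concrete, here is a rough sketch of the log join described above. The record shapes, field names, and `guessUserForLoad` are all hypothetical. It attributes an uncredentialed load to a user only when exactly one user clicked the same package link within `epsilonMs` beforehand.

```typescript
// Hypothetical log records kept by the link source and the distributor.
interface LinkClick { userId: string; packageUrl: string; timeMs: number; ip: string; }
interface PackageLoad { packageUrl: string; timeMs: number; ip: string; }

// Returns the single user whose click plausibly caused this load, or
// undefined when no click or more than one user matches.
function guessUserForLoad(
  load: PackageLoad,
  clicks: LinkClick[],
  epsilonMs: number,
  requireSameIp: boolean = true,
): string | undefined {
  const candidates = new Set<string>();
  for (const click of clicks) {
    const delayMs = load.timeMs - click.timeMs;
    if (click.packageUrl !== load.packageUrl) continue;
    if (delayMs < 0 || delayMs > epsilonMs) continue; // load must follow the click closely
    if (requireSameIp && click.ip !== load.ip) continue; // the optional IP refinement
    candidates.add(click.userId);
  }
  return candidates.size === 1 ? Array.from(candidates)[0] : undefined;
}
```

Nothing in this join depends on packages, which is why the same technique works between arbitrary cooperating servers.
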
220+
We know of two ways for a distributor to pass their user ID on to the publisher
221+
of a package, and in either case—the publisher shares their signing key with the
222+
distributor, or the publisher generates 2^N variants of a package to encode N
223+
bits of a user ID—it seems safer or simpler for the publisher to share their
224+
logs instead.
225+
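The cost of the variant approach can be seen with a little arithmetic. This is an illustrative sketch only: the function names are hypothetical, and treating the user ID as a non-negative integer is an assumption.

```typescript
// Number of distinct signed variants needed to encode `bits` bits.
function variantCount(bits: number): number {
  return 2 ** bits;
}

// Which variant a distributor would serve to carry the low `bits` bits of a
// non-negative integer user ID.
function variantIndexFor(userId: number, bits: number): number {
  return userId % variantCount(bits);
}
```

Carrying a 32-bit ID this way would need 4,294,967,296 separately signed variants of each package, which is why sharing logs looks simpler for a cooperating publisher.
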
226+
## Appendices
227+
228+
### Security concerns
229+
230+
There are worries that uncredentialed navigation could be used in an attack on
231+
the target site, since the main request won't send or save cookies, but
232+
subrequests will. The only concrete attacks we know of involve surprising users
233+
instead of stealing data or executing malicious code, but we'd like to prevent
234+
those surprises anyway.
235+
236+
Similar worries came up in the discussion of cross-origin prefetch, which led to
237+
the new `"prenavigate"` operation, for which the target site opts in using an
238+
[`Allow-Uncredentialed-Navigation`
239+
header](https://github.com/w3c/resource-hints/issues/82#issuecomment-529951528).
240+
241+
We think the act of creating a signed exchange is an equivalent opt-in.
242+
243+
### Behavior on an unexpected response
244+
245+
If a link declares that its target is a package or has a particular inner URL,
246+
and then the actual response isn't a package or has a different inner URL, what
247+
should the browser do?
248+
249+
* If it fails the navigation with a network error, link sources would quickly
250+
discover their error and correct it, but it means that a distributor could
251+
never change the content of a URL that once contained a package.
252+
* If it reloads the outer URL as a normal navigation or redirects to the claimed
253+
inner URL, that wastes network traffic and time, but gives users a smoother
254+
experience.
255+
256+
Security folks generally prefer to fail fast, so the rest of this explainer
257+
suggests that option.
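
The fail-fast option could be sketched as follows. The types and `resolvePackageNavigation` are hypothetical names for illustration, not part of any specification, and the sketch assumes the browser has already parsed the response far enough to know whether it is a package and what its inner URL is.

```typescript
// What the link declared (hypothetical shape).
interface LinkExpectation { expectedInnerUrl?: string; }
// What the browser learned from the response (hypothetical shape).
interface ResponseSummary { isPackage: boolean; innerUrl?: string; }

type NavigationOutcome =
  | { kind: "render"; innerUrl: string }
  | { kind: "network-error"; reason: string };

// Fail fast: any mismatch between the declaration and the response ends the
// navigation with a network error instead of falling back to a normal load.
function resolvePackageNavigation(
  link: LinkExpectation,
  response: ResponseSummary,
): NavigationOutcome {
  if (!response.isPackage || response.innerUrl === undefined) {
    return { kind: "network-error", reason: "response is not a package" };
  }
  if (link.expectedInnerUrl !== undefined && link.expectedInnerUrl !== response.innerUrl) {
    return { kind: "network-error", reason: "inner URL mismatch" };
  }
  return { kind: "render", innerUrl: response.innerUrl };
}
```
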
