[RFC] Safety limits by bmcase · Pull Request #309 · w3c/attribution

bmcase · 2025-11-12T02:26:54Z

Creating a PR to add safety limits to the Attribution spec. This is based primarily on the BigBird algorithm from Section 4 of this paper: https://arxiv.org/pdf/2506.05290. Algorithm 2 is the main algorithm; it encompasses both budget deduction and safety limit deduction.

This PR is still WIP but ready for some initial review.

Intended to address this open issue #237

This adds the checks that need to happen on the user action context, following Algorithm 2 of BigBird. Note that it follows the latest version, which moves the conversion check inside the for loop over epochs.

In Algorithm 2 of BigBird, safety limit deductions occur if and only if the privacy budget deduction also occurs. I am therefore going to put the safety limits into the deduct privacy budget function (renamed to deduct privacy and safety budgets).
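
The "if and only if" rule above can be sketched as a check phase followed by a deduct phase. This is only an illustration under stated assumptions, not the spec's algorithm: the `Budgets` class, the store names, and the integer budget units are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Budgets:
    """Hypothetical in-memory stand-in for the privacy and safety budget stores."""
    remaining: dict[str, int] = field(default_factory=dict)


def deduct_all_or_nothing(budgets: Budgets, deductions: dict[str, int]) -> bool:
    """Deduct from every named budget, or from none of them."""
    # Phase 1: check every affected budget before changing any state.
    for name, amount in deductions.items():
        if budgets.remaining.get(name, 0) < amount:
            return False
    # Phase 2: every check passed, so apply all deductions together.
    for name, amount in deductions.items():
        budgets.remaining[name] -= amount
    return True
```

If any single budget cannot cover its deduction, the function returns before touching state, so a failed safety check never leaves a partial privacy budget deduction behind.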

api.bs

bmcase · 2025-12-05T16:55:23Z

User actions quota counts -- for the spec I think we should stick closely to the paper on the safety limit quotas themselves, but for the user action quota counts I think we could do a simplification.

The paper partitions the quota counts for a single user action by impression and conversion quotas with conversion quotas further partitioned by epoch.

I think it would be simpler to just have one single quota count per user action. If we think that is okay for now I can simplify this PR a bit.

Let me know if you have any thoughts on this @apasel422 @mt @csharrison @andyleiserson.

api.bs

bmcase · 2025-12-11T21:50:21Z

Notes from meeting:

* drop user action stores by many dimensions
* drop conversion site quota

Follow ups:

* set minimum recommended multipliers
* clear history
* locking for atomic transaction

remove conversion site quota and remove the store of user action contexts, replacing with a global boolean flag attached to the window

create function to calculate deductions for impression sites.

The simpler version was not just missing optimizations; it would have under-deducted in the case of a single epoch with multiple impression sites.

api.bs

incorporate @apasel422 's feedback

bmcase · 2026-02-10T21:42:29Z

@apasel422 thanks for the review! I think I incorporated all of your feedback.

api.bs

bmcase · 2026-02-12T05:03:42Z

@martinthomson I updated the user activation check to throw an exception instead of returning a boolean, if you want to take another look at that. I also replied on a couple of open comment threads.

@apasel422 thanks for the second round of edits; I updated the PR with all of those except one I want to look into more.

api.bs

adding more prose to describe what we are doing with requiring user activation and some limitations with that.

bmcase · 2026-02-13T03:20:57Z

api.bs

+that could be maliciously triggered.
+
+
+### Attribution API Activation ### {#s-api-activation}


@martinthomson I added some more explanation to go with this section and links out to the HTML spec. Can you see if this captures what we want to say here?

martinthomson

I'm not following the whole business of impression quota computation. It might even be wrong. I think that this could be a lot simpler in that area.

api.bs

martinthomson · 2026-02-13T03:57:55Z

api.bs

+1.  Since calling the Attribution API consumes a user activation, the site would no longer have this
+    particular user activation to use for other APIs (e.g., opening popups).


Thankfully, this problem can be fixed by doing the hard work of doing our own activation tracking, which I think is going to be necessary.

martinthomson · 2026-02-13T04:01:04Z

api.bs

+<p class=note>This approach allows a single user action to enable multiple
+API invocations within the same session, while still requiring
+an initial user gesture to activate the API.


Suggested change

-<p class=note>This approach allows a single user action to enable multiple
-API invocations within the same session, while still requiring
-an initial user gesture to activate the API.
+<p class=note>This approach allows a single user action
+to enable multiple API invocations within the same session,
+while only making the API available to one site
+per [=activation triggering input event|activation=].

Where should we define the term "activation triggering input event"?

api.bs

martinthomson · 2026-02-13T04:28:32Z

api.bs

+1.  If the [=impression site quota store=] does not [=map/contain=] |impressionQuotaKey|
+    and |siteDeduction| is greater than the [=impression site quota per epoch=], return false.


Fix the indent.

Also, the above comments.
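
One possible reading of the quoted step, as a hedged sketch: a key that is missing from the store is treated as having the full per-epoch quota available. The function name, the dict-based store, the assumption that the store holds the remaining quota, and the behaviour for keys already present are all illustrative guesses rather than text from the PR.

```python
from typing import Dict, Tuple

# Hypothetical key shape: (impression site, epoch index).
QuotaKey = Tuple[str, int]


def impression_site_quota_allows(
    store: Dict[QuotaKey, float],
    key: QuotaKey,
    site_deduction: float,
    quota_per_epoch: float,
) -> bool:
    """Return True if the deduction fits within the quota for this key."""
    if key not in store:
        # Quoted step: no entry yet, so compare against the full quota.
        return site_deduction <= quota_per_epoch
    # Assumption: an existing entry records the quota that remains.
    return site_deduction <= store[key]
```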

martinthomson · 2026-02-13T04:32:36Z

api.bs

-    in the [=privacy budget store=].
+1.  Let |epoch| be the [=epoch index=] component of |key|.

 1.  Let |sensitivity| be |l1Norm| if |l1Norm| is non-null, 2 * |value| otherwise.


This bit duplicates the computation you have in the impression store bit. I think we can factor out a "compute privacy budget deduction" process for finding the number.

I don't quite follow this suggestion.

The overall algorithm appears to compute a deduction amount twice. That's the part that I think can be factored differently.

I think that - at least for now - we want to have a single number that is deducted from all active budgets.

Yeah, we do sort of compute the deduction twice overall: once for the individual sensitivity in the single-epoch case, and then again for a global sensitivity in the multiple-epoch case.
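
The factoring suggested in this thread might look roughly like the sketch below: the amount is computed once and the same number is then applied to every active budget. Only the sensitivity rule comes from the quoted diff; the `scale` parameter and both helper names are hypothetical.

```python
from typing import Dict, Optional


def compute_sensitivity(l1_norm: Optional[float], value: float) -> float:
    """Quoted step: use l1Norm when it is non-null, otherwise 2 * value."""
    # `is not None` matters here: an l1Norm of 0 is still non-null.
    return l1_norm if l1_norm is not None else 2 * value


def compute_deduction(l1_norm: Optional[float], value: float, scale: float) -> float:
    """Compute the single deduction amount (scale is a hypothetical factor)."""
    return compute_sensitivity(l1_norm, value) * scale


def deduct_from_all(budgets: Dict[str, float], deduction: float) -> Dict[str, float]:
    """Apply the same precomputed deduction to every active budget."""
    return {name: remaining - deduction for name, remaining in budgets.items()}
```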

api.bs

address mt PR feedback and remove impression site map though we might want it back in the future to support w3c#377

api.bs

martinthomson

Some more (major) problems. Sorry for not noticing these earlier.

martinthomson · 2026-02-20T00:48:23Z

api.bs

+*   The [=global privacy budget store=] records the state
+    of the per-[=epoch=] global [=privacy budget=]
+    that applies across all [=sites=].


I just realized that there is another bug here.

The global privacy store is being indexed by an epoch index that only has relevance on a per-site basis.

For sites, we don't really try to hide when epochs start, though we don't publish a value either. If we have a single global store, we have to have a single value for when the store starts.

The obvious thing to do is pick a starting point when the global store is first used, but that leads to an interesting question: if this value might leak, then how do we prevent that from being used as a supercookie? After all, when clearing state, we want to retain the global budget state so that clearing state doesn't make privacy worse.

Another option is to align the cycle to a fixed point in time that is the same for everyone. That might work. We don't expect the limit to ever be hit, except for the very few people who have unusually highly active conversion use across many sites. The effect on skewing results might then be diffuse enough that sites using the API won't need to consider it.

So that would be my suggestion. When accessing this, use a value derived from a fixed reference point (the unix epoch is available) rather than the site-specific value.
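
As a minimal sketch of that suggestion, the global epoch index can be derived from the Unix epoch alone, so that no per-browser or per-site state is involved. The seven-day epoch length and the function name are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
# Assumed epoch length; the real value would be set by the spec or implementation.
GLOBAL_EPOCH_LENGTH = 7 * SECONDS_PER_DAY


def global_epoch_index(now_seconds: float, epoch_length: int = GLOBAL_EPOCH_LENGTH) -> int:
    """Index of the global epoch containing now_seconds (seconds since the Unix epoch)."""
    # Epoch 0 starts at the Unix epoch, so the result is the same for every browser.
    return int(now_seconds // epoch_length)
```

Because the result depends only on the clock and a shared constant, the start of the global epoch carries no browser-specific information.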

That leads me to the next problem...

Yeah, this is a good point. We've generally considered site-independent epoch start times as an option for the per-site budgets, but that doesn't let you align them all with the global epoch.

Fixing the global epoch for everyone and keeping per-site epoch start times independent seems reasonable. It probably throws some issues in the theory proofs, but at least it's implementable.

continuing the discussion on #386

martinthomson · 2026-02-20T00:56:52Z

api.bs

+*   The [=impression site quota store=] records the state
+    of per-[=impression site=] and per-[=epoch=] quota [=privacy budgets=].


The impression site quota store is indexed incorrectly as well. It uses the per-site "privacy budget key" and epoch.

For this, I think that we need to consider extensions to the epoch start store for impressions.

The alternative would be to reuse the epoch start store, so that impression site quotas refresh at exactly the same time as the per-site budget for that site. I can't see why that would be wrong, at least offhand. However, the quota is a cross-site store, which means it might require safeguards we haven't really considered for the per-site store. We essentially need to prevent the value from leaking, which isn't something we really try to do for the per-site budget (because it's just a random number that we generate for each site).

Either way, implementing this is tricky, because the key to this store will end up covering different periods. What is a single epoch query for one site might cover two epochs in the quota store.
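
To illustrate the misalignment, the sketch below lists which fixed-cadence quota epochs a single per-site epoch overlaps. It assumes both stores use the same epoch length, that quota epoch 0 starts at time 0, and that times are integer seconds; the function and parameter names are hypothetical.

```python
from typing import List


def overlapped_fixed_epochs(site_epoch_start: int, site_epoch: int, epoch_length: int) -> List[int]:
    """Fixed-cadence epoch indices covered by one per-site epoch."""
    begin = site_epoch_start + site_epoch * epoch_length  # inclusive
    end = begin + epoch_length                            # exclusive
    first = begin // epoch_length
    last = (end - 1) // epoch_length
    return list(range(first, last + 1))
```

With an epoch length of 100 and a per-site start offset of 30, per-site epoch 2 spans times 230 to 329 and so touches quota epochs 2 and 3; only a site whose start happens to be aligned touches exactly one.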

Yes, you're right: the impression site quotas should be indexed by the epoch in which the impression(s) were stored. That should be doable to adjust; we need to get the epochs for every impression for a site.

> We essentially need to prevent the value from leaking

Do you mean we don't want to leak the start time of the per-impression site quota epoch?

That's right. The time that the per-impression site quota epoch starts will be the same across websites, which means that it will be a unique identifier for a browser.

Thinking more about this, there are two things to work out:

1. When to start each epoch. As a safeguard, we might assume that it isn't operating often, so the actual alignment of epochs won't need to be evenly distributed in the same way that we distribute the per-site budget. In that spirit, we might be able to do the same as what is done for the global budget: align it to a fixed point. Like the global budget, that can't be a per-browser fixed value, as that risks leaking a per-browser identifier, but we might be able to fix a value in the spec.

2. What to do about the "single epoch" queries that don't end up hitting a single epoch against the per-impression site quota. I'm less sure about this part. It's tempting to suggest that we ignore the inter-epoch interactions for the quota. That would mean that if the per-site budget believes something to be single-quota, then that would affect impression site quotas less than if the per-site budget had to span two epochs.

> The time that the per-impression site quota epoch starts will be the same across websites, which means that it will be a unique identifier for a browser.

But we would never expose this directly to any websites; it's just private internal state for the browser. They could try to learn about it through DP queries indirectly.

> In that spirit, we might be able to do the same as what is done for the global budget: align it to a fixed point.

Yes, setting the quota start times to be the same as the global start time makes sense. The quotas exist to spread the global budget across impression sites, so it makes sense to have them aligned with the global budget.

In what I said earlier

> Yes, you're right: the impression site quotas should be indexed by the epoch in which the impression(s) were stored. That should be doable to adjust; we need to get the epochs for every impression for a site.

I think we are doing this already, because in "do attribution and fill a histogram" we loop over all epochs in the attribution window: "For each |epoch| from |startEpoch| to |currentEpoch|, inclusive:"

I think what we are doing right now is essentially to assume that all epochs are on the same cadence: per-site, global, quotas. That's the only way this for loop makes sense from |startEpoch| to |currentEpoch|.

If we change the epoch cadences to:

* global and impression quotas on a fixed cadence
* per-site on a site-independent cadence starting from a first visit

then I think we will need some way to map the attribution window lookback into the |epochs| for each per-site budget that would be considered.

Is it possible to just keep everything on a fixed cadence for now and punt on having per-site epochs be independent?

A fixed cadence for quotas and global is easiest, but you'll have to restructure the code a little. It will need the time when you are accessing multiple stores.

The basic rule is: Every time you index one of the epoch-based stores, you will need to translate |now| into an epoch specific to that store.
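
A rough sketch of that rule: each epoch-based store gets its own clock, and the same |now| is translated separately for each one. The `EpochClock` class, the store names, and the choice of time 0 as the fixed start for the global and quota stores are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class EpochClock:
    """Hypothetical per-store clock; times are integer seconds."""
    start: int   # time at which epoch 0 of this store begins
    length: int  # epoch duration

    def index(self, now: int) -> int:
        return (now - self.start) // self.length


def store_epochs(now: int, site_epoch_start: int, epoch_length: int) -> Dict[str, int]:
    """Translate one timestamp into an epoch index for each store."""
    clocks = {
        "per_site": EpochClock(site_epoch_start, epoch_length),  # per-site start
        "global": EpochClock(0, epoch_length),                   # fixed start
        "impression_quota": EpochClock(0, epoch_length),         # aligned with global
    }
    return {name: clock.index(now) for name, clock in clocks.items()}
```

Passing the timestamp rather than a precomputed epoch index is the restructuring mentioned above: an index computed for one store is not generally valid for another.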

bmcase added 5 commits November 12, 2025 09:33

Update api.bs

7a4c504

update safety limit draft

3db4368

update safety limit draft

a8c6b8d

add user action checks

f217ef5

have budget and safety checks together

3339298

bmcase mentioned this pull request Dec 4, 2025

Overly conservative check and deduct for budget #336

Open

safety limit and privacy deductions iff all can occur.

140ae91

clean up

5527b8d

bmcase changed the title ~~[WIP] Safety limits~~ [RFC] Safety limits Dec 5, 2025

martinthomson reviewed Dec 11, 2025

bmcase added 10 commits February 6, 2026 10:06

Merge branch 'main' into safety_limits

b4b0575

remove conversion site quota and store for user actions

7dd5ecd

deductions for impression sites

c42e70c

move safety configuration into implementation defined values

ec0f4c9

handle cases in compute impression site deductions

d0395ed

pass parameters

84fce81

update function signature

55634d3

fix some bikeshed links

61ff7e7

fix link

1721e6d

update user activation checks on measure conversion

a95b7b1

bmcase mentioned this pull request Feb 9, 2026

add user activation check on save impression #369

Closed

bmcase added 2 commits February 9, 2026 15:12

clean up errors and user activation

0997b83

clean up warnings

737e81c

apasel422 reviewed Feb 10, 2026

incorporate feedback

cc0f86a

apasel422 requested changes Feb 10, 2026

apasel422 requested changes Feb 11, 2026

apasel422 reviewed Feb 11, 2026

apasel422 mentioned this pull request Feb 11, 2026

Add safety limits to simulator #376

Draft

apasel422 reviewed Feb 11, 2026

bmcase added 3 commits February 11, 2026 22:55

change check attribution API activation to throw

51f2d25

address Andrew's feedback on PR

2892064

fix checks

2ca92de

apasel422 requested changes Feb 12, 2026

description of user activation for the API

7b1cf1a

bmcase added 5 commits February 12, 2026 22:53

remove impressionsBySite map

36b54e8

address PR feedback

f2daa48

fix link

d164cf7

fix errors

f740ef9

use set of impressionSites

580473a

martinthomson requested changes Feb 13, 2026

martinthomson linked an issue Feb 13, 2026 that may be closed by this pull request

Add global privacy budget and per-impression-site quotas #237

Open

martinthomson mentioned this pull request Feb 13, 2026

HTML's user activation modes don't work for us #378

Open

bmcase added 3 commits February 19, 2026 14:24

remove map of impression site deductions

8ad8c97

fix checks and links

3af4cf4

fix check

d18e5eb

apasel422 requested changes Feb 19, 2026

martinthomson linked an issue Feb 19, 2026 that may be closed by this pull request

Overly conservative check and deduct for budget #336

Open

martinthomson reviewed Feb 20, 2026

address feedback

a4d2d2c
