setcookie should overwrite cookie header with same info #18169

Open
kkmuffme opened this issue Mar 28, 2025 · 10 comments

@kkmuffme

Description

setcookie() currently always adds another HTTP header, like calling header( 'Set-Cookie: ...', false ), in all cases.

This means if the same cookie (= functionally identical, e.g. same name+path+domain+...) is set multiple times (e.g. with a different value/expiration, or for whatever reason with the same value/expiration), it will just add another Set-Cookie header.

Since HTTP headers cannot be compressed* or compression happens upstream, this causes significant performance downsides:

  • upstream needs to allocate larger buffers (nginx fastcgi_buffer) that exceed 1 or 2 memory pages, leading to significantly higher memory consumption just for the rare request that has this issue. E.g. only 0.0001% of requests send headers that exceed 4k (one memory page). Those requests send headers with an average size of 12k and a maximum of 16k.
    This means that a buffer size of 16k has to be allocated. If we can serve up to 100000 requests simultaneously (= 2000 pages that each contain ~50 js/css/images), that's more than 1GB of memory unnecessarily allocated/wasted just to ensure that the one request with this issue does not cause "upstream sent too big header while reading response header from upstream"

  • the overall page load can take significantly longer, e.g. HTTP headers are 6k and the compressed page is 6k (assuming the page is served from a PHP page cache to ignore any other effects). If the Set-Cookie headers are deduplicated, it's 0.5k of HTTP headers + 6k, effectively reducing the overall page load time by almost 50% in this example:

Date: Fri, 28 Mar 2025 08:44:58 GMT
Content-Type: application/json; charset=UTF-8
X-Content-Type-Options: nosniff
Access-Control-Expose-Headers:
Access-Control-Allow-Headers:
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=at; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=be; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=bn; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=cf; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=ch; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=da; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=de; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=en; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=es; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=fi; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=fr; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=ie; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=it; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=pt; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=sv; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Set-Cookie: some_test_foo_bar_cookie_d41d8cd98f00b204e9870998ecf8421e=nl; expires=Sat, 29-Mar-2025 08:44:59 GMT; Max-Age=86400; path=/; secure
Allow: GET, POST, PUT, PATCH, DELETE
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0, no-store, private
X-Accel-Buffering: no
Content-Encoding: br
Vary: Accept-Encoding
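
The figures in the two bullets above can be checked with a few lines of arithmetic. This is a rough sketch that assumes one response-header buffer per concurrent request and a transfer time proportional to the bytes sent; both simplifications and all variable names are assumptions for illustration, not part of the original post.

```php
<?php
// Figures taken from the issue text; all sizes in KiB.
$concurrentRequests = 100000;
$pageSizeKiB = 4;     // one memory page
$bufferSizeKiB = 16;  // needed to fit the largest observed headers

// Assumption: one header buffer is allocated per concurrent request.
$withLargeBuffersKiB = $concurrentRequests * $bufferSizeKiB;
$withSinglePageKiB = $concurrentRequests * $pageSizeKiB;
$extraGiB = ($withLargeBuffersKiB - $withSinglePageKiB) / (1024 * 1024);

// Page-load example: 6k headers + 6k body versus 0.5k headers + 6k body.
$sizeBefore = 6 + 6;
$sizeAfter = 0.5 + 6;
$savedPercent = ($sizeBefore - $sizeAfter) / $sizeBefore * 100;

printf("extra buffer memory: %.2f GiB\n", $extraGiB);       // 1.14 GiB
printf("bytes saved on the wire: %.1f%%\n", $savedPercent); // 45.8%
```

Under these assumptions the extra allocation is about 1.14 GiB, and the response shrinks by roughly 46%, which matches the "more than 1GB" and "almost 50%" estimates.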

While it's true that in many cases, these duplicate setcookie calls indicate an issue in the application, it's often outside of the scope (= 3rd party dependencies causing it or people not being aware of this issue at all in the first place) of the application.

This change could result in a significant performance gain (= time until the first part of the document is loaded) for all PHP applications on pages where they set cookies.

*in http2/3 they can, but it's practically not supported or implemented anywhere and that's unlikely to change

@iluuu1994
Member

setcookie() is really just a convenience wrapper over header(). This approach would require buffering and scanning cookies so that they can be de-duplicated, but why not do that in userland? I understand you're saying:

= 3rd party dependencies causing it or people not being aware of this issue at all in the first place

But is this argument ever valid elsewhere? We can't change language semantics because of buggy 3rd party code.

This change could result in a significant performance gain (= time until the first part of the document is loaded) for all PHP applications on pages where they set cookies.

Not really, only in applications that send the same cookie over and over again. The rest would be slower due to the buffering and scanning existing cookies. And we shouldn't optimize for buggy code.

Anyway, I don't work much on the HTTP part of PHP, maybe @bukka has opinions.

@kkmuffme
Author

This approach would require buffering

This is already happening in PHP anyway - headers_list()
Therefore this isn't true:

The rest would be slower due to the buffering

@iluuu1994
Member

iluuu1994 commented Mar 28, 2025

@kkmuffme This buffers all headers, not just cookies. Presumably, we would not want to scan through all headers, and especially not parse them, just to find out whether some cookie was already set. But yes, maybe buffering wasn't the best choice of words.

@kkmuffme
Author

especially not parse them just to find out whether some cookie was already set

In some form this already happens when using header() with the 2nd argument set to true (which is the default), so it wouldn't be much more effort to extend that slightly.

why not do that in userland?

Because it's much slower (= have to parse all headers) and also not easily possible without causing side effects, breaking other things, or losing the guarantee that it runs, because header_register_callback() overwrites any previously registered callback (= there can be only one). If I use it to fix this issue in userland, but a package I use also calls that function, then one of the two callbacks won't run.
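
For reference, a minimal sketch of what such a userland workaround might look like. The function names are hypothetical, the parsing is deliberately simplistic (it keys on name, domain and path only and keeps the last value), and it inherits the single-callback limitation described above.

```php
<?php
declare(strict_types=1);

/**
 * Build an identity key (name + domain + path) from a raw "Set-Cookie: ..." line.
 * Returns null for any other header.
 */
function set_cookie_identity(string $headerLine): ?string
{
    if (stripos($headerLine, 'Set-Cookie:') !== 0) {
        return null;
    }
    $parts = array_map('trim', explode(';', substr($headerLine, strlen('Set-Cookie:'))));
    $name = explode('=', array_shift($parts), 2)[0];
    $domain = '';
    $path = '';
    foreach ($parts as $attribute) {
        [$attrName, $attrValue] = array_pad(explode('=', $attribute, 2), 2, '');
        $attrName = strtolower(trim($attrName));
        if ($attrName === 'domain') {
            $domain = strtolower(trim($attrValue));
        } elseif ($attrName === 'path') {
            $path = trim($attrValue);
        }
    }
    return $name . "\0" . $domain . "\0" . $path;
}

/** Keep only the last Set-Cookie line per identity; other headers are ignored. */
function dedupe_set_cookie_lines(array $headerLines): array
{
    $lastByIdentity = [];
    foreach ($headerLines as $line) {
        $key = set_cookie_identity($line);
        if ($key !== null) {
            unset($lastByIdentity[$key]); // a cookie that is set again moves to the end
            $lastByIdentity[$key] = $line;
        }
    }
    return array_values($lastByIdentity);
}

/** Hypothetical installer: replaces any header callback registered earlier. */
function install_set_cookie_dedup(): void
{
    header_register_callback(function (): void {
        $cookies = dedupe_set_cookie_lines(headers_list());
        header_remove('Set-Cookie');
        foreach ($cookies as $line) {
            header($line, false);
        }
    });
}
```

Calling install_set_cookie_dedup() early in the request would be enough to activate it, but any later header_register_callback() call from another package would silently replace it.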

@kkmuffme
Author

Essentially what is happening now:
setcookie behaves like
header( 'Set-Cookie: ...', false )
but it should behave kind of like:
header( 'Set-Cookie: ...', true )

except that instead of overwriting all Set-Cookie headers, it should only overwrite one if it's essentially the same cookie

@bukka
Member

bukka commented Mar 28, 2025

As @iluuu1994 pointed out, this would be a bit more expensive because you would need to extract just name, path and domain and use only that for comparison. So I think we would need a special list for that.

I can see the use case (especially in some CMS environments where the cookie addition happens from multiple plugins / libs) but it should not be the default, because for many users that control cookie addition this is not a problem and we would just add unnecessary overhead. So I think it should be the same as for header(): an extra parameter $replace

Also I just read https://datatracker.ietf.org/doc/html/rfc6265#section-5.3 (point 11) and there is a minor semantic difference:

If the cookie store contains a cookie with the same name,
        domain, and path as the newly created cookie:

        1.  Let old-cookie be the existing cookie with the same name,
            domain, and path as the newly created cookie.  (Notice that
            this algorithm maintains the invariant that there is at most
            one such cookie.)

        2.  If the newly created cookie was received from a "non-HTTP"
            API and the old-cookie's http-only-flag is set, abort these
            steps and ignore the newly created cookie entirely.

        3.  Update the creation-time of the newly created cookie to
            match the creation-time of the old-cookie.

        4.  Remove the old-cookie from the cookie store.

As I understand it, the part about the http-only-flag might mean a slight change. It should not really be a problem in practice, but it is a minor change IMHO. Here I assume that the store would first save the cookie that is defined first and then process the duplicate. With your replace, the non-http-only cookie would simply overwrite it unless we implement this sort of logic as well. It might be worth handling it too, but it would need some testing of whether browsers really behave as specified.
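
To make the quoted steps concrete, here is a minimal model of the cookie store update. It is only a sketch: the array layout and the function name are assumptions, and it says nothing about how real browsers behave.

```php
<?php
declare(strict_types=1);

/**
 * Store $cookie in $store following the four quoted sub-steps.
 * Cookies are modeled as arrays with the keys
 * name, domain, path, value, httpOnly and creationTime (hypothetical layout).
 * Returns false when the new cookie is ignored.
 */
function store_cookie(array &$store, array $cookie, bool $fromNonHttpApi): bool
{
    $key = $cookie['name'] . "\0" . $cookie['domain'] . "\0" . $cookie['path'];
    if (isset($store[$key])) {
        $old = $store[$key];                             // sub-step 1
        if ($fromNonHttpApi && $old['httpOnly']) {       // sub-step 2
            return false;
        }
        $cookie['creationTime'] = $old['creationTime'];  // sub-step 3
        unset($store[$key]);                             // sub-step 4
    }
    $store[$key] = $cookie; // the insertion that follows in the RFC
    return true;
}
```

In this model, a second cookie with the same identity received over HTTP replaces the first but keeps its creation time, while one coming from a "non-HTTP" API is dropped if the stored cookie is http-only.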

In any case, it shows that an extra parameter would probably be more convenient. But I'm not sure how helpful this would be, because it would require adding the parameter to all calls. Maybe INI would be an option, but considering some other discussions I don't think it will get much support.

@bukka
Member

bukka commented Mar 28, 2025

I think it will be quite hard to find a clear consensus on this, so it might be necessary to go through an RFC.

@bukka
Member

bukka commented Mar 28, 2025

Or at least have a discussion on internals where everyone is fine with the proposed solution

@bukka
Member

bukka commented Mar 28, 2025

Personally, if we iron out the semantic difference (meaning there won't be any), then I wouldn't have a problem with INI, because it's more of a perf tweak that works in some cases but doesn't work elsewhere. But as I said, it might not be popular.

@kkmuffme
Author

I think we would need a special list for that.

Could store the cookie names in a separate list, and if the same cookie name is set again, set its entry to true. Then, only if we have duplicate cookies, deduplicate immediately before the headers get sent, because that way it only needs to happen once (basically what is possible with header_register_callback() in userland, but that has massive side effects because of how header_register_callback() behaves).
The performance impact of that is probably <1ms and is only incurred for requests that actually have duplicate setcookie calls (with no performance impact on other requests).

But I agree: it's important that the performance impact is negligibly small for unaffected requests.
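
The bookkeeping described above could be modeled roughly like this. The class and method names are hypothetical and this is a userland illustration of the idea, not how PHP's internals are or would be implemented.

```php
<?php
declare(strict_types=1);

/** Hypothetical model of the proposed bookkeeping; not PHP's actual internals. */
final class CookieQueue
{
    /** @var array<int, array{name: string, value: string, path: string, domain: string}> */
    private array $cookies = [];

    /** @var array<string, bool> cookie name => true once it has been set more than once */
    private array $seenNames = [];

    private bool $hasDuplicates = false;

    public function set(string $name, string $value, string $path = '', string $domain = ''): void
    {
        $this->cookies[] = compact('name', 'value', 'path', 'domain');
        if (isset($this->seenNames[$name])) {
            $this->seenNames[$name] = true;
            $this->hasDuplicates = true;
        } else {
            $this->seenNames[$name] = false;
        }
    }

    /** Return the cookies to send; deduplicate only if a name was repeated. */
    public function flush(): array
    {
        if (!$this->hasDuplicates) {
            return $this->cookies; // fast path: unaffected requests pay almost nothing
        }
        $lastByIdentity = [];
        foreach ($this->cookies as $cookie) {
            $key = $cookie['name'] . "\0" . $cookie['domain'] . "\0" . $cookie['path'];
            unset($lastByIdentity[$key]); // a cookie that is set again moves to the end
            $lastByIdentity[$key] = $cookie;
        }
        return array_values($lastByIdentity);
    }
}
```

The name-only flag can yield false positives (same name, different path), which is why flush() still compares name, domain and path before dropping anything. Because the values arrive as separate arguments, no header parsing is needed here.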

for many users that control cookie addition, this is not a problem and we would just add an unnecessary overhead

Many users have no idea that they suffer from that problem and misconfigure their servers. If you google "upstream sent too big header while reading response header from upstream", you only find results that suggest increasing the buffer, e.g. fastcgi_buffer_size 32k; which does fix the issue but is generally misguided.
