Add commands to collect and retrieve response bodies #877
Conversation
Thanks for the PR. I think we do not have clear requirements that any clients need the functionality provided by addBodyCollector, so we could exclude it for now (unless someone needs it?). At least I would not add browsing context params in the same way as we have in event subscriptions (where a context id resolves to the top-level traversable). I think we need the ability to define an overall size limit instead of (or in addition to?) a per-request limit in setBodyCollectorConfiguration, and instead of just not saving the freshest request we should probably evict earlier requests.
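To make the eviction idea concrete, a minimal sketch assuming a single total-size budget per store (all names below are illustrative, not part of the proposal):

```ts
// Sketch: a total-size budget with FIFO eviction of the oldest
// collected bodies, instead of refusing to store the newest response.
interface CollectedBody {
  requestId: string;
  body: Uint8Array;
}

class BodyStore {
  private entries: CollectedBody[] = [];
  private totalSize = 0;

  constructor(private maxTotalBodySize: number) {}

  add(entry: CollectedBody): void {
    // Evict earlier requests (FIFO) until the new body fits.
    while (
      this.entries.length > 0 &&
      this.totalSize + entry.body.byteLength > this.maxTotalBodySize
    ) {
      const evicted = this.entries.shift()!;
      this.totalSize -= evicted.body.byteLength;
    }
    // A body larger than the whole budget is simply not stored.
    if (entry.body.byteLength <= this.maxTotalBodySize) {
      this.entries.push(entry);
      this.totalSize += entry.body.byteLength;
    }
  }
}
```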
I am wondering whether, in my initial draft, I should have started collection in responseStarted (I think that would actually be required for interception use cases?).
@@ -5264,6 +5275,9 @@ given |navigable| and |navigation status|:

1. [=Resume=] with "<code>navigation committed</code>", |navigation id|, and |navigation status|.

1. For each |session| in [=active BiDi sessions=], [=delete collected response bodies=]
By this point I believe the navigation request that loaded the document has already happened, and we want to retain it. If we really want to follow the CDP model we should key the network data by document.
Is the response already completed by that time? In any case, adding a reference to the document sounds fine to me; I almost wanted to include it in the initial design.
I think the headers are read and the body starts being read in parallel. Not having our network hooks in the fetch spec makes it a bit more difficult to cross-check, but I think using the document's navigation ID would be more resilient (especially if we might be moving the collection to various hooks).
I'm having trouble with this part so far.
AFAICT the Document's navigation id gets set back to null once the load is done (step 9.7 of https://html.spec.whatwg.org/#the-end:concept-document-navigation-id). This means that we won't be able to store the navigation id for all requests.
Storing the document itself is not great either, because the document is normally only created after the response for the document's URL has started arriving.
My current thinking is to store the navigable's ongoing navigation (which is reused as the Document's during-load navigation id), and on navigation committed clear all responses which either:
- don't have any navigation id
- have a navigation id other than the navigable's ongoing navigation

It might work (see the sketch below), but it feels a bit flaky.
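For concreteness, a rough sketch of that cleanup step, assuming each stored response is tagged with the navigation id (if any) it was collected under (names are illustrative):

```ts
// On "navigation committed", drop every collected response that is
// not tied to the navigable's ongoing navigation.
interface StoredResponse {
  navigationId: string | null;
  body: Uint8Array;
}

function onNavigationCommitted(
  responses: StoredResponse[],
  ongoingNavigationId: string,
): StoredResponse[] {
  return responses.filter(
    (response) =>
      response.navigationId !== null &&
      response.navigationId === ongoingNavigationId,
  );
}
```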
Thanks for taking a look!
I'll wait for feedback from James here, in case that doesn't align with his feedback from PR #856, but I thought that was one of the main required changes: having a way to clearly declare whether you want to record responses or not. And if we do, I think it makes sense to make it consistent with all our other similar APIs, events and intercepts (note: intercepts don't have user context support yet, but they really should).
Yeah, I'm happy to update the configuration bit with a total size + FIFO approach to evict requests; let's see if there are any other requested flags/limits.
Maybe we should create the entry as early as beforeRequestSent, and have a "state" in the collected response (pending, available, evicted, ...).
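A minimal sketch of what such an entry could look like (the three states come from the comment above; everything else is assumed):

```ts
// An entry is created at beforeRequestSent in state "pending" and
// updated as data arrives or is evicted.
type CollectedResponseState = "pending" | "available" | "evicted";

interface CollectedResponse {
  requestId: string;
  state: CollectedResponseState;
  body?: Uint8Array; // only present once state === "available"
}
```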
One thing I wanted to mention regarding contexts/userContexts: on our side, considering our current implementation, it is important to have an API where clients can be selective upfront about which requests they are interested in. To record responses, Firefox duplicates them in another (parent) process. This means it's easier for us to control the availability of responses, but we probably use up more memory than Chrome does. On the client side, if you are only interested in one class of requests coming from a specific tab, and you can't define the contexts/userContexts to watch, then you have to fiddle with the "total size" configuration, hoping that the requests you are interested in are not going to be evicted first. Puppeteer and other clients can still just call it without any argument in the beginning. But considering this API is consistent with our subscription and intercept APIs, and seems beneficial for clients, I would still like us to consider it.
I agree with Julian. Given that this feature has potentially high overhead in terms of resource usage, it seems important to be able to turn it on and off with more granularity than just "did anyone subscribe to any network events and so could have a response id to allow reading the body" (even that optimization is a little hard because, in theory, as a pure optimisation one would need to keep the data around for as long as a client could still reference the response id).

I also agree that if we're adding a maximum total cache size, it's even more important that you can be specific about which response bodies you're interested in. My assumption is that test automation clients currently don't offer this kind of control because they have been based around CDP's semantics. It seems reasonable to me to assume tests generally know when they will want to access the network bodies, and so an API based on opt-in is reasonable.

We also know that devtools, for example, do collect network response bodies one tab at a time, and that kind of use case would be severely compromised if there was only global control over retention (e.g. if I'm trying to record an interactive browsing session in one tab for later replay, it's extremely important to me that everything in that tab ends up being accessible, and I actively don't want anything that happens in other tabs to affect it).
CDP does offer max size control per target (local subtree of frames). My point was mostly about limiting by URL patterns or max response body size not seemingly being too useful: most clients want everything in a tab, even if they know what URL patterns or individual response sizes they are dealing with. I think the current API very easily allows scenarios like "oh, I set the max size per response body to 10KB so my 10.1KB response was not recorded and I need to re-run everything" or "I had not realized that I needed bodies for these URLs". Most users would just set it as high as possible.

As for URL pattern matching, matching by media type even sounds more useful as the first filter. Do we have any specific clients interested in fine-grained controls beyond per-navigable total limits? If not, I would propose to simplify the proposal by adding context ids to network.SetBodyCollectorConfiguration, making it per context or global, and changing maximumBodySize to maximumTotalBodySize (I believe most clients would just be using that, and we can reduce the amount of specification and implementation needed without blocking an extension with fine-grained filtering in the future). That would require partitioning the cache per navigable (or even per document), but it looks like we would need that for cleanup as well (if we agree on the current cleanup points).
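The simplified command could then look something like this (a sketch; parameter names beyond those mentioned above are assumptions):

```ts
// Hypothetical payload: per-context or global configuration with a
// total budget instead of a per-response cap.
const configure = {
  method: "network.setBodyCollectorConfiguration",
  params: {
    contexts: ["context-12"],         // omit for a global configuration
    maximumTotalBodySize: 10_000_000, // bytes; replaces maximumBodySize
  },
};
```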
I agree that URL patterns could be dropped in the first pass, as long as we keep contexts and user contexts. My concern with just having a maximum total size, and no other filtering, is the case where you have a page with a few large assets that you're not interested in, but which might cause cache eviction of the small assets you are interested in. For example, on a media viewing page you might have pictures or videos that are hundreds of megabytes, while your test is entirely concerned with checking some data in a few kB of HTML or JSON. Without a URL (or MIME type) filter we can't easily avoid the overhead of copying that data to the parent process (at least up to the size limit), but we can avoid requiring people to set the maximum cache size to the size of all the resources on the page rather than a per-resource limit of (say) 100kB.
Thanks for the feedback! Trying to summarize where we are:
My comments on this:

1/ For the URL patterns, I agree we can drop them, but from our discussion it sounds like we want some way to exclude requests instead.

2/ For maximumBodySize: I imagined this should be used to set a reasonably high (few MBs) limit on individual requests, to avoid having the whole storage for response bodies taken up by just a few random requests (as mentioned by @jgraham). In Firefox DevTools we have a cap for individual responses to avoid storing unreasonably large ones (1MB by default, can be changed with a pref). I think it's worth having an explicit limit, but maybe it should have a default value. And maybe that should rather be a capability. I would like to keep a clearly defined limit and allow clients to override it if needed. On the Firefox side I don't think we can handle duplicating huge responses in the parent process for BiDi; we will have to implement a cap anyway.

3/ API: only add setBodyCollectorConfiguration (or another name :) ). I imagine the behaviour would be close to network.setCacheBehavior.
While it does simplify the API, it feels like a step back, closer to what we previously had for subscriptions. A model where we create unique collectors that can each be removed on their own feels less surprising? @OrKoN let me know what you think; maybe you have suggestions on how a single configuration command could work here.
Thanks for summarizing. I am still not sure we have a client with a use case right now for limiting response storage based on specific attributes of the request/response. I see that Playwright's model for Firefox is also based on the total size with eviction (I could not tell if it is per navigable or global?). Therefore, I think it would be a reasonable model to say that, as a client, you get the last X bytes of response data stored per navigable that you enabled the collection for. Eventually, if there are users requesting fine-grained control on a per-request basis, it could be added on top of that model. As for how the configuration command should work, I would say that, unlike event subscriptions, we could make it so that the last command always wins.
Basically, at any time the session, each user context, and each browsing context have a maxCacheSizePerNavigable value that is either the result of a configuration call or inherited from the "parent" object when a navigable/user context is newly created. So to stop collecting any responses, the client could send a configuration call setting that value to 0.
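Under that model, stopping collection could look like this (a sketch; the payload shape is an assumption):

```ts
// Setting the budget to 0 means no bodies are retained, and newly
// created navigables/user contexts inherit that value.
const stopCollecting = {
  method: "network.setBodyCollectorConfiguration",
  params: {
    maxCacheSizePerNavigable: 0,
  },
};
```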
I wonder if you would still need to duplicate the responses if the removal happens at the points proposed in this PR (responses do not outlive the navigable)?
Right, we can set the total size to 0. It feels a bit like a workaround? Maybe having a clean command to really stop collecting bodies wouldn't hurt. About making the max size a per-navigable configuration: it makes it easier to work with multiple calls to setConfiguration.
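For instance, multiple calls could combine like this (a reconstruction of an example that appears to have been stripped here; exact payload shapes are assumptions):

```ts
// Three hypothetical configuration calls, from least to most specific.
const calls = [
  // 1. Session-wide default for every navigable.
  {
    method: "network.setBodyCollectorConfiguration",
    params: { maxCacheSizePerNavigable: 1000 },
  },
  // 2. Override for navigables in user context "foo".
  {
    method: "network.setBodyCollectorConfiguration",
    params: { userContexts: ["foo"], maxCacheSizePerNavigable: 5000 },
  },
  // 3. Override for the single navigable with id "12".
  {
    method: "network.setBodyCollectorConfiguration",
    params: { contexts: ["12"], maxCacheSizePerNavigable: 10000 },
  },
];
```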
In that case, by default navigables have an allowed cache size of 1000, the ones in "foo" have 5000, and navigable 12 has 10000. It does mean that there's no effective max cache size anymore, though: users may create new navigables and fill the browser memory. I imagine that's not a concern in practice, but it's important to note that we can't keep this under control with this approach. At least in this model we don't have to wonder whether a request takes up space in the cache configured for its navigable or globally; the cache size is always allocated per navigable, and that seems nice.
Not really, it would require too many changes to our network events monitoring, which is almost entirely handled in the parent process for devtools/bidi. Also, we should keep the door open to relax those limitations in the future; it would be great if responses could only be evicted when the top-level traversable navigates or is destroyed. Which means I would still like to keep this configurable. Worst case, this could be driven by a preference plus a NOTE that implementations might truncate long response bodies, but I would really prefer having something consistent across browsers here.
Sidenote: I notice that CDP supports maxTotalBufferSize/maxResourceBufferSize, so unless I'm mistaken you should already have support for a per-resource limit on the CDP side?
Indeed, although in Puppeteer we have not used it so far. In issues where people want increased limits, they usually set it as high as the total size available, so I am not sure how useful it is to guess how large individual responses could be.
Before reviewing the PR in detail (I'm sure there are still syntax mistakes, not worth fixing for now), let's summarize the current state and get feedback on the overall approach. Session changes:
New command:
Updates to existing events:
@OrKoN In the last update I tried to simplify the API to keep only one method, as suggested. While this works, I'm not sure it is really a good decision at the spec level. It's functionally very close to what we had with the previous proposal, but it is less flexible and more sensitive to the order in which commands are called. With the previous approach we had something that could naturally evolve to support URL patterns and more fine-grained configurations. If libraries such as Puppeteer prefer to expose it as a simplified API, that should still be possible. Could we reconsider?
I think in the previous proposal there was also a configuration method for limits, and additional per-URL configuration methods. Could you please clarify how the current proposal would limit the addition of per-URL configuration methods?
I agree with @juliandescottes here; I feel like in this proposal the obvious things that a user might want to do (enable/disable collecting response bodies for some tab or user context) are exposed as side effects of configuring low-level details (cache sizes). I do think we need that level of configuration, but I'd prefer an API where the methods correspond to user intent, and where we can have reasonable defaults for the various tuning parameters.
In the previous approach you had one method to set a global configuration (only a max resource size, but a total max size could easily be added as well). Then add/removeBodyCollector was used to select in which contexts/userContexts the user wanted to collect bodies, with an optional urlPattern (which can still be dropped in a first iteration).
I find the current proposal harder to understand as is. You need to be aware that the order in which you call the command is important, and you might erase configurations unexpectedly. But it still remains relatively easy to predict how it's going to work without reading the spec.

Now if we add URL patterns, there are a few things to answer. If context 12 is listening to www.a.com and I want to also listen to www.b.com, how can I do it? When we set the configuration again for this context, does it add to the existing pattern, or does it override it? Then imagine we catch all requests globally with a cache size of 1000, and for context 12 we only cache JS requests, but with a cache size of 2000. If there's a request in context 12 which is not JS and doesn't match, does it still get captured because we capture all requests globally? If so, which cache size should be used?

We can answer all those questions in the spec, but I'm still concerned it will make the behavior unexpected, whereas the API where you add and remove collectors is very simple to understand.
I can see the concern, but I am not sure it's worse than the behavior of network.setCacheBehavior. I'd say the latest version aligns more with network.setCacheBehavior.
If these settings are not part of the add/removeBodyCollector methods, then changing these limits via the global configuration is similar to this proposal, in the sense that it would remove/add things from the cache otherwise handled by the body collector.
Would changing the current version's configuration to accept a list of contexts address the concern?
So, I think an API like:
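(The original snippet appears to have been lost in this view; the following is a plausible reconstruction, with field names as assumptions.)

```ts
// Sketch of an overwrite-style configuration command: each call
// replaces the configuration for the navigables it selects,
// inheriting session -> user context -> context.
interface SetBodyCollectorConfigurationParams {
  contexts?: string[];      // top-level traversables
  userContexts?: string[];
  maxResourceSize?: number; // per-response cap, in bytes
  maxTotalSize?: number;    // per-traversable budget, in bytes
}
```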
where the arguments select which navigables the configuration applies to and what the size limits are. However, as has been discussed, the limitation of this kind of design is that the only sensible update behaviour is an overwrite; basically each top-level traversable gets a single response body configuration that's inherited from its user context at creation, and subsequent commands overwrite the configuration (purging the cache as necessary to match). In this case it's a WebDriver-layer feature, so arguably there's value in looking at how we've designed other similar WebDriver features. In particular it seems very similar to adding intercepts or subscribing to events, where each call hands back an id that can later be removed on its own.
That does match the way that we've tried to vend handles to things rather than just mutate global state, so that code can only remove what it created. For example, a test fixture would be able to enable/disable collecting response bodies without having to worry about whether it had also been enabled by some other component. That significantly simplifies client code. I think there's a complex version of this where we try to only provide a body if it matched the limits of a certain intercept.
So if multiple AddBodyCollector calls are made for the same objects (contexts), which one applies? I think with events and intercepts the situation is different, because no matter how many you have, the result is a boolean outcome (is the event emitted, is the request intercepted). I think AddBodyCollector is fine as long as the resolution of the effective configuration happens at AddBodyCollector call time and does not need to be re-computed per request. WDYT?
Perhaps a solution could be that if there are any added body collectors that the request matches, then the body is retained; but I am worried that the network overhead would grow linearly with the number of those collectors.
My initial proposal was the same: a body collector would just tell you whether or not the body should be persisted. In James' proposal there is a slight change to also make the max resource size part of this configuration (but not the total size). I think the suggestion on how to handle the case where several collectors match is to pick up the max value from all matching collectors:
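A sketch of that resolution rule (the types here are assumed):

```ts
// When several collectors match a request, the body is kept and the
// largest per-resource limit among them applies.
interface BodyCollector {
  maxResourceSize: number;
  matches(requestContextId: string): boolean;
}

function effectiveLimit(
  collectors: BodyCollector[],
  requestContextId: string,
): number | null {
  const matching = collectors.filter((c) => c.matches(requestContextId));
  if (matching.length === 0) return null; // body is not persisted
  return Math.max(...matching.map((c) => c.maxResourceSize));
}
```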
Personally I would slightly prefer to have all those size settings be global, but that can work too.
Yeah, I think this matches what James was suggesting. I don't quite understand how this would lead to more overhead than the other proposal, though?
That's what I'm proposing
I think you can always compute the current configuration at add/removal time, so for sizes only it's just a fixed overhead. If we had other parameters like URL patterns or media types, then there might be a linear overhead in the number of those (for media types it could be O(1), but not for URL patterns); you could still compute the current configuration when a collector was added or removed, keeping the behaviour O(1) in the number of collectors (but not in the number of URL patterns) for each request. In practice I'm not sure why the number of collectors would grow large. I can see N=2 applying to the same request being common (a fixture configures something for a specific user context, and then a test configures something for a specific browsing context), but the chance of anything much higher seems small.
It would require synchronizing the lists with the process where the request happens and doing a lookup per request. Chromium's implementation is out of process, and with current CDP capabilities it would mean performance characteristics similar to request interception.
There is synchronization, but I think you could synchronize the computed per-top-level-traversable state rather than the lists? E.g.:
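Something along these lines (a sketch; the exact record shape is an assumption):

```ts
// The parent process resolves all collectors down to one record per
// top-level traversable, and only this map is shipped to the
// content/network processes.
type TopLevelTraversableId = string;

interface ResolvedCollection {
  collectBodies: boolean;
  maxResourceSize: number; // bytes
  maxTotalSize: number;    // bytes
}

const resolved = new Map<TopLevelTraversableId, ResolvedCollection>([
  ["traversable-1", { collectBodies: true, maxResourceSize: 100_000, maxTotalSize: 10_000_000 }],
  ["traversable-2", { collectBodies: false, maxResourceSize: 0, maxTotalSize: 0 }],
]);
```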
In other words, the parent process owns the lists; the content/network processes only see the resolved values.
I think addBodyCollector with limits being computed at update time would work (and it does not change the deletion and collection logic much). A global configuration method would not be needed then? I am not sure whether it makes it more difficult to add URL patterns later, since different patterns could have (partially) conflicting limits, but perhaps the URL pattern configuration should not be allowed to change any limits.
I agree that N > 2 would be uncommon in practice; that's why I think the current proposal would also work for examples like testharness vs. test code.
Overview of what this PR aims to add:
New BiDi session items:
New commands:
New error:
Updates to existing commands:
Note that I haven't added extra limitations on which responses are collected in responseCompleted, but we can definitely add them (e.g. no worker requests, etc.).
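For illustration, a hypothetical end-to-end exchange (the command names below are placeholders, not necessarily the ones this PR specifies):

```ts
// Enable collection for a user context, then, after a
// responseCompleted event, retrieve the collected body by request id.
const startCollecting = {
  method: "network.addBodyCollector", // assumed name
  params: { userContexts: ["default"] },
};

const getBody = {
  method: "network.getResponseBody", // assumed name
  params: { request: "request-123" },
};
```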