docs: reflect the recent SessionPool changes in a guide #3724
janbuchar
left a comment
Many of the comments are not blocking — just stuff that popped into my head when reading this.
| These are the basics of configuring SessionPool. | ||
| Please, bear in mind that a Session pool needs time to find working IPs and build up the pool, | ||
| so we will probably see a lot of errors until it becomes stabilized. | ||
| These are the basics of configuring the session pool. Bear in mind that the pool needs time to find working IPs and build itself up, so you will probably see a number of errors until it stabilizes. |
Is this actually true? Also, it fits better in the avoid_blocking guide IMO.
| You influence this with three methods on the session. <ApiLink to="core/class/Session#markGood">`markGood()`</ApiLink> records a successful use — it increments the usage count and heals the error score a little (by `errorScoreDecrement`, default `0.5`). <ApiLink to="core/class/Session#markBad">`markBad()`</ApiLink> records a failure that *might* be the session's fault and *might* just be bad luck — it raises the error score by one, so a session needs to fail repeatedly before it is dropped. <ApiLink to="core/class/Session#retire">`retire()`</ApiLink> drops the session immediately and permanently; this is what you call when you are certain the identity itself is burnt (for example, a `403` response). | ||
| The distinction between `markBad()` and `retire()` matters. Use `markBad()` for transient, external problems such as a timeout or a `5XX` response — the IP is probably fine and a couple of retries should not throw it away. Use `retire()` for problems that prove the session is blocked, where reusing it is pointless. Note that in v4 retirement is terminal: once a session is retired, a later `markGood()` will not bring it back. |
After we rename markBad and markGood, this paragraph will become obsolete.
That is, if we manage to come up with anything better. Even then, imo practice (i.e., repeating this in docs) makes perfect.
E.g. Session.recordFailure and Session.recordSuccess imo break the idea of Session as a simple data struct and hint at some deeper tallying logic. Other ideas like .succeeded() and .failed() don't really follow our naming conventions for methods.
Let's discuss this further under #3663, but perhaps we don't need this breaking change after all (it started as our hunch anyway, not from an outside user).
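To make the `markGood()` / `markBad()` / `retire()` semantics from the paragraph under review concrete, here is a rough sketch. It is a toy model, not Crawlee's `Session` class: `ToySession`, `classifyOutcome`, and the `maxErrorScore` threshold of `3` are assumptions made up for illustration, and only the `0.5` decrement, the `+1` on failure, and the terminal retirement come from the text above.

```typescript
// Toy model for illustration only. This is NOT Crawlee's Session implementation.
type Outcome = 'markGood' | 'markBad' | 'retire';

// Hypothetical helper: picks an action from an HTTP status.
// `undefined` stands for a timeout. The status mapping is illustrative.
function classifyOutcome(status: number | undefined): Outcome {
    if (status === undefined) return 'markBad'; // timeout: transient, maybe bad luck
    if (status === 403) return 'retire'; // the identity itself is burnt
    if (status >= 500) return 'markBad'; // 5XX: external problem, the IP is probably fine
    return 'markGood';
}

class ToySession {
    errorScore = 0;
    usageCount = 0;
    retired = false;
    readonly maxErrorScore: number;
    readonly errorScoreDecrement: number;

    // maxErrorScore = 3 is an assumed threshold; 0.5 is the decrement quoted in the guide.
    constructor(maxErrorScore = 3, errorScoreDecrement = 0.5) {
        this.maxErrorScore = maxErrorScore;
        this.errorScoreDecrement = errorScoreDecrement;
    }

    // Successful use: count it and heal the error score a little.
    markGood(): void {
        if (this.retired) return; // retirement is terminal, markGood() cannot undo it
        this.usageCount += 1;
        this.errorScore = Math.max(0, this.errorScore - this.errorScoreDecrement);
    }

    // Possible failure: raise the score by one, drop the session only after repeated failures.
    markBad(): void {
        if (this.retired) return;
        this.errorScore += 1;
        if (this.errorScore >= this.maxErrorScore) this.retired = true;
    }

    // Certain failure: drop the session immediately and permanently.
    retire(): void {
        this.retired = true;
    }
}
```

In this model a session survives two or three `markBad()` calls, especially with a `markGood()` in between, while a single `retire()` ends it for good.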
| await sessionPool.addSession({ id: 'cheap', proxyInfo: proxyInfoFromUrl('http://cheap-proxy.com') }); | ||
| await sessionPool.addSession({ id: 'premium', proxyInfo: proxyInfoFromUrl('http://expensive-proxy.com') }); |
And what if I want multiple cheap sessions for example, so that I can alternate between them? 🙂
It's possible, by creating multiple cheap-[xyz] sessions, keeping track of these ids and reassigning the request.sessionId with a getCheapSession() helper (that cycles between the cheap session ids).
I'd argue that we should describe the main principles in the docs (assigning sessions to requests), but any more complex usage should be left as an exercise for the reader, as it doesn't really add that much information. Wdyt?
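For reference, the `getCheapSession()` idea from this thread could look roughly like the sketch below. It is only a round-robin over known ids and does not touch the pool itself. The `cheap-1` style ids, the `createSessionCycler` and `assignCheapSession` names, and the `RequestLike` shape are assumptions for illustration; the sessions would still have to be added to the pool under those ids separately.

```typescript
// Hypothetical minimal stand-in for a request that carries a sessionId.
interface RequestLike {
    url: string;
    sessionId?: string;
}

// Returns a function that cycles through the given session ids in order.
function createSessionCycler(ids: readonly string[]): () => string {
    if (ids.length === 0) throw new Error('at least one session id is required');
    let next = 0;
    return () => {
        const id = ids[next] as string;
        next = (next + 1) % ids.length; // wrap around after the last id
        return id;
    };
}

// Assumed ids: these must match the ids the sessions were added to the pool with.
const getCheapSession = createSessionCycler(['cheap-1', 'cheap-2', 'cheap-3']);

// Returns a copy of the request pinned to the next cheap session in the cycle.
function assignCheapSession(request: RequestLike): RequestLike {
    return { ...request, sessionId: getCheapSession() };
}
```

Retired sessions are not handled here: a fuller version would need to drop a retired id from the cycle or replace it with a fresh one.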
Adds recent changes to the SessionPool and related classes to the existing guides.

Closes #796