Skip to content

Add connection timeout to TLS Channel #1686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

vbabanin
Copy link
Member

@vbabanin vbabanin commented Apr 25, 2025

@vbabanin vbabanin self-assigned this Apr 25, 2025
@katcharov katcharov requested review from jyemin and stIncMale April 25, 2025 18:59

SSLParameters sslParameters = sslEngine.getSSLParameters();
enableSni(getServerAddress().getHost(), sslParameters);
group.getTimeoutExecutor().schedule(() -> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For better scalability, I believe we should use a dedicated timeout executor for connection timeouts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't group.getTimeoutExecutor() dedicated to timeouts?


SSLParameters sslParameters = sslEngine.getSSLParameters();
enableSni(getServerAddress().getHost(), sslParameters);
group.getTimeoutExecutor().schedule(() -> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//TODO connectTimeoutMs being 0 - should not schedule timeout interruption.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6b838af.

@@ -121,23 +144,29 @@ private Pair(final SocketChannel socketChannel, final Runnable attachment) {
}
}

// Monitors OP_CONNECT events.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this rather be documented on SelectorMonitor via a documentation comment, as opposed to being a free-floating code comment?

enum ConnectionRegistrationState {
CONNECTING,
CONNECTED,
TIMEOUT_OUT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIMEOUT_OUT -> TIMED_OUT?


private Pair(final SocketChannel socketChannel, final Runnable attachment) {
enum ConnectionRegistrationState {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConnectionRegistrationState is not used outside of SocketRegistration.

Suggested change
enum ConnectionRegistrationState {
private enum ConnectionRegistrationState {

SSLEngine sslEngine = getSslContext().createSSLEngine(getServerAddress().getHost(),
getServerAddress().getPort());
sslEngine.setUseClientMode(true);
int connectTimeoutMs = getSettings().getConnectTimeout(TimeUnit.MILLISECONDS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both SocketStream and NettyStream use TimeoutContext.getConnectTimeoutMs. It seems we should use the same in TlsChannelStream.

int connectTimeoutMs = operationContext.getTimeoutContext().getConnectTimeoutMs();

Furthermore, TimeoutContext.getConnectTimeoutMs requires us to handle MongoOperationTimeoutException, provided that we plan to throw InterruptedByTimeoutException instead of it, as the closeAndTimeout method does.

Copy link
Member Author

@vbabanin vbabanin Apr 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I’ve updated it to use TimeoutContext.getConnectTimeoutMs.

Regarding the handling - both SocketStream and NettyStream propagate the exception down the stack without catching it in stream.openAsync. InternalStreamConnection.openAsync wraps stream.openAsync in a try/catch and handles MongoOperationTimeoutException there:

} catch (Throwable t) {
close();
callback.onResult(null, t);
}

Generally, I think that this exception should be handled in AsyncCompletitionHandler rather then synchronous try/catch around stream.openAsync (currently some are thrown from stream.openAsync and some are propagated to a handler) - we could reconsider this behavior as part of the async refactoring epic.

NettyStream and SocketStream throw MongoOperationTimeoutException as is, because "connection process" has not started yet. I moved operationContext.getTimeoutContext().getConnectTimeoutMs(); before socketChannel.connect(getSocketAddresses(getServerAddress(), inetAddressResolver).get(0)); to align with the same behaviour. In TlsChannelStream case, MongoOperationTimeoutException is propagated to the handler. Let me know what you think.

this.connectionRegistrationState = new AtomicReference<>(ConnectionRegistrationState.CONNECTING);
}

public boolean markConnectionEstablishmentTimedOut() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public boolean markConnectionEstablishmentTimedOut() {
boolean markConnectionEstablishmentTimedOut() {

ConnectionRegistrationState.TIMEOUT_OUT);
}

public boolean markConnectionEstablishmentCompleted() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public boolean markConnectionEstablishmentCompleted() {
boolean markConnectionEstablishmentCompleted() {

enableSni(getServerAddress().getHost(), sslParameters);
private void scheduleTimeoutInterruption(final AsyncCompletionHandler<Void> handler,
final SelectorMonitor.SocketRegistration socketRegistration,
final SocketChannel socketChannel,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should not accept socketChannel because socketRegistration already contains it.

//TODO refactor ths draft
InterruptedByTimeoutException timeoutException = new InterruptedByTimeoutException();
try {
socketChannel.close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place where we close the SocketChannel opened in TlsChannelStream.openAsync. Shouldn't we close all the SocketChannels registered with SelectorMonitor, and do it in a way that makes sure than none of them stays open due to a race condition?

Comment on lines 110 to 114
enum ConnectionRegistrationState {
CONNECTING,
CONNECTED,
TIMEOUT_OUT
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose a change that does not involve introducing ConnectionRegistrationState, and instead of boolean markConnectionEstablishmentTimedOut/boolean markConnectionEstablishmentCompleted methods exposes boolean tryCancel/void runIfNotCancelled on SocketRegistration: stIncMale@f4e9f37.

What do you think?

P.S. I made the change blindfolded, without running any tests.

@stIncMale
Copy link
Member

The last reviewed commit is 238268c.

vbabanin and others added 2 commits April 25, 2025 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants