Kestrel on linux logs a first chance exception with every EC2 health check request #61081

Open
1 task done
gfody opened this issue Mar 21, 2025 · 4 comments
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions

gfody commented Mar 21, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

On Linux, we see a first-chance exception with every HTTPS health check request from EC2.

Expected Behavior

The health check requests should not cause exceptions. The volume of exceptions appears to be increasing GC heap size, CPU usage, and process memory usage.

Steps To Reproduce

A simple web app that logs first-chance exceptions, e.g.:

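// my_ssl_cert is an X509Certificate2 loaded elsewhere (not shown in this report).
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddSystemd();
builder.Services.AddWindowsService();
builder.WebHost
    .ConfigureKestrel(kso => kso.ConfigureHttpsDefaults(
        o => o.ServerCertificate = my_ssl_cert))
    .UseUrls("https://+:443");

// Log every first-chance exception, including ones that are later caught and handled.
// logger is a logging instance configured elsewhere (not shown in this report).
AppDomain.CurrentDomain.FirstChanceException += (o, e) =>
    logger.Warning(e.Exception, e.Exception.Message);

builder.Build().Run();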

Deploy as a systemd service, e.g.:

if (!(gcm dotnet -ea silent) -or !(dotnet --list-runtimes | grep 'AspNetCore.App 9.0')) {
  curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --channel 9.0 --runtime aspnetcore --install-dir /usr/lib64/dotnet
  if (!(gcm dotnet -ea silent)) { ln -s /usr/lib64/dotnet/dotnet /usr/bin/dotnet -f }
}

New-Item -Force -Path /etc/systemd/system/test-site.service -ItemType file -Value @"
[Unit]
Description=Test Site

[Service]
WorkingDirectory=/path/to/site
ExecStart=/usr/bin/dotnet TestSite.dll
Restart=always

[Install]
WantedBy=multi-user.target
"@

systemctl daemon-reload
systemctl restart test-site
systemctl enable test-site

Add the instance to a TLS target group in EC2. The health check requests then trigger these exceptions several times a second.

Exceptions (if any)

System.IO.IOException: Received an unexpected EOF or 0 bytes from the transport stream.
   at System.Net.Security.SslStream.ReceiveHandshakeFrameAsync[TIOAdapter](CancellationToken cancellationToken)
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](Boolean receiveFirst, Byte[] reAuthenticationData, CancellationToken cancellationToken)
   at Microsoft.AspNetCore.Server.Kestrel.Https.Internal.HttpsConnectionMiddleware.OnConnectionAsync(ConnectionContext context)

.NET Version

9.0.3

Anything else?

The environment is AWS EC2. We've tested with Amazon Linux 2023 and Ubuntu 24.04, and with every version of .NET from 8.0.0 to 9.0.3.

The exceptions don't appear on Windows (tested on Server 2022 and 2025).

Output from dotnet --info:

Host:
  Version:      9.0.3
  Architecture: arm64
  Commit:       831d23e561
  RID:          linux-arm64

.NET SDKs installed:
  No SDKs were found.

.NET runtimes installed:
  Microsoft.AspNetCore.App 9.0.3 [/usr/lib64/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 9.0.3 [/usr/lib64/dotnet/shared/Microsoft.NETCore.App]

dotnet-issue-labeler bot added the area-networking label on Mar 21, 2025

gfody changed the title from "Kestrel on linux logs a first chance exception with every https request" to "Kestrel on linux logs a first chance exception with every EC2 health check request" on Mar 21, 2025

@adityamandaleeka
Member

That exception seems to indicate that the EC2 health checks might be initiating a TLS connection but closing it abruptly (maybe even before the handshake completes?). This could just be due to how they implemented these health checks.

Getting a packet capture would help confirm the exact details here since you're just interested in the handshake and connection flow.

cc @MihaZupan @wfurt

@wfurt
Member

wfurt commented Mar 23, 2025

I have also seen this when there is a mismatch between HTTP and HTTPS. A packet capture or some other form of tracing would be useful.

@gfody
Author

gfody commented Mar 29, 2025

Here's a tcpdump of all the traffic between the load balancer and one of the target instances while the exceptions were happening. Towards the end I switched the health check from HTTPS to TCP and confirmed that the exceptions stopped.

61081.pcap.gz

@adityamandaleeka
Member

Yeah, based on that pcap, it looks like the theory above is correct: TCP connections are being established and then immediately closed. I can see a SYN, SYN-ACK, ACK, then an immediate FIN from the client without any data being sent over the connection (e.g. a ClientHello or any HTTP data).
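
For anyone who wants to try this without an EC2 load balancer, the connection pattern seen in the capture (TCP handshake, then an immediate FIN with no bytes sent) can be approximated with a small client. This is a minimal sketch under stated assumptions, not what the EC2 health checker actually runs; the class name `ConnectAndClose`, the method name `ProbeAsync`, and any host and port you pass in are hypothetical.

```csharp
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper: approximates the connect-then-close pattern from the pcap.
public static class ConnectAndClose
{
    // Opens a TCP connection to host:port and closes it without writing any bytes,
    // so the server never receives a TLS ClientHello or any HTTP data.
    public static async Task ProbeAsync(string host, int port)
    {
        using var client = new TcpClient();
        await client.ConnectAsync(host, port);
        // Leaving the using scope disposes the client and closes the socket.
        // With no unread data and default linger settings this is a normal close (FIN).
    }
}
```

Calling `await ConnectAndClose.ProbeAsync("localhost", 443)` in a loop against the repro app should show whether the same IOException gets logged. That outcome is an expectation based on the capture, not something verified here.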
