Low concurrency causes Tomcat to hang #3534

Closed
toint-admin opened this issue Mar 24, 2025 · 14 comments · Fixed by #3540

toint-admin commented Mar 24, 2025

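Environment: Linux, JDK 21
weixin-java-cp: 4.7.0
Framework configuration: all defaults, nothing changed

Shortly after 5 p.m. the service received a dozen or so requests to send Official Account template messages, and Tomcat hung without logging a single exception.

I then restarted the server and fired another dozen or so concurrent requests as a test; it hung again.

After switching to okhttp everything worked...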


  1. Client: httpclient
  2. connectionRequestTimeout = -1
  3. connectTimeout=5000
  4. socketTimeout=5000

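My guess: connectTimeout=5000 and socketTimeout=5000 are clearly in effect, so why didn't the request time out with an error and end the task? connectionRequestTimeout should be the time spent waiting to obtain an execution thread from the thread pool. That wait never ends, so the threads stay blocked and are never released.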
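The stack traces further down show the threads parked in AbstractConnPool.getPoolEntryBlocking, which is the step that leases a connection from the pool. connectTimeout and socketTimeout only apply once a connection has been obtained, so neither can end that wait. The sketch below models only the lease step with a plain JDK Semaphore. The class name LeaseTimeoutSketch and its methods are hypothetical, and the rule "a timeout of zero or less means wait indefinitely" is an assumption chosen to match the behaviour reported here, not a copy of the HttpClient implementation.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for a bounded connection pool; not HttpClient code. */
public class LeaseTimeoutSketch {
    private final Semaphore slots;

    public LeaseTimeoutSketch(int maxConnections) {
        this.slots = new Semaphore(maxConnections, true);
    }

    /**
     * Leases one slot. Assumption: timeoutMillis <= 0 means "wait indefinitely",
     * which mirrors what connectionRequestTimeout = -1 appears to do in this report.
     */
    public void lease(long timeoutMillis) throws InterruptedException, TimeoutException {
        if (timeoutMillis <= 0) {
            slots.acquire();                 // untimed: parks until someone releases
            return;
        }
        if (!slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("no connection available within " + timeoutMillis + " ms");
        }
    }

    public void release() {
        slots.release();
    }

    /** Exhausts a pool of size 1, then reports whether a second lease fails fast. */
    public static boolean timesOutWhenExhausted(long timeoutMillis) throws InterruptedException {
        LeaseTimeoutSketch pool = new LeaseTimeoutSketch(1);
        try {
            pool.lease(timeoutMillis);       // takes the only slot
            pool.lease(timeoutMillis);       // pool is now exhausted
            return false;
        } catch (TimeoutException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second lease timed out: " + timesOutWhenExhausted(100));
    }
}
```

Only call timesOutWhenExhausted with a positive value. With a value of zero or less, the second lease would park forever, which is the hang described in this report.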

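Could you all take a look? Has anyone run into something similar? I spent the better part of a month investigating and only got lucky enough to reproduce it today.

hprof file

Thread ID = 57, 'ForkJoinPool-1-worker-1'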
	jdk.internal.misc.Unsafe.park(Unsafe.java)
	java.lang.VirtualThread.parkOnCarrierThread(VirtualThread.java:675)
	java.lang.VirtualThread.park(VirtualThread.java:607)
	java.lang.System$2.parkVirtualThread(System.java:2643)
	jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
	java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
	java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
	java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1712)
	org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:391)
	org.apache.http.pool.AbstractConnPool.access$300(AbstractConnPool.java:70)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:253)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:198)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:306)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
	org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
	org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:42)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:23)
	me.chanjar.weixin.mp.api.impl.BaseWxMpServiceImpl.executeInternal(BaseWxMpServiceImpl.java:475)
	jdk.internal.vm.Continuation.enterSpecial(Continuation.java)
	jdk.internal.vm.Continuation.run(Continuation.java:251)
	java.lang.VirtualThread.runContinuation(VirtualThread.java:245)
	java.lang.VirtualThread$$Lambda+0x000079d7088f4b58.run(Native method)
	java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
	java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

Thread ID = 58, 'ForkJoinPool-1-worker-2'
	jdk.internal.misc.Unsafe.park(Unsafe.java)
	java.lang.VirtualThread.parkOnCarrierThread(VirtualThread.java:675)
	java.lang.VirtualThread.park(VirtualThread.java:607)
	java.lang.System$2.parkVirtualThread(System.java:2643)
	jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
	java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
	java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
	java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1712)
	org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:391)
	org.apache.http.pool.AbstractConnPool.access$300(AbstractConnPool.java:70)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:253)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:198)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:306)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
	org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
	org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:42)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:23)
	me.chanjar.weixin.mp.api.impl.BaseWxMpServiceImpl.executeInternal(BaseWxMpServiceImpl.java:475)
	jdk.internal.vm.Continuation.enterSpecial(Continuation.java)
	jdk.internal.vm.Continuation.run(Continuation.java:251)
	java.lang.VirtualThread.runContinuation(VirtualThread.java:245)
	java.lang.VirtualThread$$Lambda+0x000079d7088f4b58.run(Native method)
	java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
	java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

Thread ID = 59, 'ForkJoinPool-1-worker-3'
	jdk.internal.misc.Unsafe.park(Unsafe.java)
	java.lang.VirtualThread.parkOnCarrierThread(VirtualThread.java:675)
	java.lang.VirtualThread.park(VirtualThread.java:607)
	java.lang.System$2.parkVirtualThread(System.java:2643)
	jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
	java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
	java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
	java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1712)
	org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:391)
	org.apache.http.pool.AbstractConnPool.access$300(AbstractConnPool.java:70)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:253)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:198)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:306)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
	org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
	org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:42)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:23)
	me.chanjar.weixin.mp.api.impl.BaseWxMpServiceImpl.executeInternal(BaseWxMpServiceImpl.java:475)
	jdk.internal.vm.Continuation.enterSpecial(Continuation.java)
	jdk.internal.vm.Continuation.run(Continuation.java:251)
	java.lang.VirtualThread.runContinuation(VirtualThread.java:245)
	java.lang.VirtualThread$$Lambda+0x000079d7088f4b58.run(Native method)
	java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
	java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

Thread ID = 60, 'ForkJoinPool-1-worker-4'
	jdk.internal.misc.Unsafe.park(Unsafe.java)
	java.lang.VirtualThread.parkOnCarrierThread(VirtualThread.java:675)
	java.lang.VirtualThread.park(VirtualThread.java:607)
	java.lang.System$2.parkVirtualThread(System.java:2643)
	jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
	java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
	java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
	java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
	java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1712)
	org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:391)
	org.apache.http.pool.AbstractConnPool.access$300(AbstractConnPool.java:70)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:253)
	org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:198)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:306)
	org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
	org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
	org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:42)
	me.chanjar.weixin.common.util.http.apache.ApacheSimplePostRequestExecutor.execute(ApacheSimplePostRequestExecutor.java:23)
	me.chanjar.weixin.mp.api.impl.BaseWxMpServiceImpl.executeInternal(BaseWxMpServiceImpl.java:475)
	jdk.internal.vm.Continuation.enterSpecial(Continuation.java)
	jdk.internal.vm.Continuation.run(Continuation.java:251)
	java.lang.VirtualThread.runContinuation(VirtualThread.java:245)
	java.lang.VirtualThread$$Lambda+0x000079d7088f4b58.run(Native method)
	java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1423)
	java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

@yangmengyu2021
Contributor

I've recently been using virtual threads with a concurrency of only a few dozen, and I ran into the same problem as you.

@yangmengyu2021
Contributor

@toint-admin

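@toint-admin
Author

> I've recently been using virtual threads with a concurrency of only a few dozen, and I ran into the same problem as you.

Try switching to the okhttp implementation. I haven't noticed anything unusual since I switched.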

@yangmengyu2021
Contributor

yangmengyu2021 commented Apr 6, 2025

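I've found the cause. The underlying code does not set a sensible default connectionRequestTimeout on the HttpClient, so as soon as a burst of concurrency exceeds the HttpClient's default maximum number of concurrent connections (usually 10 or 20), the threads that arrive later have to wait. With platform threads alone this wait is harmless and ends sooner or later. With virtual threads the wait is unbounded: the carrier threads (the threads that schedule virtual threads) are all occupied by the waiting threads, so even after some connections finish their requests there is no spare carrier thread left to do any scheduling. The system ends up stuck in this awkward state forever. We can of course set a connectionRequestTimeout when initializing the HttpClient, but similar situations may show up elsewhere. Virtual threads have many pitfalls; if the system is not robust enough, or the frameworks it relies on have not fully embraced reactive programming, it is best not to use virtual threads. @toint-admin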
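The starvation described above can be imitated with ordinary platform threads, which keeps the sketch runnable on any JDK. This is an analogy under stated assumptions, not the virtual-thread scheduler itself: a fixed pool of two workers stands in for the carrier threads, a Semaphore with one permit stands in for the connection pool, and the class and method names (CarrierStarvationSketch, releaserRan) are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Analogy only: fixed-pool workers play the role of carrier threads. */
public class CarrierStarvationSketch {

    /**
     * Returns true if the task that hands the connection back got to run within
     * one second, false if every worker was stuck waiting for a connection.
     * leaseTimeoutMillis <= 0 means "wait for a connection indefinitely".
     */
    public static boolean releaserRan(final long leaseTimeoutMillis) throws Exception {
        final int carriers = 2;                              // stand-in for carrier threads
        ExecutorService workers = Executors.newFixedThreadPool(carriers);
        final Semaphore connections = new Semaphore(1);      // stand-in for the connection pool
        connections.acquire();                               // the only connection is in use

        // Waiters arrive first and occupy every worker while they wait for a connection.
        Runnable waiter = () -> {
            try {
                if (leaseTimeoutMillis <= 0) {
                    connections.acquire();                   // untimed wait
                } else if (!connections.tryAcquire(leaseTimeoutMillis, TimeUnit.MILLISECONDS)) {
                    return;                                  // gave up; the worker is free again
                }
                connections.release();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int i = 0; i < carriers; i++) {
            workers.execute(waiter);
        }

        // The in-flight request needs a worker to finish and return its connection.
        Runnable release = () -> connections.release();
        Future<?> releaser = workers.submit(release);
        try {
            releaser.get(1, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException starved) {
            return false;
        } finally {
            workers.shutdownNow();                           // interrupts any stuck waiters
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("untimed lease, releaser ran: " + releaserRan(-1));
        System.out.println("200 ms lease timeout, releaser ran: " + releaserRan(200));
    }
}
```

With an untimed lease both workers wait for a permit that only the queued task can return, so that task never runs. With a lease timeout the waiters give up, a worker becomes free, and the connection is handed back.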

@yangmengyu2021
Contributor

yangmengyu2021 commented Apr 6, 2025

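Okhttp3 is very friendly to virtual threads. It is a reactive framework and its connection-acquisition logic is non-blocking, which is why the problem does not appear after switching to Okhttp3. If you stick with virtual threads, though, you may hit something similar with JDBC, because JDBC is also built on blocking I/O (BIO): under high concurrency with slow database writes, the same awkward situation may arise.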

@yangmengyu2021
Contributor

yangmengyu2021 commented Apr 6, 2025

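I've submitted a pull request that at least sets a default value for connectionRequestTimeout, so this scenario can no longer hang forever. If the default does not suit, developers can still initialize it themselves. cc @binarywang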

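@toint-admin
Author

> I've found the cause. The underlying code does not set a sensible default connectionRequestTimeout on the HttpClient, so as soon as a burst of concurrency exceeds the HttpClient's default maximum number of concurrent connections (usually 10 or 20), the threads that arrive later have to wait. With platform threads alone this wait is harmless and ends sooner or later. With virtual threads the wait is unbounded: the carrier threads (the threads that schedule virtual threads) are all occupied by the waiting threads, so even after some connections finish their requests there is no spare carrier thread left to do any scheduling. The system ends up stuck in this awkward state forever. We can of course set a connectionRequestTimeout when initializing the HttpClient, but similar situations may show up elsewhere. Virtual threads have many pitfalls; if the system is not robust enough, or the frameworks it relies on have not fully embraced reactive programming, it is best not to use virtual threads. @toint-admin

Thanks.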

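@toint-admin
Author

> Okhttp3 is very friendly to virtual threads. It is a reactive framework and its connection-acquisition logic is non-blocking, which is why the problem does not appear after switching to Okhttp3. If you stick with virtual threads, though, you may hit something similar with JDBC, because JDBC is also built on blocking I/O (BIO): under high concurrency with slow database writes, the same awkward situation may arise.

JEP 491: Synchronize Virtual Threads without Pinning. It improves the scalability of Java code and libraries that use synchronized methods and statements, helping developers be more productive. The feature lets virtual threads release their underlying platform threads, giving developers access to more virtual threads to manage their applications' workloads.

@yangmengyu2021 A question: would this Java 24 feature solve the problem you describe, at least to some extent?

Now that you mention it, I'm starting to worry about how safe virtual threads are to use. We use them heavily now, and a ship this big is hard to turn around.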

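@yangmengyu2021
Contributor

@toint-admin JEP 491 aims to solve the problem of virtual threads being pinned when they use synchronized methods or blocks. In earlier JDK versions, a virtual thread that performs a blocking operation (such as I/O) inside a synchronized method or block is pinned to its carrier thread, so that carrier thread cannot serve other virtual threads, which hurts scalability and concurrent performance.

However, JEP 491's improvement mainly targets the synchronized keyword. In other scenarios, namely blocking I/O in a virtual thread (such as a traditional blocking JDBC driver), the carrier thread may still be blocked even outside a synchronized block. These blocking operations occupy the carrier thread directly, so other virtual threads cannot be scheduled in time, which hurts system performance.

So although JEP 491 in JDK 24 resolves the pinning tied to synchronized, other situations that can block the carrier thread (such as blocking I/O) still call for other measures.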

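@yangmengyu2021
Contributor

There's no free lunch. I once heard a saying: "Virtual threads are light, but the demands they place on your code quality are heavy. When we can't control the whole picture, we shouldn't entrust our fate to the scheduler's goodwill." @toint-admin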

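@toint-admin
Author

@yangmengyu2021 First, thank you for the reply. I just tested this: with the default httpclient, the problem no longer reproduces after upgrading to JDK 24. After switching back to JDK 21, it reproduces every time.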

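@yangmengyu2021
Contributor

That JDK version is too new. We're worried it might introduce other problems, so we don't dare upgrade yet. Thanks for the information, though. It shows that the Java team has at least noticed this kind of problem and started addressing it.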

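@toint-admin
Author

> That JDK version is too new. We're worried it might introduce other problems, so we don't dare upgrade yet. Thanks for the information, though. It shows that the Java team has at least noticed this kind of problem and started addressing it.

@yangmengyu2021 Yes. During testing I found that lombok has not been adapted yet and needs some configuration changes to work.

JDK 24 is a non-LTS release; JDK 25 is an LTS release.

Personally I think it's worth waiting for JDK 25, but upgrading the JDK means waiting a long time to see whether a large number of third-party frameworks support such a new version.

Does your project currently make heavy use of virtual threads? How do you plan to deal with this problem going forward? Will you fall back to traditional threads?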

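@yangmengyu2021
Contributor

In our project we have not enabled virtual threads at the Tomcat level, nor for @schedule scheduled tasks or @async asynchronous methods. Virtual threads have to be invoked explicitly, and any use of them triggers a cross code review. Under these constraints virtual threads are rarely used.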
