Description
Background: this was first observed with a custom-built IOC using our custom record type support, with Phoebus CSS as the client. Later I was able to reproduce the problem entirely in softIocPVA with multiple pvget -m clients.
Problem
With QSRV grouping multiple PVs into an NTTable (complete setup attached) and either a CSS screen displaying all components in an X/Y Plot or multiple pvget -m clients (*), the IOC crashes with:
Core was generated by `../../softIocPVA -D /test/softIoc/dbd/softIocPVA.dbd st.cmd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 epicsMutex::lock (this=this@entry=0x48) at ../osi/epicsMutex.cpp:276
276 ../osi/epicsMutex.cpp: No such file or directory.
[Current thread is 1 (Thread 0x7f3fe75fa700 (LWP 23040))]
(gdb) bt
#0 epicsMutex::lock (this=this@entry=0x48) at ../osi/epicsMutex.cpp:276
#1 0x000055e51b191a62 in epicsGuard<epicsMutex>::epicsGuard (mutexIn=..., this=<synthetic pointer>) at /build/3rdparty/epics/base-7/include/epicsGuard.h:143
#2 PDBGroupMonitor::requestUpdate (this=0x7f3fc800d630) at ../pdbgroup.cpp:472
#3 0x000055e51b18c6d9 in BaseMonitor::release (this=<optimized out>, elem=...) at ../../common/pvahelper.h:338
#4 0x000055e51b26e999 in epics::pvAccess::MonitorElement::Ref::reset (this=0x7f3fe75f9ab0) at ../../src/client/pv/monitor.h:186
#5 epics::pvAccess::ServerMonitorRequesterImpl::send (this=0x7f3fc800d450, buffer=<optimized out>, control=<optimized out>) at ../../src/server/responseHandlers.cpp:2039
#6 0x000055e51b1f1db0 in epics::pvAccess::detail::AbstractCodec::processSender (this=this@entry=0x7f3fd0000b20, sender=std::shared_ptr<epics::pvAccess::TransportSender> (use count 3, weak count 2) = {...}) at ../../src/remote/codec.cpp:884
#7 0x000055e51b1f31ec in epics::pvAccess::detail::AbstractCodec::processSendQueue (this=0x7f3fd0000b20) at ../../src/remote/codec.cpp:844
#8 0x000055e51b1f37e5 in epics::pvAccess::detail::AbstractCodec::processWrite (this=this@entry=0x7f3fd0000b20) at ../../src/remote/codec.cpp:754
#9 0x000055e51b1f6438 in epics::pvAccess::detail::BlockingTCPTransportCodec::sendThread (this=0x7f3fd0000b20) at ../../src/remote/codec.cpp:1157
#10 0x000055e51b3d9899 in epicsThreadCallEntryPoint (pPvt=0x7f3fd0000ca0) at ../osi/epicsThread.cpp:95
#11 0x000055e51b3dea7a in start_routine (arg=0x7f3fd001a9b0) at ../osi/os/posix/osdThread.c:442
#12 0x00007f40310666db in start_thread (arg=0x7f3fe75fa700) at pthread_create.c:463
#13 0x00007f402fdfb61f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Steps to reproduce
- start softIocPVA serving the structured PV,
- start triggering PV processing as fast as possible (e.g. while true; do pvput ....; done) and continue to do so until the IOC crashes,
- start multiple (7+) monitoring clients or open the GUI (see image below or the attached .bob file),
- wait for a minute,
- kill the clients or close the GUI.
Test setup
softIocPVA with PV structure:
$ pvget -v myioc:hist:pva | cut -c 1-50
myioc:hist:pva epics:nt/NTTable:1.0
structure record
structure _options
uint queueSize 0
boolean atomic true
alarm_t alarm
int severity 0
int status 0
string message NO_ALARM
structure timeStamp
long secondsPastEpoch 1737632421
int nanoseconds 721458675
int userTag 0
structure value
uint[] a [1,0,1,1,1,1,0,1,3,0,0,2,0,1,1,0,
uint[] b [0,0,2,1,0,0,1,0,0,0,0,1,0,2,4,0,
uint[] c [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
uint[] d [0,1,0,0,0,2,2,2,0,1,1,0,0,0,1,0,
uint[] e [1,0,0,1,3,0,0,1,0,0,3,1,1,0,0,0,
uint[] f [0,1,2,0,0,1,1,0,0,1,1,0,2,1,1,0,
uint[] g [0,0,2,1,0,0,1,0,0,0,0,1,0,2,4,0,
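For context, an NTTable like the one above is typically assembled in QSRV by putting a Q:group info tag on each member record. The sketch below is illustrative only (record names, FTVL/NELM values, and the choice of which record carries the meta mapping are assumptions, not the attached setup):

```
record(waveform, "myioc:hist:a") {
    field(FTVL, "ULONG")
    field(NELM, "100")
    info(Q:group, {
        "myioc:hist:pva": {
            # map this record's VAL into the table's value.a column
            "value.a": {"+type": "plain", "+channel": "VAL"},
            # one record also supplies the group's alarm/timeStamp meta
            "":        {"+type": "meta",  "+channel": "VAL"},
            "+id": "epics:nt/NTTable:1.0",
            "+atomic": true
        }
    })
}
```

The remaining columns (value.b through value.g) would each come from a similar waveform record carrying only a "value.X" plain mapping into the same group name.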
CSS GUI: (screenshot of the X/Y Plot screen)
(*) I was also able to reproduce this with 7 pvget -m clients, killing them all (kill -9) at the same time after a while.
Initial analysis
By adding additional traces to modules/pva2pva/pdbApp/pdbgroup.cpp I was able to confirm that PDBGroupMonitor::release() is being called after PDBGroupMonitor::destroy() has already run. This happens once the IOC becomes overwhelmed and late calls to PDBGroupMonitor::release() start occurring against an already-destroyed monitor, whose requestUpdate() then locks a mutex inside freed memory (hence the fault at the small offset 0x48 in frame #0).