Stress testing causes PDBGroupMonitor to segfault #64

@akete

Description

Background: this was first observed on a custom-built IOC using our custom record type support, with Phoebus CSS as the client. Later, I was able to reproduce the problem entirely in softIocPVA with multiple pvget -m clients.

Problem

With QSRV grouping multiple PVs into an NTTable (complete setup attached), and either a CSS screen displaying all components in an X/Y Plot or multiple pvget -m clients (*), the IOC crashes with:

Core was generated by `../../softIocPVA -D /test/softIoc/dbd/softIocPVA.dbd st.cmd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  epicsMutex::lock (this=this@entry=0x48) at ../osi/epicsMutex.cpp:276
276	../osi/epicsMutex.cpp: No such file or directory.
[Current thread is 1 (Thread 0x7f3fe75fa700 (LWP 23040))]
(gdb) bt
#0  epicsMutex::lock (this=this@entry=0x48) at ../osi/epicsMutex.cpp:276
#1  0x000055e51b191a62 in epicsGuard<epicsMutex>::epicsGuard (mutexIn=..., this=<synthetic pointer>) at /build/3rdparty/epics/base-7/include/epicsGuard.h:143
#2  PDBGroupMonitor::requestUpdate (this=0x7f3fc800d630) at ../pdbgroup.cpp:472
#3  0x000055e51b18c6d9 in BaseMonitor::release (this=<optimized out>, elem=...) at ../../common/pvahelper.h:338
#4  0x000055e51b26e999 in epics::pvAccess::MonitorElement::Ref::reset (this=0x7f3fe75f9ab0) at ../../src/client/pv/monitor.h:186
#5  epics::pvAccess::ServerMonitorRequesterImpl::send (this=0x7f3fc800d450, buffer=<optimized out>, control=<optimized out>) at ../../src/server/responseHandlers.cpp:2039
#6  0x000055e51b1f1db0 in epics::pvAccess::detail::AbstractCodec::processSender (this=this@entry=0x7f3fd0000b20, sender=std::shared_ptr<epics::pvAccess::TransportSender> (use count 3, weak count 2) = {...}) at ../../src/remote/codec.cpp:884
#7  0x000055e51b1f31ec in epics::pvAccess::detail::AbstractCodec::processSendQueue (this=0x7f3fd0000b20) at ../../src/remote/codec.cpp:844
#8  0x000055e51b1f37e5 in epics::pvAccess::detail::AbstractCodec::processWrite (this=this@entry=0x7f3fd0000b20) at ../../src/remote/codec.cpp:754
#9  0x000055e51b1f6438 in epics::pvAccess::detail::BlockingTCPTransportCodec::sendThread (this=0x7f3fd0000b20) at ../../src/remote/codec.cpp:1157
#10 0x000055e51b3d9899 in epicsThreadCallEntryPoint (pPvt=0x7f3fd0000ca0) at ../osi/epicsThread.cpp:95
#11 0x000055e51b3dea7a in start_routine (arg=0x7f3fd001a9b0) at ../osi/os/posix/osdThread.c:442
#12 0x00007f40310666db in start_thread (arg=0x7f3fe75fa700) at pthread_create.c:463
#13 0x00007f402fdfb61f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Steps to reproduce

  • start softIocPVA serving the structured PV,
  • start triggering PV processing as fast as possible (e.g. while true; do pvput ....; done) and keep it running until the IOC crashes,
  • start multiple (7+) monitoring clients or open the GUI (see image below or the attached .bob file),
  • wait for a minute,
  • kill the clients or close the GUI.
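The client-side part of the steps above can be sketched as a small script. Note this is only an orchestration sketch: `sleep 60` stands in for the real monitoring client command (`pvget -m myioc:hist:pva` in my setup, with the hammering `pvput` loop running in another terminal), so that the start/kill pattern itself can be run anywhere.

```shell
#!/bin/sh
# Placeholder client: in the real reproduction this would be
# "pvget -m myioc:hist:pva" against the running softIocPVA.
CLIENT="sleep 60"

# Step: start multiple (7+) monitoring clients in parallel.
pids=""
for i in 1 2 3 4 5 6 7; do
    $CLIENT &
    pids="$pids $!"
done

# Step: let them run for a while (shortened here), then kill them
# all at the same moment -- the action that precedes the crash.
sleep 1
kill -9 $pids
wait

# Count any clients still alive (should be none after wait).
left=0
for p in $pids; do
    kill -0 "$p" 2>/dev/null && left=$((left + 1))
done
echo "clients left: $left"
```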

Test setup

NTTable_example.zip

softIocPVA with PV structure:

$ pvget -v  myioc:hist:pva | cut -c 1-50
myioc:hist:pva epics:nt/NTTable:1.0 
    structure record
        structure _options
            uint queueSize 0
            boolean atomic true
    alarm_t alarm 
        int severity 0
        int status 0
        string message NO_ALARM
    structure timeStamp
        long secondsPastEpoch 1737632421
        int nanoseconds 721458675
        int userTag 0
    structure value
        uint[] a [1,0,1,1,1,1,0,1,3,0,0,2,0,1,1,0,
        uint[] b [0,0,2,1,0,0,1,0,0,0,0,1,0,2,4,0,
        uint[] c [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        uint[] d [0,1,0,0,0,2,2,2,0,1,1,0,0,0,1,0,
        uint[] e [1,0,0,1,3,0,0,1,0,0,3,1,1,0,0,0,
        uint[] f [0,1,2,0,0,1,1,0,0,1,1,0,2,1,1,0,
        uint[] g [0,0,2,1,0,0,1,0,0,0,0,1,0,2,4,0,

CSS GUI: (screenshot attached; X/Y Plot displaying all components)

(*) I was able to reproduce this using 7 pvget -m clients and killing them (kill -9) at the same time after a while.

Initial analysis

By adding extra traces to modules/pva2pva/pdbApp/pdbgroup.cpp I was able to confirm that PDBGroupMonitor::release() is being called after PDBGroupMonitor::destroy() has already run. The faulting address in epicsMutex::lock (this=0x48) looks like a member offset applied to a null base pointer, i.e. the mutex is reached through an object that no longer exists. This happens once the IOC becomes overwhelmed and late calls to PDBGroupMonitor::release() start occurring.
