
Suricata dies every 16-18h due to "failed to reclaim memory" (in 16GB RAM machine) #8358

Open
githstl opened this issue Feb 19, 2025 · 1 comment
githstl commented Feb 19, 2025


Describe the bug

The Suricata process dies every 16-18h due to "failed to reclaim memory". The process grows to about 11-12GB of memory use, then the OOM killer does its job.
Meanwhile, the "Laundry" memory (visible in the top command stats) grows to about 1.5GB.

The last OPNsense version where the bug did not exist: unknown, because this is a fresh install on 25.x; I am not able to test 24.x or older.

It seems to be a known bug, details here:
https://redmine.openinfosecfoundation.org/issues/6963

To Reproduce

Steps to reproduce the behavior:

  1. Subscribe to ET Pro Telemetry
  2. Enable the ET Pro Telemetry rules
  3. Enable IDS (in IDS or IPS mode, it doesn't matter)
  4. Set the rule update interval to 3-4 hours
  5. Apply two policies: first, change Alert to Drop for the emerging-exploit, ftp, malware and phishing rules; second, change Alert to Drop for activities such as attempt-admin, dos, recon etc. with Critical, Major and Minor signature_severity
  6. Wait (or simulate applying updated rules from a shell: pkill -USR2 suricata)

Once Suricata has started (rule loading and conversion complete), the process uses about 4.6GB of memory.
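
A rough way to observe the growth is to trigger a rule reload by hand and compare the process's resident set size before and after. The commands below are only a sketch for FreeBSD/OPNsense; the pkill -USR2 signal is the same one mentioned in step 6 above.

    # Resident memory of the Suricata process before a reload
    ps -o rss,vsz,command -p $(pgrep -n suricata)

    # Ask Suricata to reload its rules (same signal as in step 6)
    pkill -USR2 suricata

    # After the reload finishes, check again and compare the RSS values
    ps -o rss,vsz,command -p $(pgrep -n suricata)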

Expected behavior

Suricata should stay alive and not consume all available memory.

Describe alternatives you considered

I tried workarounds like the following, which did not help:

    smb:
      enabled: yes
      detection-ports:
        dp: 139, 445

      # Stream reassembly size for SMB streams. By default it is tracked
      # completely; limited here to avoid memory exhaustion.
      stream-depth: 32mb

It doesn't matter if Suricata runs with emulated netmap (dev.netmap.admode=2) or not.
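
For reference, the stream memory caps in suricata.yaml are the other obvious knobs; the sketch below only shows where they live, with purely illustrative values, not settings I have confirmed to help:

    # Illustrative values only, not a confirmed fix
    stream:
      memcap: 256mb
      reassembly:
        memcap: 512mb
        depth: 1mb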

Screenshots

Memory usage by Suricata just before the OOM killer stepped in (screenshot).

Relevant log files

<13>1 2025-02-19T18:13:05+01:00 OPNs1.xyx.com rule-updater.py 77453 - [meta sequenceId="53"] download skipped emerging-web_server.rules, same version
<13>1 2025-02-19T18:13:05+01:00 OPNs1.xyx.com rule-updater.py 77453 - [meta sequenceId="54"] download skipped emerging-web_specific_apps.rules, same version
<13>1 2025-02-19T18:13:05+01:00 OPNs1.xyx.com rule-updater.py 77453 - [meta sequenceId="55"] download skipped emerging-worm.rules, same version
<13>1 2025-02-19T18:51:03+01:00 OPNs1.xyx.com kernel - - [meta sequenceId="1"] <3>pid 70333 (suricata), jid 0, uid 0, was killed: failed to reclaim memory
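
The kernel line above is the only trace of the kill. To check for earlier occurrences, something like this works on FreeBSD/OPNsense (assuming the kernel log ends up in the default /var/log/messages):

    # Previous OOM kills recorded in the kernel log
    grep -i "failed to reclaim memory" /var/log/messages
    # Or in the live kernel message buffer
    dmesg | grep -i "failed to reclaim memory"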

Additional context

The rule update task runs every 3 hours, and after each run the Suricata process memory usage grows (it never shrinks).

For now I have to run in IDS mode, because when Suricata dies in IPS mode it disrupts traffic heavily.
A cron task restarts Suricata every 6 hours (sketched below).
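
A minimal sketch of that cron entry follows; on OPNsense cron jobs are normally defined in the GUI and the service is driven through configctl, so the "ids restart" action name here is an assumption, not verified:

    # Sketch only: restart Suricata every 6 hours
    # (the "ids restart" configctl action name is an assumption)
    0 */6 * * * root /usr/local/sbin/configctl ids restart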

The OPNsense router sees traffic from about 120 PCs, roughly 40-50 Mbps.

Environment

Software version and hardware:

OPNsense 25.1.1-amd64, 16GB RAM.
Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
I350 Gigabit Fiber Network Connection
I211 Gigabit Network Connection

githstl commented Feb 25, 2025

More findings:

  • the default OPNsense setup configures logging to use 50% of memory as a ramdisk (/var/log) for log storage (in my case about 8GB of RAM)
  • the last-match default deny and default pass rules are configured (by default) to log (/var/log/filter/filter*.log)
  • so after 16-20 hours of operation, the /var/log ramdisk is nearly full
  • then the scheduled Suricata task fires, and new rules are downloaded and applied
  • given that (in my case) Suricata uses about 6.5-7GB of RAM, it uses about 10-11GB while the new rules are being applied
  • 8GB of ramdisk + 10-11GB Suricata process > 16GB of RAM = a job for the OOM killer

PS: I have reconfigured the /var/log ramdisk to 3.5GB with 4-day log rotation, and remote logging is configured to keep logs for a longer time. I will watch what happens to Suricata then.
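
With that split, roughly 3.5GB of ramdisk plus the 10-11GB Suricata peak comes to about 14.5GB, which should just fit into 16GB. A quick way to keep an eye on both numbers (sketch):

    # Current usage of the /var/log ramdisk
    df -h /var/log
    # Resident memory of the Suricata process (RSS in kilobytes)
    ps -o rss,command -p $(pgrep -n suricata)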
