Merged
Changes from 18 commits
Commits
35 commits
424d3cd
added conda environment, adapted code to pytorch 2.6 and added Singul…
lorenzo9uerra Jul 1, 2025
96d3000
made load_dumps.sh idempotent
lorenzo9uerra Jul 1, 2025
bf55281
changed default artifact dir and postgres host to localhost (for clus…
lorenzo9uerra Jul 1, 2025
8481ed8
fixed error in config vals format
lorenzo9uerra Jul 1, 2025
e25789f
minor formatting changes
lorenzo9uerra Jul 1, 2025
e6e07ff
throw errors instead of continuing without any warning
lorenzo9uerra Jul 23, 2025
fd11e5a
fixed 'stats["percent_detected_attacks"]' calculation
lorenzo9uerra Jul 23, 2025
4a88a32
window size in ns not in sec
lorenzo9uerra Jul 23, 2025
5f9d749
updated detected_attacks when new attack is detected
lorenzo9uerra Jul 30, 2025
101e0ca
Merge branch 'main' of https://github.com/ubc-provenance/PIDSMaker in…
lorenzo9uerra Jul 30, 2025
a89ad7b
use specific command line arguments to configure the database
lorenzo9uerra Aug 4, 2025
0ad5a09
addition of scripts to setup a Singularity postgres container for clu…
lorenzo9uerra Aug 4, 2025
6c211fb
Removal of separate calls to set_seed for each task. The first call s…
lorenzo9uerra Aug 4, 2025
f478cc0
Minor fixes
lorenzo9uerra Aug 4, 2025
8ee45ed
added README for running using Singularity
lorenzo9uerra Aug 4, 2025
326eb1f
added comment to clarify that all code for "few_shot" is currently un…
lorenzo9uerra Aug 18, 2025
5e162b1
Removed useless code: edge_list is always None, so it's always redefi…
lorenzo9uerra Aug 19, 2025
135fdef
removed unused variable
lorenzo9uerra Aug 19, 2025
60516c5
revert: set PYTHONHASHSEED before launching Python interpreter again
lorenzo9uerra Aug 21, 2025
17f00d9
fix seed argument type
lorenzo9uerra Aug 27, 2025
7fbdea5
remove deprecated punkt in favor of punkt_tab and keep the tokenizer …
lorenzo9uerra Aug 27, 2025
1c7a393
code formatting with ruff
lorenzo9uerra Aug 27, 2025
66680ed
restore previous formatting for config.py
lorenzo9uerra Aug 27, 2025
fe7ff66
Merge branch 'ubc-provenance:main' into main
lorenzo9uerra Sep 30, 2025
0a48403
Revert "Removal of separate calls to set_seed for each task.
TristanBilot Oct 23, 2025
9f7d8a6
revert ValueError or functional tests fail
TristanBilot Oct 23, 2025
7c825fe
minor revert to still support logging args when using sweeps
TristanBilot Oct 23, 2025
b42bc60
fix tests
TristanBilot Oct 23, 2025
d806a0a
revert artifact_dir to /home/artifacts to map to Docker volume => ${A…
TristanBilot Oct 23, 2025
156ac9c
hotfix in the Docker install (not related directly to this PR but nee…
TristanBilot Oct 23, 2025
b6ee3f0
fix tests finally
TristanBilot Oct 23, 2025
3f685ac
Simplify environment.yaml and make it perfectly match the docker envi…
lorenzo9uerra Oct 23, 2025
0ef0f80
Add support for Apptainer (new name for Singularity) as well
lorenzo9uerra Oct 23, 2025
4d52f2e
Important fix: Velox’s LinearEncoder now correctly receives a tuple o…
lorenzo9uerra Oct 23, 2025
b68ef86
add missing desc
TristanBilot Oct 24, 2025
11 changes: 11 additions & 0 deletions .gitignore
@@ -169,3 +169,14 @@ nohup.out
artifacts/
data/
docs/site/
postgres_data/
postgres_lock/

# Singularity files
*.sif
*.def

# Postgres directories
postgres_config/
postgres_run/
postgres_log/
2 changes: 1 addition & 1 deletion compose-postgres.yml
@@ -12,7 +12,7 @@ services:
volumes:
- postgres_data:/var/lib/postgresql/data
- ./postgres/init-create-empty-databases.sh:/docker-entrypoint-initdb.d/init-create-empty-databases.sh
-- ./settings/scripts:/scripts
+- ./scripts:/scripts
- ${INPUT_DIR:-/data}:/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
2 changes: 0 additions & 2 deletions config/flash.yml
@@ -20,7 +20,6 @@ featurization:
epochs: 10 # 300
training_split: train
used_method: flash
use_seed: True
flash:
min_count: 1
workers: 15
@@ -41,7 +40,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
1 change: 0 additions & 1 deletion config/kairos.yml
@@ -45,7 +45,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
1 change: 0 additions & 1 deletion config/magic.yml
@@ -34,7 +34,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
2 changes: 0 additions & 2 deletions config/nodlink.yml
@@ -18,7 +18,6 @@ featurization:
feat_training:
epochs: 100
emb_dim: 256
use_seed: True
training_split: train
used_method: fasttext
fasttext:
@@ -45,7 +44,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
2 changes: 0 additions & 2 deletions config/orthrus.yml
@@ -21,7 +21,6 @@ featurization:
emb_dim: 128
epochs: 50
training_split: all
use_seed: True
used_method: word2vec
word2vec:
alpha: 0.025
@@ -58,7 +57,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
2 changes: 0 additions & 2 deletions config/rcaid.yml
@@ -20,7 +20,6 @@ featurization:
feat_training:
epochs: 5
emb_dim: 128
use_seed: True
training_split: all
used_method: doc2vec
doc2vec:
@@ -43,7 +42,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
1 change: 0 additions & 1 deletion config/threatrace.yml
@@ -34,7 +34,6 @@ detection:
used_method: none
gnn_training:
used_method: default
use_seed: True
deterministic: False
num_epochs: 12
patience: 3
1 change: 0 additions & 1 deletion config/tuned_components/tuned_alacarte.yml
@@ -2,7 +2,6 @@ featurization:
feat_training:
epochs: 10
emb_dim: 128
use_seed: True
training_split: train
used_method: alacarte
alacarte:
1 change: 0 additions & 1 deletion config/tuned_components/tuned_doc2vec.yml
@@ -2,7 +2,6 @@ featurization:
feat_training:
epochs: 10
emb_dim: 128
use_seed: True
training_split: train
used_method: doc2vec
doc2vec:
1 change: 0 additions & 1 deletion config/tuned_components/tuned_fasttext.yml
@@ -2,7 +2,6 @@ featurization:
feat_training:
epochs: 10
emb_dim: 256
use_seed: True
training_split: train
used_method: fasttext
fasttext:
1 change: 0 additions & 1 deletion config/tuned_components/tuned_flash.yml
@@ -4,7 +4,6 @@ featurization:
epochs: 10
training_split: train
used_method: flash
use_seed: True
flash:
min_count: 1
workers: 10
1 change: 0 additions & 1 deletion config/tuned_components/tuned_word2vec.yml
@@ -3,7 +3,6 @@ featurization:
epochs: 10
emb_dim: 128
training_split: train
use_seed: True
used_method: word2vec
word2vec:
alpha: 0.025
10 changes: 7 additions & 3 deletions docs/docs/create-db-from-scratch.md
@@ -7,7 +7,7 @@ You can download all required files directly by running:
pip install gdown
```
```shell
-./settings/scripts/download_{dataset}.sh {data_folder}
+./scripts/download_{dataset}.sh {data_folder}
```
where `{dataset}` can be either `clearscope_e3`, `cadets_e3`, `theia_e3`, `clearscope_e5`, `cadets_e5` or `theia_e5` and `{data_folder}` is the absolute path to the output folder where all raw files will be downloaded.

@@ -26,14 +26,18 @@ sudo docker compose exec pids bash

4. Convert the DARPA files
```shell
-./settings/scripts/uncompress_darpa_files.sh /data/
+./scripts/uncompress_darpa_files.sh /data/
```

> [!NOTE]
> This may take multiple hours depending on the dataset.

### Optional configurations
-- optionally, if using a specific postgres database instead of the postgres docker, update the connection config by setting `DATABASE_DEFAULT_CONFIG` within `pidsmaker/config.py`.
+- optionally, if using a specific postgres database instead of the postgres docker, pass the details as command line arguments to the python scripts
+- `--database_host`: the host machine where the database is located (default: `postgres`)
+- `--database_user`: the database user to connect to the database (default: `postgres`)
+- `--database_password`: the password for the database user (default: `postgres`)
+- `--database_port`: the port number for Postgres (default: `5432`)

- optionally, if you want to change the output folder where generated files are stored, update accordingly the volume by uncommenting `./artifacts:/home/artifacts` in `compose.yml`.
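
The four `--database_*` flags listed above could be declared roughly as follows. This is a minimal sketch using Python's standard `argparse`, not PIDSMaker's actual argument handling: only the flag names and their defaults come from the documentation, while the `build_db_parser` helper and the integer type of the port are assumptions made for illustration.

```python
import argparse


def build_db_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: declares the four documented database flags."""
    parser = argparse.ArgumentParser(description="Database connection options (sketch)")
    # Flag names and defaults mirror the documentation; everything else is assumed.
    parser.add_argument("--database_host", default="postgres",
                        help="host machine where the database is located")
    parser.add_argument("--database_user", default="postgres",
                        help="database user to connect with")
    parser.add_argument("--database_password", default="postgres",
                        help="password for the database user")
    # Parsing the port as an int is an assumption of this sketch.
    parser.add_argument("--database_port", type=int, default=5432,
                        help="port number for Postgres")
    return parser


# With no flags given, the documented defaults apply.
defaults = build_db_parser().parse_args([])

# Overriding host and port, e.g. for a Postgres instance on the local machine.
custom = build_db_parser().parse_args(
    ["--database_host", "localhost", "--database_port", "5433"]
)
```

Flags that are not passed keep their defaults, so `custom.database_user` is still `postgres` in the second call.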

Empty file.
39 changes: 39 additions & 0 deletions docs/docs/singularity_install.md
@@ -0,0 +1,39 @@
# Install Framework using Singularity

For quick installation on environments where Docker is not available (such as HPC clusters), you can use Singularity. This guide assumes Singularity is already installed on your system.

## Setup Process

### 1. Database Setup
The Makefile in `./scripts/Makefile` provides easy environment setup:

```bash
make full-setup
```

This command will:
- Download and run a PostgreSQL container through Singularity
- Load database dumps by executing the `load_dumps.sh` script

### 2. Container Management
Once the database is ready:
- Stop the container: `make down`
- Start it again: `make up`

### 3. Dependencies Installation
Install all required dependencies using conda:

```bash
conda env create -f ./scripts/environment.yml
conda activate pids
```

## Running the Framework

Once both the database and conda environment are ready, run the framework with:

```bash
python pidsmaker/main.py SYSTEM DATASET
```
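
When the database does not run under the default `postgres` host name (for example, a Postgres container started locally through Singularity), the run command can presumably be combined with the `--database_*` flags described in `create-db-from-scratch.md`. The sketch below only assembles such a command line. `build_run_command` is a hypothetical helper, and both the placement of the flags after the positional arguments and the `localhost` value are assumptions to verify against `pidsmaker/main.py`.

```python
import shlex


def build_run_command(system: str, dataset: str, **db_options: object) -> list[str]:
    """Hypothetical helper: assemble the argv for `python pidsmaker/main.py`."""
    command = ["python", "pidsmaker/main.py", system, dataset]
    for name, value in db_options.items():
        # Each keyword becomes a `--database_<name> <value>` pair.
        command += [f"--database_{name}", str(value)]
    return command


# SYSTEM and DATASET are the placeholders used in the documentation.
command = build_run_command("SYSTEM", "DATASET", host="localhost", port=5432)

# Shell-quoted form, convenient for logging or copy-pasting.
printable = shlex.join(command)
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the repository root once the conda environment is active.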

For more details, see the [introduction](introduction.md).
1 change: 0 additions & 1 deletion docs/scripts/args/args_detection.md
@@ -45,7 +45,6 @@
</li>
<li class='bullet'><span class="key">gnn_training</span>
<ul>
<li class='no-bullet'><span class="key-leaf">use_seed</span>: <span class="value">bool</span></li>
<li class='no-bullet'><span class="key-leaf">deterministic</span>: <span class="value">bool (19)</span></li>
<li class='no-bullet'><span class="key-leaf">num_epochs</span>: <span class="value">int</span></li>
<li class='no-bullet'><span class="key-leaf">patience</span>: <span class="value">int</span></li>
1 change: 0 additions & 1 deletion docs/scripts/args/args_featurization.md
@@ -5,7 +5,6 @@
<ul>
<li class='no-bullet'><span class="key-leaf">emb_dim</span>: <span class="value">int (1)</span></li>
<li class='no-bullet'><span class="key-leaf">epochs</span>: <span class="value">int (2)</span></li>
<li class='no-bullet'><span class="key-leaf">use_seed</span>: <span class="value">bool</span></li>
<li class='no-bullet'><span class="key-leaf">training_split</span>: <span class="value">str (3)</span></li>
<li class='no-bullet'><span class="key-leaf">multi_dataset_training</span>: <span class="value">bool (4)</span></li>
<li class='no-bullet'><span class="key-leaf">used_method</span>: <span class="value">str (5)</span></li>