Commit e13e345

Add support for a CPU-only Mode (nv-morpheus#1851)
* Adds a new enum `morpheus.config.ExecutionMode` with members `GPU` & `CPU`, along with a new `morpheus.config.Config.execution_mode` attribute.
* For backwards compatibility, `Config.execution_mode` defaults to `GPU`.
* Add new `supported_execution_modes` to `StageBase` which returns `ExecutionMode.GPU` by default. This ensures that building a pipeline with a stage not matching the execution mode will raise a reasonable error to the user.
* Add `CpuOnlyMixin` and `GpuAndCpuMixin` mixins to automate overriding this, making it easier for users to determine at a glance which execution modes a given stage supports.
* Since C++ stage/message impls can only support cuDF DataFrames and RMM tensors, this PR re-purposes the existing Python stage/message impls to serve as the CPU-only mode.
* CPU-only mode centers around pandas DataFrames and NumPy arrays for tensors, since the current Python code which expects cuDF/CuPy is already 99% compatible with pandas/NumPy.
* Avoid importing `cudf` or any other GPU-based package (which will fail on import) at the top level of a module. This is important for stages, messages and modules which are automatically imported by the morpheus CLI tool.
* Add new utility methods to `morpheus.utils.type_utils` (ex: `get_df_pkg`, `is_cudf_type`) to help avoid importing cudf directly.
* Add a new `Config.freeze` method which makes a config object immutable. This is called the first time a config object is used to construct a pipeline or stage object, preventing config parameters from being changed in the middle of pipeline construction.
* `CudfHelper::load` is no longer called automatically on import; instead it is called manually on pipeline build when the execution mode is GPU.
* Add a Python implementation of `ControlMessage`.
* To simulate a system without a GPU when testing CPU-only mode, if the `CPU_ONLY` environment variable is defined, `docker/run_container_dev.sh` will launch the container using the `runc` runtime.
* Remove automatic test parameterization over C++/Python mode, since supporting CPU-only mode will become the exception not the rule. Add a new `gpu_and_cpu_mode` test marker to explicitly indicate a test intended to be parameterized over execution modes.
* Fix the copy constructor for `ControlMessage`.
* `AppShieldSourceStage` now emits `ControlMessage`s; `AppShieldMessageMeta` is now deprecated.
* `AutoencoderSourceStage`, and thus `AzureSourceStage`, `CloudTrailSourceStage`, and `DuoSourceStage`, now emit `ControlMessage`; `UserMessageMeta` is now deprecated.
* DFP production pipeline updated to remove `DFPMessageMeta`; the pipeline now executes in C++ mode.
* Consolidate common logic in `docker/run_container_dev.sh` & `docker/run_container_release.sh` into `docker/run_container.sh`.
* Remove inconsistent behavior in the Python impl of `TensorMemory.set_tensor` (nv-morpheus#1955)

Closes nv-morpheus#1646
Closes nv-morpheus#1846
Closes nv-morpheus#1852
Closes nv-morpheus#1955

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
- David Gardner (https://github.com/dagardner-nv)
- Yuchen Zhang (https://github.com/yczhang-nv)

Approvers:
- Michael Demoret (https://github.com/mdemoret-nv)

URL: nv-morpheus#1851
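The lazy-import approach behind `get_df_pkg` can be sketched as follows. This is an illustrative mock, not the actual `morpheus.utils.type_utils` implementation; only the names `ExecutionMode`, `GPU`, `CPU`, and `get_df_pkg` come from the PR description above:

```python
from enum import Enum
from types import ModuleType


class ExecutionMode(Enum):
    GPU = "GPU"
    CPU = "CPU"


def get_df_pkg(mode: ExecutionMode) -> ModuleType:
    """Return the DataFrame package for the given execution mode.

    cudf is imported lazily, inside the function, so that CPU-only
    systems never attempt (and fail) the import at module load time.
    """
    if mode == ExecutionMode.GPU:
        import cudf  # deferred: only runs when a GPU pipeline is built
        return cudf

    import pandas
    return pandas


pkg = get_df_pkg(ExecutionMode.CPU)
print(pkg.__name__)  # pandas
```

On a CPU-only host, `get_df_pkg(ExecutionMode.CPU)` returns `pandas` without ever touching `cudf`, which is the property the CLI auto-import requirement above depends on.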
1 parent 5692639 commit e13e345

File tree: 266 files changed, +3257 -3145 lines changed


ci/vale/styles/config/vocabularies/morpheus/accept.txt (+5)

```diff
@@ -18,6 +18,9 @@ CMake
 Conda
 CPython
 [Cc]ryptocurrenc[y|ies]
+cuDF
+cuML
+CuPy
 [Cc]yber
 [Cc]ybersecurity
 Cython
@@ -51,7 +54,9 @@ NeMo
 nginx
 NIC
 NIM(s?)
+NumPy
 NVIDIA
+pandas
 [Pp]arallelization
 [Pp]arsable
 PCIe
```
docker/run_container.sh (new file, +57)

```diff
@@ -0,0 +1,57 @@
+#!/bin/bash
+# SPDX-FileCopyrightText: Copyright (c) 2021-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Color variables
+b="\033[0;36m"
+g="\033[0;32m"
+r="\033[0;31m"
+e="\033[0;90m"
+y="\033[0;33m"
+x="\033[0m"
+
+_UNDEF_VAR_ERROR_MSG="Use the dev/release scripts to set these automatically"
+
+DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME:?"Must set \$DOCKER_IMAGE_NAME. ${_UNDEF_VAR_ERROR_MSG}"}
+DOCKER_IMAGE_TAG=${DOCKER_IMAGE_TAG:?"Must set \$DOCKER_IMAGE_TAG. ${_UNDEF_VAR_ERROR_MSG}"}
+
+# DOCKER_ARGS are set by the dev/release scripts
+# DOCKER_EXTRA_ARGS are optionally set by the user
+DOCKER_ARGS=${DOCKER_ARGS:-""}
+DOCKER_ARGS="${DOCKER_ARGS} --net=host --cap-add=sys_nice ${DOCKER_EXTRA_ARGS}"
+DOCKER_EXTRA_ARGS=${DOCKER_EXTRA_ARGS:-""}
+
+if [[ -n "${CPU_ONLY}" ]]; then
+    echo -e "${b}Executing in CPU only mode${x}"
+    DOCKER_ARGS="${DOCKER_ARGS} --runtime=runc"
+else
+    echo -e "${b}Executing in GPU mode${x}"
+    DOCKER_ARGS="${DOCKER_ARGS} --runtime=nvidia --gpus=all"
+fi
+
+if [[ -n "${SSH_AUTH_SOCK}" ]]; then
+    echo -e "${b}Setting up ssh-agent auth socket${x}"
+    DOCKER_ARGS="${DOCKER_ARGS} -v $(readlink -f $SSH_AUTH_SOCK):/ssh-agent:ro -e SSH_AUTH_SOCK=/ssh-agent"
+fi
+
+echo -e "${g}Launching ${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_TAG}...${x}"
+
+# Enable command logging to show what is being executed
+set -x
+docker run ${DOCA_EXTRA_ARGS} --rm -ti ${DOCKER_ARGS} ${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_TAG} "${@:-bash}"
+
+{ EXIT_CODE=$?; set +x; } 2>/dev/null
+
+exit $EXIT_CODE
```

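The runtime-selection branch in `run_container.sh` above can be isolated into a small helper for illustration. The function name `select_runtime` is hypothetical, not part of the script; a non-empty first argument stands in for a defined `CPU_ONLY` environment variable:

```shell
#!/bin/bash
# select_runtime: emit the docker runtime flags for the given mode,
# mirroring the CPU_ONLY branch of run_container.sh.
select_runtime() {
    if [[ -n "$1" ]]; then
        # CPU_ONLY defined: default runc runtime, no GPUs requested
        echo "--runtime=runc"
    else
        # CPU_ONLY undefined: NVIDIA runtime with all GPUs
        echo "--runtime=nvidia --gpus=all"
    fi
}

select_runtime "1"   # prints --runtime=runc
select_runtime ""    # prints --runtime=nvidia --gpus=all
```

Launching with `--runtime=runc` on a GPU-capable host is what lets the dev container simulate a system without a GPU.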
docker/run_container_dev.sh (+6 -32)

```diff
@@ -14,38 +14,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-# set -x
+SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
 
-# Color variables
-b="\033[0;36m"
-g="\033[0;32m"
-r="\033[0;31m"
-e="\033[0;90m"
-y="\033[0;33m"
-x="\033[0m"
+export DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME:-"morpheus"}
+export DOCKER_IMAGE_TAG=${DOCKER_IMAGE_TAG:-"dev-$(date +'%y%m%d')"}
 
-DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME:-"morpheus"}
-DOCKER_IMAGE_TAG=${DOCKER_IMAGE_TAG:-"dev-$(date +'%y%m%d')"}
-DOCKER_EXTRA_ARGS=${DOCKER_EXTRA_ARGS:-""}
+export DOCKER_ARGS="-v $PWD:/workspace -v /dev/hugepages:/dev/hugepages --privileged"
 
-DOCKER_ARGS="--runtime=nvidia --env WORKSPACE_VOLUME=${PWD} -v $PWD:/workspace --net=host --gpus=all --cap-add=sys_nice"
-
-if [[ -n "${SSH_AUTH_SOCK}" ]]; then
-    echo -e "${b}Setting up ssh-agent auth socket${x}"
-    DOCKER_ARGS="${DOCKER_ARGS} -v $(readlink -f $SSH_AUTH_SOCK):/ssh-agent:ro -e SSH_AUTH_SOCK=/ssh-agent"
-fi
-
-echo -e "${g}Launching ${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_TAG}...${x}"
-
-set -x
-docker run \
-    -v /dev/hugepages:/dev/hugepages \
-    --privileged \
-    --rm \
-    -ti \
-    ${DOCKER_ARGS} ${DOCKER_EXTRA_ARGS} \
-    ${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_TAG} "${@:-bash}"
-
-{ EXIT_CODE=$?; set +x; } 2>/dev/null
-
-exit $EXIT_CODE
+# Call the general run script
+${SCRIPT_DIR}/run_container.sh
```

docker/run_container_release.sh (+7 -32)

```diff
@@ -16,48 +16,23 @@
 
 SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
 
-# Color variables
-b="\033[0;36m"
-g="\033[0;32m"
-r="\033[0;31m"
-e="\033[0;90m"
-y="\033[0;33m"
-x="\033[0m"
-
 # Change to the script file to ensure we are in the correct repo (in case were in a submodule)
 pushd ${SCRIPT_DIR} &> /dev/null
 
 MORPHEUS_SUPPORT_DOCA=${MORPHEUS_SUPPORT_DOCA:-OFF}
-MORPHEUS_BUILD_MORPHEUS_LLM=${MORPHEUS_BUILD_MORPHEUS_LLM:-ON}
-MORPHEUS_BUILD_MORPHEUS_DFP=${MORPHEUS_BUILD_MORPHEUS_DFP:-ON}
-
-DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME:-"nvcr.io/nvidia/morpheus/morpheus"}
-DOCKER_IMAGE_TAG=${DOCKER_IMAGE_TAG:-"$(git describe --tags --abbrev=0)-runtime"}
 
-# This variable is used for passing extra arguments to the docker run command. Do not use DOCKER_ARGS for this purpose.
-DOCKER_EXTRA_ARGS=${DOCKER_EXTRA_ARGS:-""}
+export DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME:-"nvcr.io/nvidia/morpheus/morpheus"}
+export DOCKER_IMAGE_TAG=${DOCKER_IMAGE_TAG:-"$(git describe --tags --abbrev=0)-runtime"}
 
 popd &> /dev/null
 
-DOCKER_ARGS="--runtime=nvidia --env WORKSPACE_VOLUME=${PWD} --net=host --gpus=all --cap-add=sys_nice ${DOCKER_EXTRA_ARGS}"
-
-if [[ -n "${SSH_AUTH_SOCK}" ]]; then
-    echo -e "${b}Setting up ssh-agent auth socket${x}"
-    DOCKER_ARGS="${DOCKER_ARGS} -v $(readlink -f $SSH_AUTH_SOCK):/ssh-agent:ro -e SSH_AUTH_SOCK=/ssh-agent"
-fi
-
-# DPDK requires hugepage and privileged container
-DOCA_EXTRA_ARGS=""
+# DPDK (and thus DOCA) requires hugepage and privileged container
+export DOCKER_ARGS=""
 if [[ ${MORPHEUS_SUPPORT_DOCA} == @(TRUE|ON) ]]; then
-    echo -e "${b}Enabling DOCA Support. Mounting /dev/hugepages and running in privileged mode${x}"
+    echo -e "Enabling DOCA Support. Mounting /dev/hugepages and running in privileged mode"
 
     DOCKER_ARGS="${DOCKER_ARGS} -v /dev/hugepages:/dev/hugepages --privileged"
 fi
 
-
-echo -e "${g}Launching ${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_TAG}...${x}"
-
-# Enable command logging to show what is being executed
-set -x
-docker run ${DOCA_EXTRA_ARGS} --rm -ti ${DOCKER_ARGS} ${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_TAG} "${@:-bash}"
-set +x
+# Call the general run script
+${SCRIPT_DIR}/run_container.sh
```

docs/source/conf.py (+1)

```diff
@@ -199,6 +199,7 @@
     r'^http://$',
     r'^https://$',
     r'https://(platform\.)?openai.com',
+    r'https://code.visualstudio.com'
 ]
 
 # Add any paths that contain templates here, relative to this directory.
```

docs/source/developer_guide/guides/2_real_world_phishing.md (+3 -2)

````diff
@@ -980,7 +980,7 @@ The code for our sink will be similar to other stages with a few changes. First,
 
 ```python
 @register_stage("to-rabbitmq")
-class WriteToRabbitMQStage(PassThruTypeMixin, SinglePortStage):
+class WriteToRabbitMQStage(PassThruTypeMixin, GpuAndCpuMixin, SinglePortStage):
 ```
 
 Our sink will function as a pass-through allowing the possibility of other sinks to be added to the pipeline. We could, hypothetically, have a pipeline where we emit the results to both RabbitMQ and a file. For this reason we will also be using the `PassThruTypeMixin`.
@@ -1032,14 +1032,15 @@ import pika
 from morpheus.cli.register_stage import register_stage
 from morpheus.config import Config
 from morpheus.messages.message_meta import MessageMeta
+from morpheus.pipeline.execution_mode_mixins import GpuAndCpuMixin
 from morpheus.pipeline.pass_thru_type_mixin import PassThruTypeMixin
 from morpheus.pipeline.single_port_stage import SinglePortStage
 
 logger = logging.getLogger(__name__)
 
 
 @register_stage("to-rabbitmq")
-class WriteToRabbitMQStage(PassThruTypeMixin, SinglePortStage):
+class WriteToRabbitMQStage(PassThruTypeMixin, GpuAndCpuMixin, SinglePortStage):
     """
     Source stage used to load messages from a RabbitMQ queue.
````

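The `GpuAndCpuMixin` added to `WriteToRabbitMQStage` above advertises which execution modes a stage supports. A minimal sketch of how such mixins could be structured follows; the real classes live in `morpheus.pipeline.execution_mode_mixins`, and the method bodies here are assumptions for illustration:

```python
from enum import Enum


class ExecutionMode(Enum):
    GPU = "GPU"
    CPU = "CPU"


class GpuAndCpuMixin:
    """Marks a stage as runnable in either execution mode."""

    def supported_execution_modes(self) -> tuple:
        return (ExecutionMode.GPU, ExecutionMode.CPU)


class CpuOnlyMixin:
    """Marks a stage as runnable only in CPU mode."""

    def supported_execution_modes(self) -> tuple:
        return (ExecutionMode.CPU,)
```

Declaring supported modes on the class lets pipeline construction reject a mismatched stage early with a clear error, rather than failing deep inside a cudf import.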
docs/source/developer_guide/guides/6_digital_fingerprinting_reference.md (+5 -5)

```diff
@@ -88,15 +88,15 @@ Defines a single column and type-cast.
 | Argument | Type | Description |
 | -------- | ---- | ----------- |
 | `name` | `str` | Name of the column |
-| `dtype` | `str` or Python type | Any type string or Python class recognized by [Pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
+| `dtype` | `str` or Python type | Any type string or Python class recognized by [pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
 
 #### Custom Column (`CustomColumn`)
 Subclass of `ColumnInfo`, defines a column to be computed by a user-defined function `process_column_fn`.
 
 | Argument | Type | Description |
 | -------- | ---- | ----------- |
 | `name` | `str` | Name of the column |
-| `dtype` | `str` or Python type | Any type string or Python class recognized by [Pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
+| `dtype` | `str` or Python type | Any type string or Python class recognized by [pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
 | `process_column_fn` | `function` | Function which receives the entire `DataFrame` as its only input, returning a new [`pandas.Series`](https://pandas.pydata.org/docs/reference/api/pandas.Series.html) object to be stored in column `name`. |
 | `input_column_types` | `dict[str, str]` | The input columns and the expected [`dtype` strings](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) that are needed for this Column to successfully process. Setting this as `None` will pass all columns. Specifying which columns are needed improves performance. |
@@ -139,7 +139,7 @@ Subclass of `RenameColumn`, specific to casting UTC localized `datetime` values.
 | Argument | Type | Description |
 | -------- | ---- | ----------- |
 | `name` | `str` | Name of the destination column |
-| `dtype` | `str` or Python type | Any type string or Python class recognized by [Pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
+| `dtype` | `str` or Python type | Any type string or Python class recognized by [pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
 | `input_name` | `str` | Original column name |
@@ -148,7 +148,7 @@ Subclass of `RenameColumn`, converts incoming `list` values to string by joining
 | Argument | Type | Description |
 | -------- | ---- | ----------- |
 | `name` | `str` | Name of the destination column |
-| `dtype` | `str` or Python type | Any type string or Python class recognized by [Pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
+| `dtype` | `str` or Python type | Any type string or Python class recognized by [pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
 | `input_name` | `str` | Original column name |
 | `sep` | `str` | Separator string to use for the join |
@@ -158,7 +158,7 @@ Subclass of `ColumnInfo`, concatenates values from multiple columns into a new s
 | Argument | Type | Description |
 | -------- | ---- | ----------- |
 | `name` | `str` | Name of the destination column |
-| `dtype` | `str` or Python type | Any type string or Python class recognized by [Pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
+| `dtype` | `str` or Python type | Any type string or Python class recognized by [pandas](https://pandas.pydata.org/docs/user_guide/basics.html#dtypes) |
 | `input_columns` | `List[str]` | List of columns to concatenate |
 | `sep` | `str` | Separator string |
```

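The `dtype` cells in the tables above accept any type string or Python class that pandas recognizes; a quick standalone illustration:

```python
import pandas as pd

# A type string and a Python class are both valid pandas dtype specs
s1 = pd.Series(["1", "2", "3"]).astype("int64")  # dtype from a string
s2 = pd.Series([1, 2, 3]).astype(float)          # dtype from a class

print(s1.dtype)  # int64
print(s2.dtype)  # float64
```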
examples/abp_pcap_detection/abp_pcap_preprocessing.py (+2 -6)

```diff
@@ -16,17 +16,16 @@
 from functools import partial
 
 import cupy as cp
-import mrc
 import numpy as np
 
 import cudf
 
-import morpheus._lib.messages as _messages
 from morpheus.cli.register_stage import register_stage
 from morpheus.common import TypeId
 from morpheus.config import Config
 from morpheus.config import PipelineModes
 from morpheus.messages import ControlMessage
+from morpheus.messages import InferenceMemoryFIL
 from morpheus.stages.preprocess.preprocess_base_stage import PreprocessBaseStage
@@ -184,7 +183,7 @@ def round_time_kernel(timestamp, rollup_time, secs):
         seq_ids[:, 2] = fea_len - 1
 
         # Create the inference memory. Keep in mind count here could be > than input count
-        memory = _messages.InferenceMemoryFIL(count=count, input__0=data, seq_ids=seq_ids)
+        memory = InferenceMemoryFIL(count=count, input__0=data, seq_ids=seq_ids)
 
         infer_message = ControlMessage(msg)
         infer_message.payload(meta)
@@ -197,6 +196,3 @@ def _get_preprocess_fn(self) -> typing.Callable[[ControlMessage], ControlMessage
                            fea_len=self._fea_length,
                            fea_cols=self.features,
                            req_cols=self.req_cols)
-
-    def _get_preprocess_node(self, builder: mrc.Builder):
-        raise NotImplementedError("C++ node not implemented for this stage")
```

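The `seq_ids` array seen in the preprocessing hunk above maps inference rows back to their source rows. A standalone NumPy illustration of the shape and the `fea_len - 1` assignment from the diff (the `count`, `fea_len` values and the column-0 fill are assumptions for illustration, not taken from the example file):

```python
import numpy as np

count = 4    # number of rows going to inference
fea_len = 3  # number of feature columns

# One row per inference row, three bookkeeping columns; only the
# seq_ids[:, 2] = fea_len - 1 line appears verbatim in the diff above.
seq_ids = np.zeros((count, 3), dtype=np.int64)
seq_ids[:, 0] = np.arange(count)  # assumed: source row index
seq_ids[:, 2] = fea_len - 1       # last feature index, as in the diff

print(seq_ids)
```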