Commit f34583c

Update documentation (#133)
* Use AICc for BOCPD instead of likelihood.
* Update feature table.
* Set maximum maxlags to 500.
* More stringent failure conditions for tests. Test failures weren't causing the test job to fail.
* Make ETS prediction intervals more flexible.
* Add more robust documentation for spark-on-k8s.
* Add a basic Dockerfile.
* More info on specifying a Spark app.
* Add additional levels of organization to API docs.
* Note which models support exogenous regressors.
* Remove last 'submodules' in doc.
* More systematic exclusion of dashboard from pytest.
* Improve implementation of moving average inverse.
1 parent c0c852e commit f34583c

43 files changed: +795 -450 lines changed

.dockerignore

-1
@@ -1,7 +1,6 @@
 # package
 __pycache__
 *.egg-info
-data
 docs
 tmp
 ts_datasets

.github/workflows/tests.yml

+5 -3
@@ -39,15 +39,17 @@ jobs:
 PYTHON_VERSION: ${{ matrix.python-version }}
 with:
 max_attempts: 3
-timeout_minutes: 40
+timeout_minutes: 60
+retry-on: error
 command: |
+set -euxo pipefail
 # Get a comma-separated list of the directories of all python source files
 files=$(for f in $(find merlion -iname "*.py"); do echo -n ",$f"; done)
-script="import os; print(','.join({os.path.dirname(f) for f in '$files'.split(',') if f and 'dashboard' not in f}))"
+script="import os; print(','.join({os.path.dirname(f) for f in '$files'.split(',') if f}))"
 source_modules=$(python -c "$script")

 # Run tests & obtain code coverage from coverage report.
-coverage run --source=${source_modules} -L -m pytest -v -s
+coverage run --source=${source_modules} --omit=merlion/dashboard/* -L -m pytest -v -s
 coverage report && coverage xml -o .github/badges/coverage.xml
 COVERAGE=`coverage report | grep "TOTAL" | grep -Eo "[0-9\.]+%"`
 echo "coverage=${COVERAGE}" >> $GITHUB_OUTPUT
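
For readers who find the inline shell hard to follow, the two steps in the workflow command above (collecting the directories of all Python source files, then pulling the TOTAL percentage out of the coverage report) can be sketched as standalone Python. This is an illustrative sketch only, not code from the commit: the function names `source_dirs` and `total_coverage` are hypothetical, the optional `exclude` argument mirrors the removed `'dashboard' not in f` filter, and the sorting is added here purely to make the output deterministic (the workflow's set comprehension is unordered).

```python
import os
import re


def source_dirs(paths, exclude=None):
    """Join the unique parent directories of `paths` with commas.

    Empty entries are skipped, as in the workflow's one-liner. `exclude` is a
    hypothetical optional substring filter (e.g. "dashboard").
    """
    dirs = {
        os.path.dirname(p)
        for p in paths
        if p and (exclude is None or exclude not in p)
    }
    # Sorting is an addition for reproducible output; the original uses a bare set.
    return ",".join(sorted(dirs))


def total_coverage(report):
    """Return the first percentage on the TOTAL line of a coverage report, or None."""
    for line in report.splitlines():
        if "TOTAL" in line:
            match = re.search(r"[0-9.]+%", line)
            if match:
                return match.group(0)
    return None
```

Calling `source_dirs(["merlion/a.py", "merlion/utils/b.py"])` yields a string suitable for a `--source=` argument, and `total_coverage` plays the role of the two chained `grep` calls.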

README.md

+27 -15
@@ -39,34 +39,46 @@ across multiple time series datasets.

 Merlion's key features are
 - Standardized and easily extensible data loading & benchmarking for a wide range of forecasting and anomaly
-detection datasets.
-- A library of diverse models for both anomaly detection and forecasting, unified under a shared interface.
-Models include classic statistical methods, tree ensembles, and deep
+detection datasets. This includes transparent support for custom datasets.
+- A library of diverse models for anomaly detection, forecasting, and change point detection, all
+unified under a shared interface. Models include classic statistical methods, tree ensembles, and deep
 learning approaches. Advanced users may fully configure each model as desired.
 - Abstract `DefaultDetector` and `DefaultForecaster` models that are efficient, robustly achieve good performance,
 and provide a starting point for new users.
 - AutoML for automated hyperaparameter tuning and model selection.
+- Unified API for using a wide range of models to forecast with
+[exogenous regressors](https://opensource.salesforce.com/Merlion/tutorials/forecast/3_ForecastExogenous.html).
 - Practical, industry-inspired post-processing rules for anomaly detectors that make anomaly scores more interpretable,
 while also reducing the number of false positives.
 - Easy-to-use ensembles that combine the outputs of multiple models to achieve more robust performance.
 - Flexible evaluation pipelines that simulate the live deployment & re-training of a model in production,
 and evaluate performance on both forecasting and anomaly detection.
-- Native support for visualizing model predictions.
+- Native support for visualizing model predictions, including with a clickable visual UI.
+- Distributed computation [backend](https://opensource.salesforce.com/Merlion/merlion.spark.html) using PySpark,
+which can be used to serve time series applications at industrial scale.

 The table below provides a visual overview of how Merlion's key features compare to other libraries for time series
 anomaly detection and/or forecasting.

-| | Merlion | Prophet | Alibi Detect | Kats | statsmodels | GluonTS | RRCF | STUMPY | Greykite |pmdarima
-:--- | :---: | :---:| :---: | :---: | :---: | :---: | :---: | :---: | :----: | :---:
-| Univariate Forecasting | ✅ | ✅ | | ✅ | ✅ | ✅ | | |✅ | ✅
-| Multivariate Forecasting || | |||| | | | |
-| Univariate Anomaly Detection ||||| | |||||
-| Multivariate Anomaly Detection || ||| | ||| | | |
-| Change Point Detection ||||| | | | || |
-| AutoML | ✅ | | | ✅ | | | | | ✅ | | ✅
-| Ensembles || | | | | || | | |
-| Benchmarking || | | | || | | | |
-| Visualization ||| || | | | ||| |
+| | Merlion | Prophet | Alibi Detect | Kats | statsmodels | nixtla | GluonTS | RRCF | STUMPY | Greykite |pmdarima
+:--- | :---: | :---:| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :----: | :---:
+| Univariate Forecasting | ✅ | ✅| | ✅ | ✅ | ✅ | ✅ | | |✅| ✅
+| Multivariate Forecasting || | ||||| | | | |
+| Univariate Anomaly Detection ||||| | | |||||
+| Multivariate Anomaly Detection || ||| | | ||| | | |
+| AutoML | ✅ | | | ✅ | | | | | | | ✅ | | ✅
+| Ensembles || | || | | | || | | |
+| Benchmarking || | | ||| | | | |
+| Visualization ||| || | | | | ||| |
+
+The following features are new in Merlion 2.0:
+
+| | Merlion | Prophet | Alibi Detect | Kats | statsmodels | nixtla | GluonTS | RRCF | STUMPY | Greykite |pmdarima
+:--- | :---: | :---:| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :----: | :---:
+| Exogenous Regressors | ✅ | ✅ | | | ✅ | | | | | ✅ | ✅
+| Change Point Detection ||||| | | | | ||
+| Clickable Visual UI || | | | | | | | | |
+| Distributed Backend || | | | || | | | |

 ## Installation


docker/Dockerfile

+14
@@ -0,0 +1,14 @@
+FROM python:3.9-slim
+WORKDIR /opt/Merlion
+# Install Java
+RUN rm -rf /var/lib/apt/lists/* && \
+apt-get clean && \
+apt-get update && \
+apt-get upgrade && \
+apt-get install -y --no-install-recommends openjdk-11-jre-headless && \
+rm -rf /var/lib/apt/lists/*
+# Install Merlion from source
+COPY *.md ./
+COPY setup.py ./
+COPY merlion merlion
+RUN pip install "./"

docker/spark-on-k8s/Dockerfile

+2 -1
@@ -12,6 +12,7 @@ COPY merlion merlion
 RUN pip install pyarrow "./"

 # Copy Merlion pyspark apps
-COPY apps /opt/spark/apps
+COPY spark_apps /opt/spark/apps
+COPY data/walmart/walmart_mini.csv .
 RUN chmod g+w /opt/spark/apps
 USER ${spark_uid}

docs/source/index.rst

+3 -3
@@ -11,7 +11,7 @@ Merlion is a Python library for time series intelligence. It features a unified
 point detection on both univariate and multivariate time series, along with standard
 :doc:`pre-processing <merlion.transform>` and :doc:`post-processing <merlion.post_process>` layers.
 It has several modules to improve ease-of-use,
-including :ref:`visualization <merlion.plot>`,
+including :doc:`visualization <merlion.plot>`,
 anomaly score :ref:`calibration <merlion.post_process.calibrate>` to improve interpetability,
 :doc:`AutoML <merlion.models.automl>` for hyperparameter tuning and model selection,
 and :doc:`model ensembling <merlion.models.ensemble>`.
@@ -44,14 +44,14 @@ Note the following external dependencies:
 If using ``conda``, please ``conda install -c conda-forge lightgbm``
 before installing our package. This will ensure that OpenMP is configured to work with the ``lightgbm`` package
 (one of our dependencies) in your ``conda`` environment.
-If using Mac, please install `Homebrew <https://brew.sh/>`_ and call ``brew install libomp`` so that the
+If using Mac, please install `Homebrew <https://brew.sh/>`__ and call ``brew install libomp`` so that the
 OpenMP libary is available for the model.
 This is relevant for the
 :py:class:`LGBMForecaster <merlion.models.forecast.trees.LGBMForecaster>`,
 which is also used as a part of the :py:class:`DefaultForecaster <merlion.models.defaults.DefaultForecaster>`.

 2. Some of our anomaly detection models depend on having the Java Development Kit (JDK) installed. For Ubuntu, call
-``sudo apt-get install openjdk-11-jdk``. For Mac OS, install `Homebrew <https://brew.sh/>`_ and call
+``sudo apt-get install openjdk-11-jdk``. For Mac OS, install `Homebrew <https://brew.sh/>`__ and call
 ``brew tap adoptopenjdk/openjdk && brew install --cask adoptopenjdk11``. Also ensure that ``java`` can be found
 on your ``PATH``, and that the ``JAVA_HOME`` environment variable is set.
 This is relevant for the :py:class:`RandomCutForest <merlion.models.anomaly.random_cut_forest.RandomCutForest>`

docs/source/merlion.evaluate.rst

+6 -9
@@ -13,27 +13,24 @@ of time series models on different tasks.
 anomaly
 forecast

-Submodules
------------
-
-merlion.evaluate.base module
------------------------------
+merlion.evaluate.base
+---------------------

 .. automodule:: merlion.evaluate.base
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.evaluate.anomaly module
--------------------------------
+merlion.evaluate.anomaly
+------------------------

 .. automodule:: merlion.evaluate.anomaly
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.evaluate.forecast module
----------------------------------
+merlion.evaluate.forecast
+-------------------------

 .. automodule:: merlion.evaluate.forecast
 :members:

docs/source/merlion.models.anomaly.change_point.rst

+4 -7
@@ -1,5 +1,5 @@
-merlion.models.anomaly.change\_point package
-============================================
+anomaly.change\_point
+=====================

 .. automodule:: merlion.models.anomaly.change_point
 :members:
@@ -9,11 +9,8 @@ merlion.models.anomaly.change\_point package
 .. autosummary::
 bocpd

-Submodules
------------
-
-merlion.models.anomaly.change\_point.bocpd module
---------------------------------------------------
+anomaly.change\_point.bocpd
+---------------------------

 .. automodule:: merlion.models.anomaly.change_point.bocpd
 :members:

docs/source/merlion.models.anomaly.forecast_based.rst

+16 -19
@@ -1,5 +1,5 @@
-merlion.models.anomaly.forecast\_based package
-==============================================
+anomaly.forecast\_based
+=======================

 .. automodule:: merlion.models.anomaly.forecast_based
 :members:
@@ -15,59 +15,56 @@ merlion.models.anomaly.forecast\_based package
 lstm
 mses

-Submodules
------------
-
-merlion.models.anomaly.forecast\_based.base module
----------------------------------------------------
+anomaly.forecast\_based.base
+----------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.base
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.models.anomaly.forecast\_based.arima module
-----------------------------------------------------
+anomaly.forecast\_based.arima
+-----------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.arima
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.models.anomaly.forecast\_based.sarima module
------------------------------------------------------
+anomaly.forecast\_based.sarima
+------------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.sarima
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.models.anomaly.forecast\_based.ets module
---------------------------------------------------
+anomaly.forecast\_based.ets
+---------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.ets
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.models.anomaly.forecast\_based.prophet module
-------------------------------------------------------
+anomaly.forecast\_based.prophet
+-------------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.prophet
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.models.anomaly.forecast\_based.lstm module
----------------------------------------------------
+anomaly.forecast\_based.lstm
+----------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.lstm
 :members:
 :undoc-members:
 :show-inheritance:

-merlion.models.anomaly.forecast\_based.mses module
----------------------------------------------------
+anomaly.forecast\_based.mses
+----------------------------

 .. automodule:: merlion.models.anomaly.forecast_based.mses
 :members:
