Commit 73a5091

Merge pull request #326 from iiasa/project/ssp/static-data
Adjust usage of message-static-data for IEA, SSP inputs
2 parents 175dedc + a8ecbbb commit 73a5091

23 files changed: 419 additions and 309 deletions.

.github/workflows/pytest.yaml

Lines changed: 20 additions & 0 deletions
@@ -110,6 +110,23 @@ jobs:
       with:
         ref: ${{ needs.check.outputs.ref }}

+    - uses: webfactory/[email protected]
+      with:
+        ssh-private-key: |
+          ${{ secrets.MESSAGE_STATIC_DATA_PRIVATE_KEY }}
+
+    - name: Check out message-static-data
+      uses: actions/checkout@v4
+      with:
+        repository: iiasa/message-static-data
+        path: message-static-data
+        ssh-key: ${{ secrets.MESSAGE_STATIC_DATA_PRIVATE_KEY }}
+        lfs: true
+        # Only check out the following directories, in order to limit bandwidth usage:
+        sparse-checkout: |
+          iea/eei
+          ssp
+
     - name: Set up uv, Python
       uses: astral-sh/setup-uv@v5
       with:
@@ -167,6 +184,9 @@ jobs:
         mkdir -p message-local-data/cache
         mix-models config set "message local data" "$(realpath message-local-data)"
         mix-models config show
+        # Symlink message-static-data into local data path
+        mkdir -p ${{ github.workspace }}/message-local-data
+        cp -rsv $(realpath message-static-data)/* message-local-data/

     - name: Run test suite using pytest
       run: |

.gitignore

Lines changed: 1 addition & 5 deletions
@@ -151,9 +151,5 @@ cache/
 # Temporary Excel files
 *~$*

-# SSP related files (not ready for public)
-SSP-Review-Phase-1.xlsx
-message_ix_models/data/ssp/*
-
 # Scratch files
-*scratch*
+*scratch*

MANIFEST.in

Lines changed: 2 additions & 1 deletion
@@ -6,14 +6,15 @@ prune .github
 prune message_ix_models/data/test/advance
 prune message_ix_models/data/test/gea
 prune message_ix_models/data/test/iea
+prune message_ix_models/data/test/report
 prune message_ix_models/data/test/shape
 prune message_ix_models/data/test/snapshot-*
 prune message_ix_models/data/test/ssp
+prune message_ix_models/data/test/transport

 # Larger package data
 # - Not distributed on PyPI.
 # - Should be fetched with Pooch from GitHub.
-exclude message_ix_models/data/ssp/*.gz
 prune message_ix_models/data/water/*
 prune message_ix_models/data/material/*
 exclude message_ix_models/data/water/*.tar.xz

doc/api/util.rst

Lines changed: 0 additions & 1 deletion
@@ -82,7 +82,6 @@ Commonly used:

 .. automodule:: message_ix_models.util.config
    :members:
-   :exclude-members: Config

 :mod:`.util.context`
 ====================

doc/data.rst

Lines changed: 271 additions & 175 deletions
Large diffs are not rendered by default.

doc/howto/index.rst

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ Each HOWTO is contained on its own page:

    Run a baseline model <quickstart>
    Run mix-models on UnICC <unicc>
+   path
    migrate
    Release message-ix-models <release>


doc/howto/path.rst

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+Work with paths to files and data
+*********************************
+
+This HOWTO contains some code examples that may help in following the requirements described in :doc:`data </data>`.
+
+.. contents::
+   :local:
+
+.. _howto-static-to-local:
+
+Connect static data to local data
+=================================
+
+Use the ``-r`` and ``-s`` options of the :program:`cp` command:
+
+.. code-block:: shell
+
+   git clone git@github.com:iiasa/message-static-data.git
+   cd /path/to/message-local-data
+   cp -rsv /path/to/message-static-data/* ./
+
+This recursively creates subdirectories in the :ref:`local data <local-data>` directory that mirror those existing in :ref:`message-static-data <static-data>`,
+and creates a symlink to every file in every directory.
+Code that looks within the local data directory will then be able to locate these files.
+
+If needed, delete the directories in message-local-data and repeat the :program:`cp` call to recreate the symlinks.
+
+Identify the cache path used on the current system
+==================================================
+
+.. code-block:: python
+
+   from message_ix_models.util.config import Config
+
+   cfg = Config()
+
+   print(cfg.cache_path)
+
+Identify the cache path without a Context
+=========================================
+
+Internal code that cannot access a :class:`~.util.config.Config` or :class:`.Context` instance
+**should** instead use :func:`platformdirs.user_cache_path` directly:
+
+.. code-block:: python
+
+   from platformdirs import user_cache_path
+
+   # Always use "message-ix-models" as the `appname` parameter
+   ucp = user_cache_path("message-ix-models")
+
+   # Construct the sub-directory for the current module
+   dir_ = ucp.joinpath("my-project", "subdir")
+   dir_.mkdir(parents=True, exist_ok=True)
+
+   # Construct a file path within this directory
+   p = dir_.joinpath("data-file-name.csv")
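
Editorial sketch, not part of the commit: the HOWTO above relies on GNU :program:`cp` to mirror directories and symlink files. Where that command is unavailable, roughly the same result can be obtained with the Python standard library. The function name `mirror_with_symlinks` and its arguments are hypothetical; unlike a shell glob, this version also picks up hidden files at the top level.

```python
from pathlib import Path


def mirror_with_symlinks(static_dir: Path, local_dir: Path) -> list[Path]:
    """Mirror the tree under `static_dir` into `local_dir` (hypothetical helper).

    Directories are created as real directories; each file becomes a symlink
    that points at the absolute path of the original file.
    """
    static_dir = static_dir.resolve()
    created = []
    for source in sorted(static_dir.rglob("*")):
        link = local_dir / source.relative_to(static_dir)
        if source.is_dir():
            link.mkdir(parents=True, exist_ok=True)
        elif not (link.is_symlink() or link.exists()):
            # Only create links that are missing, so repeated calls are harmless
            link.parent.mkdir(parents=True, exist_ok=True)
            link.symlink_to(source)
            created.append(link)
    return created
```

Calling it a second time with the same arguments returns an empty list, because existing links are left in place.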

doc/whatsnew.rst

Lines changed: 2 additions & 1 deletion
@@ -86,7 +86,7 @@ Update :doc:`/transport/index` (:pull:`259`, :pull:`289`, :pull:`300`).
 Documentation
 -------------

-- New :doc:`/howto/index` documentation sub-tree (:pull:`291`).
+- New :doc:`/howto/index` documentation sub-tree (:pull:`291`, :pull:`326`).
 - New guide on HOWTO :doc:`/howto/unicc` (:pull:`279`) and supporting command :program:`mix-models sbatch` in :mod:`.util.slurm` (:pull:`291`).
 - New summary pages for projects
   :doc:`project/alps`,
@@ -105,6 +105,7 @@ Documentation
   :doc:`project/sparccle`, and
   :doc:`project/uptake` (:pull:`282`, :pull:`312`).
 - Expand the :ref:`costs-usage` section of the :mod:`.tools.costs` documentation to describe the requirement for SSP input data (:issue:`313`, :pull:`322`).
+- Reorganize and improve the :doc:`data` documentation page (:pull:`326`).

 v2025.1.10
 ==========

message_ix_models/data/ssp/1706548837040-ssp_basic_drivers_release_3.0_full.csv.gz

Lines changed: 0 additions & 3 deletions
This file was deleted.

message_ix_models/data/ssp/1710759470883-ssp_basic_drivers_release_3.0.1_full.csv.gz

Lines changed: 0 additions & 3 deletions
This file was deleted.

message_ix_models/project/ssp/data.py

Lines changed: 26 additions & 39 deletions
@@ -1,10 +1,8 @@
 import logging

-from platformdirs import user_cache_path
-
 from message_ix_models.tools.exo_data import ExoDataSource, register_source
 from message_ix_models.tools.iamc import iamc_like_data_for_query
-from message_ix_models.util import package_data_path, private_data_path
+from message_ix_models.util import path_fallback

 __all__ = [
     "SSPOriginal",
@@ -13,6 +11,18 @@

 log = logging.getLogger(__name__)

+#: :py:`where` argument to :func:`path_fallback`, used by both :class:`.SSPOriginal` and
+#: :class:`.SSPUpdate`. In order:
+#:
+#: 1. Currently data is stored in message-static-data, cloned and linked from within the
+#:    user's 'local' data directory.
+#: 2. Previously some files were stored directly within message_ix_models (available in
+#:    an editable install from a clone of the git repository, 'package') or in
+#:    :mod:`message_data` ('private'). These settings are only provided for backward
+#:    compatibility.
+#: 3. If the above are not available, use the fuzzed/random test data ('test').
+WHERE = "local package private test"
+

 @register_source
 class SSPOriginal(ExoDataSource):
@@ -93,25 +103,17 @@ def __init__(self, source, source_kw):

         self.raise_on_extra_kw(source_kw)

+        # Identify input data path
+        self.path = path_fallback("ssp", self.filename, where=WHERE)
+        if "test" in self.path.parts:
+            log.warning(f"Read random data from {self.path}")
+
         # Assemble a query string
         extra = "d" if ssp_id == "4" and model == "IIASA-WiC POP" else ""
         self.query = (
             f"SCENARIO == 'SSP{ssp_id}{extra}_v9_{date}' and VARIABLE == '{measure}'"
             + (f" and MODEL == '{model}'" if model else "")
         )
-        # log.debug(query)
-
-        # Iterate over possible locations for the data file
-        dirs = [private_data_path("ssp"), package_data_path("test", "ssp")]
-        for path in [d.joinpath(self.filename) for d in dirs]:
-            if not path.exists():
-                log.info(f"Not found: {path}")
-                continue
-            if "test" in path.parts:
-                log.warning(f"Reading random data from {path}")
-            break
-
-        self.path = path

     def __call__(self):
         # Use prepared path, query, and replacements
@@ -132,7 +134,7 @@ class SSPUpdate(ExoDataSource):

     - `source`: Any value from :data:`.SSP_2024` or equivalent string, for instance
       "ICONICS:SSP(2024).2".
-    - `release`: One of "3.0.1", "3.0", or "preview".
+    - `release`: One of "3.1", "3.0.1", "3.0", or "preview".

     Example
     -------
@@ -151,6 +153,7 @@ class SSPUpdate(ExoDataSource):
     filename = {
         "3.0": "1706548837040-ssp_basic_drivers_release_3.0_full.csv.gz",
         "3.0.1": "1710759470883-ssp_basic_drivers_release_3.0.1_full.csv.gz",
+        "3.1": "1721734326790-ssp_basic_drivers_release_3.1_full.csv.gz",
         "preview": "SSP-Review-Phase-1.csv.gz",
     }

@@ -183,13 +186,7 @@ def __init__(self, source, source_kw):
         models = []
         scenarios = []

-        if release in ("3.0.1", "3.0"):
-            # Directories in which to locate `self.filename`:
-            # - User's local cache (retrieved with "mix-models fetch" or equivalent).
-            # - Stored directly within message_ix_models (editable install from a clone
-            #   of the git repository).
-            dirs = [user_cache_path("message-ix-models"), package_data_path("ssp")]
-
+        if release in ("3.1", "3.0.1", "3.0"):
             scenarios.append(f"SSP{ssp_id}")

             if measure == "GDP|PPP":
@@ -202,9 +199,6 @@ def __init__(self, source, source_kw):
                     Scenario={"Historical Reference": scenarios[0]},
                 )
         elif release == "preview":
-            # Look first in message_data, then in message_ix_models test data
-            dirs = [private_data_path("ssp"), package_data_path("test", "ssp")]
-
             models.extend([model] if model is not None else [])
             scenarios.append(f"SSP{ssp_id} - Review Phase 1")
         else:
@@ -214,22 +208,15 @@ def __init__(self, source, source_kw):
             )
             raise ValueError(release)

+        # Identify input data path
+        self.path = path_fallback("ssp", self.filename[release], where=WHERE)
+        if "test" in self.path.parts:
+            log.warning(f"Read random data from {self.path}")
+
         # Assemble and store a query string
         self.query = f"Scenario in {scenarios!r} and Variable == '{measure}'" + (
             f"and Model in {models!r}" if models else ""
         )
-        # log.info(f"{self.query = }")
-
-        # Iterate over possible locations for the data file
-        for path in [d.joinpath(self.filename[release]) for d in dirs]:
-            if not path.exists():
-                log.info(f"Not found: {path}")
-                continue
-            if "test" in path.parts:
-                log.warning(f"Reading random data from {path}")
-            break
-
-        self.path = path

     def __call__(self):
         # Use prepared path, query, and replacements
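
Editorial sketch, not part of the commit: the `WHERE` string in this file lists locations in order of preference. The general idea of such an ordered lookup can be illustrated as follows. The function `first_existing` and the `roots` mapping are hypothetical stand-ins chosen for this illustration; they are not the implementation or signature of `path_fallback` in `message_ix_models.util`.

```python
from pathlib import Path


def first_existing(*parts: str, where: str, roots: dict[str, Path]) -> Path:
    """Return the first existing path among the locations named in `where`.

    `where` is a space-separated list of labels, tried left to right;
    `roots` maps each label to a base directory (an assumption of this sketch).
    """
    tried = []
    for label in where.split():
        candidate = roots[label].joinpath(*parts)
        if candidate.exists():
            return candidate
        tried.append(candidate)
    raise ValueError(f"No file found among: {tried}")
```

With a lookup like this, a caller can tell afterwards which location supplied the file by inspecting the returned path, in the same way the code in this diff checks whether `"test"` appears in `self.path.parts`.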

message_ix_models/testing/__init__.py

Lines changed: 13 additions & 2 deletions
@@ -142,10 +142,10 @@ def session_context(pytestconfig, tmp_env):
     )

     # Store current .util.config.Config.local_data setting from the user's configuration
-    pytestconfig.user_local_data = ctx.core.local_data
+    uld = pytestconfig.user_local_data = ctx.core.local_data

     # Other local data in the temporary directory for this session only
-    ctx.core.local_data = session_tmp_dir
+    sld = ctx.core.local_data = session_tmp_dir

     # Also set the "message local data" key in the ixmp config
     ixmp_config.set("message local data", session_tmp_dir)
@@ -158,6 +158,17 @@ def session_context(pytestconfig, tmp_env):
     # Create some subdirectories
     util.MESSAGE_DATA_PATH.joinpath("data", "tests").mkdir(parents=True)

+    # Symlink some paths from the user's local data into parallel subpaths of the test
+    # local data directory
+    for parts in (
+        ("iea",),
+        ("ssp",),
+    ):
+        target = uld.joinpath(*parts)
+        path = sld.joinpath(*parts)
+        log.info(f"Symlink {path} -> {target}")
+        path.symlink_to(target)
+
     # Add a platform connected to an in-memory database
     platform_name = "message-ix-models"
     ixmp_config.add_platform(
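
Editorial sketch, not part of the commit: the fixture change above links whole subdirectories of the user's local data into the temporary session directory. A stand-alone variant of that pattern is shown below. The helper `link_subdirs` is hypothetical, and it differs from the fixture in one respect: it skips targets that do not exist instead of creating dangling links.

```python
from pathlib import Path


def link_subdirs(user_dir: Path, session_dir: Path, names: list[str]) -> list[Path]:
    """Symlink each existing `user_dir/name` directory as `session_dir/name`."""
    links = []
    for name in names:
        target = user_dir / name
        link = session_dir / name
        # Assumption of this sketch: silently skip missing or already-linked entries
        if target.is_dir() and not (link.is_symlink() or link.exists()):
            link.symlink_to(target, target_is_directory=True)
            links.append(link)
    return links
```

Files below a linked directory are then visible through the session directory without being copied.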

message_ix_models/tests/project/ssp/test_transport.py

Lines changed: 11 additions & 17 deletions
@@ -11,7 +11,6 @@
     process_file,
 )
 from message_ix_models.testing import MARK
-from message_ix_models.tests.tools.iea.test_web import user_local_data  # noqa: F401
 from message_ix_models.tools.iea import web
 from message_ix_models.util import package_data_path

@@ -105,28 +104,25 @@ def _to_long(df):
     )

     # Identify the directory from which IEA EWEB data is read
-    iea_data_dir = web.dir_fallback(web.FILES[("IEA", "2024")][0])
-    # True if the full data set is present; False if the fuzzed test data are being used
-    full_iea_eweb_data = not (
-        iea_data_dir.parts[-4:] == ("message_ix_models", "data", "test", "iea")
-    )
+    iea_eweb_dir = web.dir_fallback(web.FILES[("IEA", "2024")][0])
+    # True if the fuzzed test data are being used
+    iea_eweb_test_data = iea_eweb_dir.match("message_ix_models/data/test/iea/web")

     # Number of modified values
     N_exp = {
-        (METHOD.A, True): 10280,
         (METHOD.A, False): 10280,
-        (METHOD.B, True): 4660,
-        (METHOD.B, False): 3060,
-        (METHOD.C, True): 3220,
+        (METHOD.A, True): 10280,
+        (METHOD.B, False): 4660,
+        (METHOD.B, True): 3060,
         (METHOD.C, False): 3220,
-    }[(method, full_iea_eweb_data)]
+        (METHOD.C, True): 3220,
+    }[(method, iea_eweb_test_data)]

     if N_exp != len(df):
         # df.to_csv("debug-diff.csv")  # DEBUG Dump to file
         # print(df.to_string(max_rows=50))  # DEBUG Show in test output
-        assert N_exp == len(df), (
-            f"Unexpected number of modified values: {len(df)} != {N_exp}"
-        )
+        msg = f"Unexpected number of modified values: {N_exp} != {len(df)}"
+        assert N_exp == len(df), msg

     # All of the expected 'variable' codes have been modified
     assert expected_variables(OUT, method) == set(df["Variable"].unique())
@@ -135,7 +131,7 @@ def _to_long(df):
     if len(cond):
         msg = "Negative emissions totals after processing"
         print(f"\n{msg}:", cond.to_string(), sep="\n")
-        assert not full_iea_eweb_data, msg
+        assert iea_eweb_test_data, msg  # Negative values → fail if NOT using test data


 def expected_variables(flag: int, method: METHOD) -> set[str]:
@@ -220,7 +216,6 @@ def test_get_scenario_code(expected_id, model_name, scenario_name) -> None:


 @get_computer.minimum_version
-# @pytest.mark.usefixtures("user_local_data")
 @pytest.mark.parametrize("method", METHOD_PARAM)
 def test_process_df(test_context, input_csv_path, method) -> None:
     df_in = pd.read_csv(input_csv_path)
@@ -233,7 +228,6 @@ def test_process_df(test_context, input_csv_path, method) -> None:


 @get_computer.minimum_version
-# @pytest.mark.usefixtures("user_local_data")
 @pytest.mark.parametrize("method", METHOD_PARAM)
 def test_process_file(tmp_path, test_context, input_csv_path, method) -> None:
     """Code can be called from Python."""
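
Editorial sketch, not part of the commit: the test now detects the fuzzed data directory with `Path.match`, which compares a relative pattern against the trailing components of a path. The directory names below are invented for illustration only.

```python
from pathlib import PurePosixPath

# Relative pattern: matched against the right-hand end of each path
pattern = "message_ix_models/data/test/iea/web"

# Hypothetical locations of the IEA EWEB data
test_dir = PurePosixPath("/home/user/src/message_ix_models/data/test/iea/web")
local_dir = PurePosixPath("/home/user/message-local-data/iea/web")

is_test = test_dir.match(pattern)  # trailing components agree with the pattern
is_local = local_dir.match(pattern)  # only the last two components agree

print(is_test, is_local)
```

Because the match is anchored at the end of the path, the check does not depend on where the repository is cloned.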
