Conversation
…dule under 'external/nanochat'.
- Resolved a RuntimeError caused by non-contiguous tensors during view operations (in nanochat's gpt.py): "view size is not compatible with input tensor's size and stride...". Replaced `.view()` with `.reshape()`; see the sketch after this list.
- Resolved an issue where the configuration requested 'train_loss' in the results, but the server's get_logged_items() did not include it.
- To avoid a vocabulary size mismatch between the model and the tokenizer during CORE evaluation.
- Updated the log message from "global accuracy" to "Average Centered CORE benchmark metric".
- Used ruff to format the code.
…ORE metadata so ty check is clean again.
- Added instructions for initializing submodules and resolving a maturin build failure.
- Included configurations for both pre-trained and custom modes.
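As a minimal sketch of the `.view()` vs `.reshape()` fix referenced above (illustrative code, not the actual nanochat gpt.py source):

```python
import torch

# A transpose yields a non-contiguous tensor: same storage, different strides.
x = torch.randn(2, 3, 4).transpose(1, 2)
assert not x.is_contiguous()

# x.view(2, 12) would raise:
#   RuntimeError: view size is not compatible with input tensor's size and stride...
# .reshape() succeeds because it falls back to a copy when a true view is impossible.
y = x.reshape(2, 12)
```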
- Used the Open-Meteo Archive API for hourly inputs.
- Interpolated to 5-minute resolution with a linear method; see the sketch below.
- Added TOML config files (tunable for better results).
- Formatted the code with ruff.
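A hedged sketch of that hourly-to-5-minute step (the column name and values are illustrative assumptions, not the PR's actual code):

```python
import pandas as pd

# Hourly inputs, as fetched from the Open-Meteo Archive API (illustrative values).
hourly = pd.DataFrame(
    {"temperature_2m": [10.0, 12.0, 11.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="h"),
)

# Upsample to a 5-minute grid and fill the gaps by linear interpolation.
five_min = hourly.resample("5min").interpolate(method="linear")
print(five_min.head())
```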
```python
    )
    logging.info(
        "Location: lat=%.2f, lon=%.2f, historical_days=%d",
        latitude,
```
Check failure (Code scanning / CodeQL): Clear-text logging of sensitive information (High)

Copilot Autofix (AI, 4 days ago):
In general, the fix is to avoid logging sensitive data such as raw geographic coordinates. Instead, log a non-sensitive label or a redacted/generalized form that still provides observability without exposing private information.

Concretely for plato/datasources/openmeteo.py, we should change the `logging.info` call that currently logs `"Location: lat=%.2f, lon=%.2f, historical_days=%d"` with `latitude`, `longitude`, and `historical_days`. The simplest safe approach that preserves intent is to stop logging the numeric coordinates and keep only non-sensitive context, such as `location_name` (already logged in the previous `logging.info`) and `historical_days`. For example, we can log `"Location configuration: historical_days=%d"` or `"Location configuration: name=%s, historical_days=%d"`, using `location_name` instead of coordinates. This keeps functionality identical; only the log message changes.

No new imports or helper methods are required; we just modify the existing log statement in that file/region.
```diff
@@ -142,9 +142,8 @@
         task_config["description"],
     )
     logging.info(
-        "Location: lat=%.2f, lon=%.2f, historical_days=%d",
-        latitude,
-        longitude,
+        "Location configuration: name=%s, historical_days=%d",
+        location_name,
         historical_days,
     )
     logging.info("Variables: %s", ", ".join(variables))
```
```python
    logging.info(
        "Location: lat=%.2f, lon=%.2f, historical_days=%d",
        latitude,
        longitude,
```
Check failure (Code scanning / CodeQL): Clear-text logging of sensitive information (High)

Copilot Autofix (AI, 4 days ago):
In general, to fix clear-text logging of sensitive information, either stop logging the sensitive fields entirely, or sanitize them so that only non-sensitive/less sensitive derivatives (e.g., coarse-grained, masked, or redacted values) are logged. The rest of the functionality (in this case, fetching weather data based on actual coordinates) should continue to use the full-precision values; only the log output should change.

Here, the best minimal fix is to avoid logging the raw latitude and longitude in clear text while preserving useful diagnostic context. We can do this by:

- Removing `latitude` and `longitude` from the formatted log line, and instead
- Logging only non-sensitive, high-level information, such as `location_name`, `historical_days`, and the selected `task_type`/description; or
- If coordinates are still desired for debugging, logging a coarse/rounded or redacted version (e.g., to the nearest whole degree, or replacing them with `[REDACTED]`).

To keep changes minimal and avoid assumptions about what is sensitive, I will treat the numeric coordinates as sensitive and remove them from the log message, while still logging `historical_days`. Concretely, in plato/datasources/openmeteo.py:

- Locate the `logging.info` call around lines 144–149 that logs `"Location: lat=%.2f, lon=%.2f, historical_days=%d"` with `latitude`, `longitude`, and `historical_days`.
- Replace it with a log line that does not include `latitude` or `longitude` in clear text, for example `"Location configured: historical_days=%d"` or `"Location configured for %s: historical_days=%d"` using `location_name` and `historical_days`.
- No new imports or helper functions are needed; we only change the string and arguments of the existing log call.

This change ensures that the tainted `longitude` (and `latitude`) no longer flows into the logging sink, addressing all alert variants referencing that call, while leaving how the coordinates are used elsewhere untouched.
```diff
@@ -142,9 +142,8 @@
         task_config["description"],
     )
     logging.info(
-        "Location: lat=%.2f, lon=%.2f, historical_days=%d",
-        latitude,
-        longitude,
+        "Location configured for %s: historical_days=%d",
+        location_name,
         historical_days,
     )
     logging.info("Variables: %s", ", ".join(variables))
```
```python
) -> str:
    """Generate a unique cache key based on request parameters."""
    key_string = f"{latitude}_{longitude}_{start_date}_{end_date}_{'_'.join(sorted(variables))}_{target_freq}"
    return hashlib.md5(key_string.encode()).hexdigest()
```
Check failure (Code scanning / CodeQL): Use of a broken or weak cryptographic hashing algorithm on sensitive data (High)

Copilot Autofix (AI, 4 days ago):
In general, to fix this kind of issue you should avoid MD5 (and other broken hashes like SHA-1) when hashing potentially sensitive data, even if only for identifiers. Instead, use a modern, collision-resistant hash function such as SHA-256 (for general hashing) or a dedicated password hashing scheme for credentials. For non-security uses like cache keys, SHA-256 is a drop-in replacement for MD5.

The single best fix here is to change `_generate_cache_key` in plato/utils/openmeteo_api.py to use `hashlib.sha256` instead of `hashlib.md5`. This preserves the behavior (a deterministic hex string derived from the same input) but uses a strong hash. No other logic needs to change, and all callers will continue to work since the function still returns a hex string. We should also keep the `hashlib` import, since we are still using it.

Concretely:

- In plato/utils/openmeteo_api.py, update line 29:
  - From: `return hashlib.md5(key_string.encode()).hexdigest()`
  - To: `return hashlib.sha256(key_string.encode()).hexdigest()`
- No changes are required in plato/datasources/openmeteo.py or elsewhere.
- No new imports or helper methods are needed; `hashlib.sha256` is part of the standard library and already available via the existing `import hashlib`.
```diff
@@ -26,7 +26,7 @@
 ) -> str:
     """Generate a unique cache key based on request parameters."""
     key_string = f"{latitude}_{longitude}_{start_date}_{end_date}_{'_'.join(sorted(variables))}_{target_freq}"
-    return hashlib.md5(key_string.encode()).hexdigest()
+    return hashlib.sha256(key_string.encode()).hexdigest()


 def _get_cache_path(cache_dir: Path, cache_key: str) -> Path:
```
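For reference, a self-contained sketch of the fixed helper, assuming its parameters match the fields in the f-string (the real signature in plato/utils/openmeteo_api.py may differ):

```python
import hashlib

def _generate_cache_key(
    latitude: float,
    longitude: float,
    start_date: str,
    end_date: str,
    variables: list[str],
    target_freq: str,
) -> str:
    """Generate a unique cache key based on request parameters."""
    key_string = f"{latitude}_{longitude}_{start_date}_{end_date}_{'_'.join(sorted(variables))}_{target_freq}"
    # sha256 is a drop-in replacement for md5 here: same deterministic hex-digest behavior.
    return hashlib.sha256(key_string.encode()).hexdigest()

print(_generate_cache_key(52.52, 13.41, "2024-01-01", "2024-01-31", ["temperature_2m"], "5min"))
```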
This PR introduces support for the PatchTSMixer model in the Plato federated learning framework for time series forecasting tasks.
Description
Specifically, this PR:
How has this been tested?
Quick check evaluation:
This configuration runs only 3 rounds, which is useful for quick functional tests and CORE-style checks. The run completed successfully without runtime errors.
Longer training run:
This configuration uses more rounds. After 400 rounds, the MSE dropped from 7.14 to around 1.30, indicating that the model and data pipeline are working as expected.
Types of changes
Checklist:
- My code has been formatted with the Ruff formatter (`ruff format`) and checked using the Ruff linter (`ruff check --fix`).