
[WIP] Rewrite workflow implementation. Add Prefect framework. Prepare for paper. #89


Closed
wants to merge 345 commits

Conversation

@felix5572 (Contributor) commented Mar 15, 2025

Rewrite workflow implementation. Add Prefect framework. Prepare for paper.

Summary by CodeRabbit

  • New Features

    • Introduced a comprehensive simulation framework supporting thermodynamic integration and free energy calculations, including molecular dynamics workflows for liquids, water, and ice.
    • Added command-line interfaces and enhanced configuration options for generating, executing, refining, and analyzing simulation tasks.
    • Developed advanced visualization tools for Gibbs free energy surfaces and comparative analysis of simulation results.
    • Established robust infrastructure for dependency injection, job execution management, file handling, and report generation.
    • Integrated Prefect workflow orchestration with caching, task decorators, and runtime context management.
    • Provided modular workflow components for simulation setup, execution, and result extraction with strong typing and error handling.
  • Tests & Examples

    • Delivered sample workflows, example configurations, and testing routines to demonstrate and validate the new features.
    • Included example molecular dynamics data files and deployment scripts for practical usage.

@coderabbitai bot left a comment


Actionable comments posted: 10

🔭 Outside diff range comments (2)
dpti/workflows/examples/Sn_SCAN_high_pressure/wget_graph.pb.sh (1)

1-2: 🛠️ Refactor suggestion

Add a shebang line and improve error handling.

The script is missing a proper shebang line to specify the shell interpreter. Also, consider adding basic error handling and checking if the file already exists before downloading.

+#!/bin/bash
+
+# Exit on error
+set -e
+
+# Check if file already exists
+if [ -f "graph.pb" ]; then
+  echo "graph.pb already exists. Skipping download."
+else
+  echo "Downloading graph.pb..."
   wget https://huggingface.co/Felix5572/Sn-SCAN-Compressed/resolve/main/graph.pb
+  echo "Download completed."
+fi
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

tests/run_test.sh (1)

1-7: 🛠️ Refactor suggestion

Add shebang line and improve script structure.

The script is missing a proper shebang line and doesn't handle errors if any of the commands fail. Consider adding error handling and improving the structure.

+#!/bin/bash
+
+# Exit on error
+set -e
+
+echo "Running all tests..."
 pytest 
 # python -m unittest
+
+echo "Running specific file handler tests..."
 pytest workflows/service/test_file_handler.py
+
+echo "Generating coverage reports..."
 coverage report
 coverage report --show-missing
 coverage html
+
+echo "Testing completed successfully!"
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

♻️ Duplicate comments (1)
dpti/workflows/flows/base_flow.py (1)

20-22: ⚠️ Potential issue

Fix the path join arguments.

The adjacent string literals '../' "../" are implicitly concatenated by Python, which obscures the intent and can lead to incorrect path resolution. Join the path from separate directory components instead.

- sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "../" "../", "../")))
+ sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", "..")))
🧹 Nitpick comments (56)
dpti/workflows/examples/Water_Ice_Phase_Diagram/wget_graph.pb.sh (2)

1-1: Add a Shebang Directive

It’s recommended to include a shebang (e.g. #!/bin/bash) at the very top of the script to clearly specify the intended shell. This ensures that tools like ShellCheck can properly interpret the script.

Proposed diff:

+#!/bin/bash
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


1-1: Verify URL Consistency

The URL references “Water-Ice-SCAN-Phase-Dirgram” while the folder name uses “Diagram”. Please confirm that the URL is not inadvertently misspelled.

🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

dpti/workflows/examples/Water_Ice_Phase_Diagram/ti_ice04_150K_10000bar_settings.yaml (4)

1-13: Ensure Consistency of Model File Naming

In the thermo_input section, the model is set to "graph.pb". Please ensure that this name exactly matches the file downloaded by wget_graph.pb.sh so that the workflow correctly locates and uses the intended model.


14-28: NPT Simulation Settings Suggestions

The NPT simulation parameters are clearly defined. As a minor note, for fields like if_meam that are set to null, consider using explicit boolean values (e.g. false) if that matches the expected input for your simulation engine. This can help prevent any unexpected type conversion issues.


46-84: HTI Simulation Settings Review

The HTI simulation block details a multi-step switching mechanism and associated lambda sequences along with soft parameters. Ensure that these parameters are in line with the simulation requirements and have been cross-checked against documented expected values.


85-97: TI Simulation Settings Verification

The TI simulation settings use null defaults (e.g. for equi_conf and ens) to indicate fallbacks from the thermo_input block. Confirm that the simulation engine handles these null values appropriately, and if possible, clarify their usage in the documentation.

tests/workflows/flows/test_basemodel.py (3)

1-9: Update imports to use modern type hints.

The imports are well organized, but there are some deprecated typing annotations that should be updated:

- from typing import Any, Dict, Union
+ from typing import Any, Union

Python 3.9+ supports using built-in types like dict directly in type hints instead of importing from typing.

🧰 Tools
🪛 Ruff (0.8.2)

4-4: typing.Dict is deprecated, use dict instead

(UP035)


11-24: Model structure looks good, but type annotation needs updating.

The SimulationConfig class has well-defined fields with appropriate types and descriptions, but there's a type annotation that should be updated:

-    some_yaml: Union[Path, str, Dict[str, Any], BaseModel] = Field(
+    some_yaml: Union[Path, str, dict[str, Any], BaseModel] = Field(

Use the built-in dict type rather than the deprecated Dict from typing module.

🧰 Tools
🪛 Ruff (0.8.2)

17-17: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


64-79: Internationalize workflow function and improve documentation.

The workflow function uses Chinese comments which should be translated for consistency across the codebase:

@flow(log_prints=True)
def free_energy_workflow(thermo_input: SimulationConfig):
-    """一个简单的工作流,接收SimulationConfig并打印其内容."""
+    """A simple workflow that receives SimulationConfig and prints its contents."""
-    print("开始模拟计算,参数如下:")
+    print("Starting simulation calculation with parameters:")
-    print(f"温度: {thermo_input.temperature} K")
-    print(f"压力: {thermo_input.pressure} bar")
-    print(f"模拟步数: {thermo_input.simulation_steps}")
+    print(f"Temperature: {thermo_input.temperature} K")
+    print(f"Pressure: {thermo_input.pressure} bar")
+    print(f"Simulation steps: {thermo_input.simulation_steps}")

-    # 在实际应用中,这里会有更多的计算逻辑
+    # In real applications, there would be more calculation logic here
    result = thermo_input.temperature * thermo_input.pressure / 100
-    print(f"模拟计算结果: {result}")
+    print(f"Simulation result: {result}")
    print(f"thermo_input: {thermo_input=}")
    print(f"temp_str: {thermo_input.temp_str=}")

    return {"thermo_input": thermo_input.model_dump(), "result": result}
🧰 Tools
🪛 Ruff (0.8.2)

66-66: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


67-67: String contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF001)


72-72: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

.cursor/rules/dpti-default-ai-rules.mdc (4)

15-15: Fix formatting issue in goal statement.

There appears to be a formatting issue with 高#效 where it should be 高效.

-你的目标是以用户容易理解的方式帮助他们完成分子动力学自由能计算应用的设计和开发工作,确保计算结果准确可靠、性能高#效、用户使用便捷。
+你的目标是以用户容易理解的方式帮助他们完成分子动力学自由能计算应用的设计和开发工作,确保计算结果准确可靠、性能高效、用户使用便捷。

58-62: Fix numbering inconsistency in methodology section.

The numbering in the methodology section is inconsistent, with the last point using "2" when it should be "4".

 # 方法论
 1.Unix编程风格,开源软件设计风格,恰当根据项目或者模块需要,使用设计模式
 2.坚持dont repeat your self 原则
 3.在软件架构上给出成熟考虑。
-2.回答问题时,也给出关于设计模式上的思考,提示用户相关联的最佳实践(如果有)。
+4.回答问题时,也给出关于设计模式上的思考,提示用户相关联的最佳实践(如果有)。

7-9: Remove or update commented markdown instructions.

There are commented-out markdown instructions that should either be removed or converted to proper comments if they contain valuable information for maintainers.

-<!-- - You can @ files here
-- You can use markdown but dont have to -->
+<!-- 
+Notes for maintainers:
+- You can @ files using mdc: syntax
+- Markdown formatting is optional
+-->

1-10: Consider adding an English translation for international collaborators.

Since this file contains important guidelines and rules for the project, consider adding an English translation if the project has international collaborators. This would make the documentation more accessible to a wider audience.

dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_beta144xy.yaml (2)

17-18: Document the purpose of null values in simulation settings.

The temp and pres parameters are set to null in multiple simulation settings. It would be helpful to add comments explaining that these values are derived from the thermo_input section to improve clarity.

 npt_simulation_settings:
   nsteps: 1000000
   timestep: 0.002
-  temp: null
-  pres: null
+  temp: null  # Derived from thermo_input.target_temp
+  pres: null  # Derived from thermo_input.target_pres
   tau_t: 0.2
   tau_p: 2.0

Similarly for the NVT simulation settings:

 nvt_simulation_settings:
   equi_conf: 'npt_avg.lmp'
   nsteps: 400000
   timestep: 0.002
   ens: 'nvt'  # constant, unchangeable for NVT simulation
-  temp: null
-  pres: null
+  temp: null  # Derived from thermo_input.target_temp
+  pres: null  # Not used in NVT ensemble, but included for API consistency
   tau_t: 0.2
   tau_p: 2.0

Also applies to: 34-35


67-75: Add documentation for specialized parameters in HTI simulation settings.

The spring_k and soft_param section contains specialized parameters that would benefit from additional documentation to explain their physical meaning and appropriate values.

-  spring_k: 0.02
+  spring_k: 0.02  # Spring constant for the Einstein crystal (eV/Ų)
   soft_param:
-    sigma_0_0: 2.7
-    epsilon: 0.030
-    activation: 0.5
-    n: 1.0
-    alpha_lj: 0.5
-    rcut: 6.0
+    sigma_0_0: 2.7    # Characteristic length for LJ potential (Å)
+    epsilon: 0.030    # Energy scale for LJ potential (eV)
+    activation: 0.5   # Activation parameter for soft-core potential
+    n: 1.0            # Exponent in the soft-core potential
+    alpha_lj: 0.5     # Softness parameter for LJ interactions
+    rcut: 6.0         # Cutoff radius for interactions (Å)
tests/workflows/flows/test_flowclass.py (6)

2-3: Update type annotation import

Use the built-in dict type instead of importing Dict from typing.

-from typing import Any, Dict
+from typing import Any
🧰 Tools
🪛 Ruff (0.8.2)

2-2: typing.Dict is deprecated, use dict instead

(UP035)


8-9: Replace fullwidth comma with regular comma in docstring

The docstring contains a fullwidth comma (,) which should be replaced with a regular comma (,).

-    """基础工作流类."""
+    """基础工作流类."""

17-19: Update return type annotation to use built-in dict

Replace Dict with dict in the return type annotation.

-    ) -> Dict[str, Any]:
+    ) -> dict[str, Any]:
🧰 Tools
🪛 Ruff (0.8.2)

19-19: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


26-30: Update return type annotation to use built-in dict

Replace Dict with dict in the return type annotation.

-    ) -> Dict[str, Any]:
+    ) -> dict[str, Any]:
🧰 Tools
🪛 Ruff (0.8.2)

29-29: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


37-39: Update return type annotation to use built-in dict

Replace Dict with dict in the return type annotation.

-    ) -> Dict[str, Any]:
+    ) -> dict[str, Any]:
🧰 Tools
🪛 Ruff (0.8.2)

39-39: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


45-71: Consider renaming this file to indicate its example/demo nature

This file appears to be an example or demonstration of how to use the workflow classes rather than a proper test file. Consider renaming it to something like workflow_example.py or moving it to an examples directory if it's meant as a demonstration.

tests/workflows/service/test_file_handler.py (1)

190-208: Consider uncommenting the test_subdir_context_normal method

This test method is commented out, but appears to be a valid and useful test. Consider uncommenting it to improve test coverage.

-    # def test_subdir_context_normal(self):
-    #     job_dirname = "test_job"
-    #     self.file_handler.use_job_info(job_dirname)
-    #     self.file_handler.create_job_dir()
-
-    #     subdirname = "subdir"
-    #     original_job_dir = self.file_handler.job_dir
-
-    #     with self.file_handler.subdir_context(subdirname):
-    #         self.assertEqual(self.file_handler.job_dir, os.path.join(original_job_dir, subdirname))
-    #         self.assertTrue(os.path.isdir(self.file_handler.job_dir))
-            
-    #         # test writing file in subdir
-    #         self.file_handler.write_pure_file("test_file.txt", "Hello from subdir")
-    #         self.assertTrue(os.path.exists(os.path.join(self.file_handler.job_dir, "test_file.txt")))
-
-    #     #verify the job_dir is restored to its original value after exiting the context
-    #     self.assertEqual(self.file_handler.job_dir, original_job_dir)
+    def test_subdir_context_normal(self):
+        job_dirname = "test_job"
+        self.file_handler.use_job_info(job_dirname)
+        self.file_handler.create_job_dir()
+
+        subdirname = "subdir"
+        original_job_dir = self.file_handler.job_dir
+
+        with self.file_handler.subdir_context(subdirname):
+            self.assertEqual(self.file_handler.job_dir, os.path.join(original_job_dir, subdirname))
+            self.assertTrue(os.path.isdir(self.file_handler.job_dir))
+            
+            # test writing file in subdir
+            self.file_handler.write_pure_file("test_file.txt", "Hello from subdir")
+            self.assertTrue(os.path.exists(os.path.join(self.file_handler.job_dir, "test_file.txt")))
+
+        #verify the job_dir is restored to its original value after exiting the context
+        self.assertEqual(self.file_handler.job_dir, original_job_dir)
tests/workflows/conftest.py (6)

13-14: Consider translating docstring to English for consistency

The docstring is in Chinese while the rest of the code is in English. For international collaboration and consistency, consider translating the docstring to English.

-    """创建临时工作目录."""
+    """Create temporary working directory for testing."""

19-20: Consider translating docstring to English for consistency

Similar to the previous docstring, consider translating this one to English as well.

-    """模拟工作流服务."""
+    """Mock workflow service."""

28-29: Consider translating docstring to English for consistency

The docstring should be in English for consistency.

-    """NPT模拟的示例设置."""
+    """Sample settings for NPT simulation."""

50-51: Consider translating docstring to English for consistency

The docstring should be in English for consistency.

-    """HTI模拟的示例设置."""
+    """Sample settings for HTI simulation."""

72-73: Consider translating docstring to English for consistency

The docstring should be in English for consistency.

-    """模拟Prefect上下文."""
+    """Mock Prefect context."""

83-84: Consider translating docstring to English for consistency

The docstring should be in English for consistency.

-    """示例工作流结果."""
+    """Sample workflow results."""
dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_long_liquid_1200K_10GPa_path_t_settings.yaml (1)

5-5: Consider using absolute paths or environment variables for file paths.

The configuration uses relative paths like 'liquid128.lmp' which might cause issues if the working directory changes. Consider using absolute paths or environment variables to make the configuration more robust.

dpti/workflows/flows/base_flow.py (2)

95-95: Translate non-English comments to English for better maintainability.

The codebase primarily uses English, but there are some non-English comments. For consistency and to improve maintainability for all contributors, consider translating these comments to English.

- """需要满足的最小接口要求."""
+ """Minimum interface requirements that need to be satisfied."""

- """执行工作流."""
+ """Execute the workflow."""

Also applies to: 159-160


36-68: Consider removing commented-out code blocks.

There are large blocks of commented-out code that might confuse maintainers. Consider removing them if they're no longer needed, or add explanatory comments if they're kept for reference.

tests/workflows/flows/test_flow_instance_method.py (3)

13-16: Translate method docstring to English for consistency.

The docstring "执行问候流程" should be translated to English to maintain consistency throughout the codebase.

- """执行问候流程."""
+ """Execute greeting workflow."""

18-22: Translate method docstring to English for consistency.

The docstring "实际处理问候消息的内部方法" should be translated to English to maintain consistency throughout the codebase.

- """实际处理问候消息的内部方法."""
+ """Internal method that processes the greeting message."""

40-41: Translate docstring to English for consistency.

The docstring "创建工作流实例并执行指定方法" should be translated to English to maintain consistency throughout the codebase.

- """创建工作流实例并执行指定方法."""
+ """Create workflow instance and execute the specified method."""
tests/workflows/flows/test_prefect_decorator.py (3)

7-7: Fix docstring syntax and translate to English.

The docstring contains a fullwidth comma (,) and is not in English. Use standard ASCII punctuation and translate to English for consistency.

- """通用的Prefect装饰器,可以装饰类或函数为task."""
+ """Generic Prefect decorator that can decorate classes or functions as tasks."""
🧰 Tools
🪛 Ruff (0.8.2)

7-7: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


97-98: Translate print statements to English for consistency.

Several print statements use non-English text. For consistency throughout the codebase, consider translating them to English.

- print("准备数据")
+ print("Preparing data")

- print("设置处理环境")
+ print("Setting up processing environment")

- print("执行数据处理")
+ print("Executing data processing")

- print("提取处理结果")
+ print("Extracting processing results")

Also applies to: 113-114, 117-119, 122-124


101-107: Validate task configurations against allowed Prefect parameters.

The decorator specifies task configurations for the class methods, but there's no validation that these configurations are valid Prefect task parameters. Consider adding validation to prevent runtime errors.
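
A minimal sketch of such a check, assuming the installed Prefect 2.x exposes its accepted options as keyword parameters of the `task` decorator (the helper name is hypothetical):

```python
import inspect

from prefect import task


def validate_task_config(config: dict) -> None:
    # Compare config keys against the keyword parameters accepted by
    # prefect's @task decorator and fail fast on anything unknown.
    allowed = set(inspect.signature(task).parameters) - {"__fn"}
    unknown = set(config) - allowed
    if unknown:
        raise ValueError(f"Unknown Prefect task option(s): {sorted(unknown)}")


validate_task_config({"retries": 2, "log_prints": True})   # passes
# validate_task_config({"retrys": 2})                      # raises ValueError
```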

dpti/workflows/analyze/compare_ti_plot.py (2)

145-164: Replace prints with logging for better maintenance.

Multiple print statements can complicate production debugging and concurrency. Using a logging library allows configurable verbosity and better integration with external monitoring.
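
A sketch of the direction (the logger name and message are illustrative, not the module's actual output):

```python
import logging

# Module-level logger; verbosity is configured once by the application,
# e.g. logging.basicConfig(level=logging.INFO).
logger = logging.getLogger(__name__)


def report_loaded(paths: list) -> None:
    # Replaces a bare print(...); can be silenced or redirected centrally.
    logger.info("Loaded %d TI result files", len(paths))
```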


335-515: Minor style improvement: replace fullwidth parentheses in comments/docstrings.

Some docstrings or comments contain fullwidth punctuation (e.g., fullwidth parentheses ( and )). To maintain consistency, consider using ASCII parentheses to avoid ambiguity across environments.

-# 找到自由能差为0的温度点(相变点)
+# 找到自由能差为0的温度点(相变点)
🧰 Tools
🪛 Ruff (0.8.2)

336-336: Use list instead of List for type annotation

Replace with list

(UP006)


336-336: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


337-337: Use list instead of List for type annotation

Replace with list

(UP006)


337-337: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


340-340: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


399-399: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


399-399: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


434-434: Docstring contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


434-434: Docstring contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


465-465: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)

dpti/workflows/simulations/ti_sim.py (2)

170-180: Check concurrency implications for the class-level _instances usage in TIIntegraionPath.

Although concurrency is not explicitly shown here, global or class-level caches can become a bottleneck or race condition in parallel contexts. If running in multi-threaded or multi-process mode, ensure a thread-safe or process-safe approach.
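
One way to make the cache thread-safe; a sketch that keeps the reviewed class name and `_instances` attribute but invents the accessor and no-argument constructor for illustration:

```python
import threading
from typing import ClassVar


class TIIntegraionPath:
    # ClassVar annotations also address the RUF012 hint below.
    _instances: ClassVar[dict[str, "TIIntegraionPath"]] = {}
    _instances_lock: ClassVar[threading.Lock] = threading.Lock()

    @classmethod
    def get_instance(cls, key: str) -> "TIIntegraionPath":
        # Serialize cache access so concurrent callers cannot race
        # on the check-then-insert sequence.
        with cls._instances_lock:
            if key not in cls._instances:
                cls._instances[key] = cls()
            return cls._instances[key]
```

Note that a lock only protects threads; multi-process runs would need a file lock or an external store instead.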

🧰 Tools
🪛 Ruff (0.8.2)

180-180: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


195-224: Consider using structured logging instead of print.

The _prepare method logs progress using print statements. Structured logging (e.g., logging module) is recommended for better traceability and centralized log management.

🧰 Tools
🪛 Ruff (0.8.2)

195-195: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/simulations/hti_sim.py (1)

1069-1133: Proactively handle unsupported steps for LiquidHTI.

The raise ValueError calls appear for "one-step" or "two-step". If new steps are introduced in the future, consider a more flexible approach or a fallback for unknown steps to guide usage.
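
A registry-style dispatch is one such approach; a sketch using the step labels quoted in this review (the mapped stage lists are illustrative):

```python
STEP_SEQUENCES: dict[str, list[str]] = {
    # Step labels follow this review; the stage lists are illustrative.
    "one-step": ["all_in_one"],
    "two-step": ["deep_on", "spring_off"],
}


def resolve_steps(step: str) -> list[str]:
    try:
        return STEP_SEQUENCES[step]
    except KeyError:
        # Unknown steps get a message listing the valid options,
        # guiding users instead of a bare ValueError.
        raise ValueError(
            f"Unsupported step {step!r}; expected one of {sorted(STEP_SEQUENCES)}"
        ) from None
```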

🧰 Tools
🪛 Ruff (0.8.2)

1088-1088: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


1099-1099: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


1111-1111: Use list instead of List for type annotation

Replace with list

(UP006)

dpti/workflows/simulations/equi_sim.py (5)

28-49: Consider validation for ens and meam_model.

While ens is declared as a string and meam_model as a dictionary, you might protect against invalid simulation ensembles or malformed MEAM parameters. For instance, validating that ens is one of ["npt", "nvt"] or checking the structure of meam_model helps prevent runtime errors or subtle simulation misconfigurations.
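
A sketch with a pydantic v2 validator; the class shown is a hypothetical stand-in for the real settings model, with only the two fields under discussion:

```python
from typing import Any, Optional

from pydantic import BaseModel, field_validator


class EquiSettingsSketch(BaseModel):
    # Hypothetical stand-in for the real settings class.
    ens: str
    meam_model: Optional[dict[str, Any]] = None

    @field_validator("ens")
    @classmethod
    def check_ens(cls, v: str) -> str:
        allowed = {"npt", "nvt"}
        if v not in allowed:
            raise ValueError(f"ens must be one of {sorted(allowed)}, got {v!r}")
        return v
```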


79-86: Make if_water more descriptive or restrict its usage.

if_water is a boolean flag that potentially changes behavior during post-processing. You may consider a more descriptive name (e.g., simulate_water_phase) or additional documentation clarifying all logic that depends on this flag, reducing ambiguity for maintainers.


137-163: Check for errors when generating LAMMPS input.

If equi.gen_equi_lammps_input fails, the rest of this method silently continues. Consider adding exception handling or checks to ensure that lmp_str is not None or an empty string, so the pipeline doesn’t produce invalid input.

 try:
     lmp_str = equi.gen_equi_lammps_input(
         **lammps_input_kwargs_data.model_dump()
     )
+    if not lmp_str:
+        raise ValueError("Generated LAMMPS input is empty or invalid.")
 except Exception as e:
     raise RuntimeError(f"Failed to generate LAMMPS input: {str(e)}")
🧰 Tools
🪛 Ruff (0.8.2)

137-137: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


254-257: Reevaluate usage of if_water.

Similarly to NPTEquiSimulationSettings, if_water might benefit from either a more descriptive name or additional validation.


347-366: Align _extract with consistent error checks.

Like in NPTEquiSimulation._extract, consider verifying that info is non-empty to protect downstream usage. This ensures consistent reliability across NPT and NVT pipelines.

🧰 Tools
🪛 Ruff (0.8.2)

347-347: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/analyze/build_gibbs_surface.py (5)

80-104: Confirm file patterns are correct for multi-file loads.

load_multiple_files uses glob for pattern matching. If the user provides an incorrect pattern or the directory is missing, it returns an empty list. Consider warning the user or raising an exception if no files match the pattern.
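
A sketch of the fail-fast variant (the function name mirrors the review; the return shape is simplified to the matched paths):

```python
import glob


def load_multiple_files(pattern: str) -> list[str]:
    # Raise instead of silently returning an empty list when the
    # pattern matches nothing (e.g. wrong pattern or missing directory).
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"No files matched pattern {pattern!r}")
    return paths
```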

🧰 Tools
🪛 Ruff (0.8.2)

82-82: Use list instead of List for type annotation

Replace with list

(UP006)


82-82: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


155-210: Consider verifying the size of data_points.

When plotting in 3D (plot_surface), if data_points is empty, a cryptic error might arise. You may gracefully handle or warn the user if no data points exist.


268-419: Implement a threshold for numerical stability beyond 0.0001.

When identifying phase boundaries with np.abs(fe_diff) < 0.0001, consider an argument or setting to define this threshold. Hard-coding it can cause confusion or produce false boundaries if the free-energy scale changes.
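
A sketch of exposing the tolerance (names are illustrative):

```python
import numpy as np


def phase_boundary_mask(fe_diff: np.ndarray, tol: float = 1e-4) -> np.ndarray:
    # The previously hard-coded 0.0001 becomes a documented default
    # that callers can scale with their free-energy units.
    return np.abs(fe_diff) < tol
```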

🧰 Tools
🪛 Ruff (0.8.2)

272-272: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


273-273: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


528-575: Document the format of data points saved to text files.

save_data_points writes a structured layout. Consider referencing a stable column order and expected units in docstrings or code comments. This ensures that downstream users interpret the text file correctly.

🧰 Tools
🪛 Ruff (0.8.2)

533-533: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


540-540: String contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF001)


557-557: Comment contains ambiguous (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


557-557: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


558-558: Comment contains ambiguous (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


558-558: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


559-559: Comment contains ambiguous (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


559-559: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


823-829: Review performance costs of high-resolution interpolation.

Using resolution=150 with cubic interpolation can be expensive for large datasets. Evaluate whether a user-provided limit or adaptive resolution is needed to prevent potential performance bottlenecks.
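
A sketch of a capped, user-controllable resolution using SciPy's griddata (the function is a simplified stand-in for the plotting code):

```python
import numpy as np
from scipy.interpolate import griddata


def interpolate_surface(points, values, resolution=150, max_resolution=300):
    # points: (N, 2) array of (T, P); values: (N,) free energies.
    # Cap the grid density so cubic interpolation stays tractable.
    res = min(resolution, max_resolution)
    t = np.linspace(points[:, 0].min(), points[:, 0].max(), res)
    p = np.linspace(points[:, 1].min(), points[:, 1].max(), res)
    grid_t, grid_p = np.meshgrid(t, p)
    return griddata(points, values, (grid_t, grid_p), method="cubic")
```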

dpti/workflows/service/workflow_service_module.py (4)

23-25: Clarify if WorkflowServices is always injected.

WorkflowServices(Protocol) sets io_handler as a required attribute. If other code references additional attributes (e.g., job_executor), consider an extended protocol or ensure usage only with specialized classes such as BasicWorkflowServices.


32-37: Confirm that BasicWorkflowServices is used exclusively where job_executor and report_generator are needed.

Given the separation between IOWorkflowServices and BasicWorkflowServices, ensure code doesn’t inadvertently rely on a partial service. This helps maintain a clear boundary between “IO only” flows vs. those requiring job execution.

Do you want to open a new issue to track possible misuses of partial or incomplete service definitions?


39-42: Evaluate exceptions from train in RayExecutor.

If the train method fails, confirm how the exception is communicated back to the caller. You may want standardized error handling or logging for re-entrant workflows.


71-78: Check for potential concurrency issues with file handling.

LocalFileHandler is returned directly here. If multiple threads or processes are using the same handler, confirm approach for locking or concurrency control.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70a4990 and 0dac766.

📒 Files selected for processing (25)
  • .cursor/rules/dpti-default-ai-rules.mdc (1 hunks)
  • dpti/workflows/analyze/build_gibbs_surface.py (1 hunks)
  • dpti/workflows/analyze/compare_ti_plot.py (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/beta144xy.lmp (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/liquid128.lmp (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_beta144xy.yaml (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_long_liquid_1200K_10GPa_path_t_settings.yaml (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/wget_graph.pb.sh (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ice04.lmp (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/liq_water.lmp (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ti_ice04_150K_10000bar_settings.yaml (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/wget_graph.pb.sh (1 hunks)
  • dpti/workflows/flows/base_flow.py (1 hunks)
  • dpti/workflows/service/workflow_service_module.py (1 hunks)
  • dpti/workflows/simulations/equi_sim.py (1 hunks)
  • dpti/workflows/simulations/hti_sim.py (1 hunks)
  • dpti/workflows/simulations/ti_sim.py (1 hunks)
  • tests/run_test.sh (1 hunks)
  • tests/workflows/conftest.py (1 hunks)
  • tests/workflows/flows/test_basemodel.py (1 hunks)
  • tests/workflows/flows/test_flow_instance_method.py (1 hunks)
  • tests/workflows/flows/test_flowclass.py (1 hunks)
  • tests/workflows/flows/test_prefect_decorator.py (1 hunks)
  • tests/workflows/service/test_file_handler.py (1 hunks)
  • tests/workflows/simulations/test_hti_sim.py (1 hunks)
✅ Files skipped from review due to trivial changes (4)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/liquid128.lmp
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ice04.lmp
  • dpti/workflows/examples/Sn_SCAN_high_pressure/beta144xy.lmp
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/liq_water.lmp
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/workflows/flows/test_flowclass.py

2-2: typing.Dict is deprecated, use dict instead

(UP035)


11-11: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


19-19: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


29-29: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


39-39: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

tests/workflows/flows/test_prefect_decorator.py

7-7: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)

dpti/workflows/service/workflow_service_module.py

16-16: Redefinition of unused IOHandler from line 9

(F811)


16-16: Redefinition of unused LocalFileHandler from line 9

(F811)


17-17: Redefinition of unused DpdispatcherExecutor from line 10

(F811)


17-17: Redefinition of unused JobExecutor from line 10

(F811)


18-18: Redefinition of unused PrefectReportGenerator from line 12

(F811)


18-18: Redefinition of unused ReportGenerator from line 13

(F811)

dpti/workflows/simulations/ti_sim.py

8-15: typing.Dict is deprecated, use dict instead

(UP035)


8-15: typing.List is deprecated, use list instead

(UP035)


8-15: typing.Type is deprecated, use type instead

(UP035)


38-38: Redefinition of unused FreeEnergyValuePoint from line 29

(F811)


39-39: Redefinition of unused SettingsBase from line 30

(F811)


40-40: Redefinition of unused SimulationBase from line 31

(F811)


41-41: Redefinition of unused transfer_matching_fields from line 32

(F811)


43-43: Redefinition of unused ArtifactInfo from line 34

(F811)


43-43: Redefinition of unused ResultDataBase from line 34

(F811)


54-54: Use list instead of List for type annotation

Replace with list

(UP006)


58-58: Use list instead of List for type annotation

Replace with list

(UP006)


65-65: Use list instead of List for type annotation

Replace with list

(UP006)


66-66: Use list instead of List for type annotation

Replace with list

(UP006)


77-77: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


98-98: Use list instead of List for type annotation

Replace with list

(UP006)


99-99: Use list instead of List for type annotation

Replace with list

(UP006)


100-100: Use list instead of List for type annotation

Replace with list

(UP006)


101-101: Use list instead of List for type annotation

Replace with list

(UP006)


102-102: Use list instead of List for type annotation

Replace with list

(UP006)


140-140: Use list instead of List for type annotation

Replace with list

(UP006)


141-141: Use list instead of List for type annotation

Replace with list

(UP006)


142-142: Use list instead of List for type annotation

Replace with list

(UP006)


143-143: Use list instead of List for type annotation

Replace with list

(UP006)


146-146: Use list instead of List for type annotation

Replace with list

(UP006)


180-180: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


181-181: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


182-182: Use type instead of Type for type annotation

Replace with type

(UP006)


186-186: Use type instead of Type for type annotation

Replace with type

(UP006)


195-195: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


429-429: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


459-459: Use type instead of Type for type annotation

Replace with type

(UP006)


493-493: Use list instead of List for type annotation

Replace with list

(UP006)


506-506: Use list instead of List for type annotation

Replace with list

(UP006)


510-510: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


525-525: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/analyze/build_gibbs_surface.py

6-6: typing.Dict is deprecated, use dict instead

(UP035)


6-6: typing.List is deprecated, use list instead

(UP035)


6-6: typing.Tuple is deprecated, use tuple instead

(UP035)


26-26: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


82-82: Use list instead of List for type annotation

Replace with list

(UP006)


82-82: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


107-107: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


108-108: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


110-110: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


272-272: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


273-273: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


450-450: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


451-451: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


456-456: Docstring contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


456-456: Docstring contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


533-533: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


540-540: String contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF001)


557-557: Comment contains ambiguous (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


557-557: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


558-558: Comment contains ambiguous (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


558-558: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


559-559: Comment contains ambiguous (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


559-559: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


760-760: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

dpti/workflows/analyze/compare_ti_plot.py

117-117: Do not use bare except

(E722)


332-332: typing.Dict is deprecated, use dict instead

(UP035)


332-332: typing.List is deprecated, use list instead

(UP035)


336-336: Use list instead of List for type annotation

Replace with list

(UP006)


336-336: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


337-337: Use list instead of List for type annotation

Replace with list

(UP006)


337-337: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


340-340: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


399-399: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


399-399: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


434-434: Docstring contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


434-434: Docstring contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


465-465: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)

tests/workflows/flows/test_basemodel.py

4-4: typing.Dict is deprecated, use dict instead

(UP035)


17-17: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


34-34: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


35-35: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


57-57: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


66-66: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


67-67: String contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF001)


72-72: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

dpti/workflows/simulations/equi_sim.py

2-2: typing.Dict is deprecated, use dict instead

(UP035)


2-2: typing.Type is deprecated, use type instead

(UP035)


20-20: Redefinition of unused SettingsBase from line 16

(F811)


20-20: Redefinition of unused SimulationBase from line 16

(F811)


20-20: Redefinition of unused workflow_task from line 16

(F811)


88-88: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


89-89: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


105-105: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


110-110: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


112-112: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


115-115: Use type instead of Type for type annotation

Replace with type

(UP006)


137-137: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


174-174: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


192-192: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


268-268: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


276-276: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


280-280: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


282-282: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


284-284: Use type instead of Type for type annotation

Replace with type

(UP006)


296-296: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


347-347: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/simulations/hti_sim.py

10-18: typing.Dict is deprecated, use dict instead

(UP035)


10-18: typing.List is deprecated, use list instead

(UP035)


10-18: typing.Type is deprecated, use type instead

(UP035)


37-37: Redefinition of unused FreeEnergyValuePoint from line 30

(F811)


38-38: Redefinition of unused SimulationBase from line 31

(F811)


50-50: Use list instead of List for type annotation

Replace with list

(UP006)


53-53: Use list instead of List for type annotation

Replace with list

(UP006)


59-59: Use list instead of List for type annotation

Replace with list

(UP006)


60-60: Use list instead of List for type annotation

Replace with list

(UP006)


68-68: Use list instead of List for type annotation

Replace with list

(UP006)


87-87: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


118-118: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


139-139: Use list instead of List for type annotation

Replace with list

(UP006)


146-146: Use list instead of List for type annotation

Replace with list

(UP006)


157-157: Use list instead of List for type annotation

Replace with list

(UP006)


189-189: Use list instead of List for type annotation

Replace with list

(UP006)


384-384: Use list instead of List for type annotation

Replace with list

(UP006)


416-416: Use list instead of List for type annotation

Replace with list

(UP006)


452-452: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


463-463: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


468-468: Use list instead of List for type annotation

Replace with list

(UP006)


490-490: Use type instead of Type for type annotation

Replace with type

(UP006)


493-493: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


493-493: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


507-507: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


603-603: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


651-651: Use list instead of List for type annotation

Replace with list

(UP006)


651-651: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


772-772: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


784-784: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


793-793: Use list instead of List for type annotation

Replace with list

(UP006)


855-855: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


867-867: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


884-884: Use list instead of List for type annotation

Replace with list

(UP006)


918-918: Use list instead of List for type annotation

Replace with list

(UP006)


934-934: Use list instead of List for type annotation

Replace with list

(UP006)


958-958: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


970-970: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


987-987: Use list instead of List for type annotation

Replace with list

(UP006)


1041-1041: Use list instead of List for type annotation

Replace with list

(UP006)


1060-1060: Use list instead of List for type annotation

Replace with list

(UP006)


1088-1088: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


1099-1099: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


1111-1111: Use list instead of List for type annotation

Replace with list

(UP006)


1174-1174: Use list instead of List for type annotation

Replace with list

(UP006)


1259-1259: Use list instead of List for type annotation

Replace with list

(UP006)

tests/workflows/service/test_file_handler.py

66-66: SyntaxError: Expected a statement


66-66: SyntaxError: Expected a statement


66-66: SyntaxError: Expected a statement


66-66: SyntaxError: Expected a statement


67-67: SyntaxError: Unexpected indentation


73-73: SyntaxError: Expected a statement


73-73: SyntaxError: Expected a statement


73-73: SyntaxError: Expected a statement


73-73: SyntaxError: Expected a statement


73-74: SyntaxError: Expected a statement


74-74: SyntaxError: Unexpected indentation


80-80: SyntaxError: Expected a statement


80-80: SyntaxError: Expected a statement


80-80: SyntaxError: Expected a statement


80-80: SyntaxError: Expected a statement


82-82: SyntaxError: Unexpected indentation


138-138: SyntaxError: Expected a statement


138-138: SyntaxError: Expected a statement


138-138: SyntaxError: Expected a statement


138-138: SyntaxError: Expected a statement


139-139: SyntaxError: Unexpected indentation


152-152: SyntaxError: Expected a statement


152-152: SyntaxError: Expected a statement


152-152: SyntaxError: Expected a statement


152-152: SyntaxError: Expected a statement


152-153: SyntaxError: Expected a statement


153-153: SyntaxError: Unexpected indentation


170-170: SyntaxError: Expected a statement


170-170: SyntaxError: Expected a statement


170-170: SyntaxError: Expected a statement


170-170: SyntaxError: Expected a statement


172-172: SyntaxError: Unexpected indentation


210-210: SyntaxError: Expected a statement

🪛 Shellcheck (0.10.0)
dpti/workflows/examples/Sn_SCAN_high_pressure/wget_graph.pb.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

dpti/workflows/examples/Water_Ice_Phase_Diagram/wget_graph.pb.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

tests/run_test.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

🔇 Additional comments (27)
dpti/workflows/examples/Water_Ice_Phase_Diagram/ti_ice04_150K_10000bar_settings.yaml (1)

29-45: Review NVT Simulation Settings

The NVT simulation settings appear adequate. Just verify that the referenced equilibrium configuration file (npt_avg.lmp) exists and is generated as part of the workflow.

tests/workflows/flows/test_basemodel.py (1)

26-31: LGTM! Clean implementation of computed field.

The temp_str computed field using cached_property is implemented well and efficiently formats the temperature value.

dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_beta144xy.yaml (2)

1-96: LGTM - Well-structured configuration file

The overall structure of this configuration file is well-organized, with clear separation between different simulation types and logical parameter grouping. This will make it easy for users to understand and modify the simulation parameters.


48-65:

❓ Verification inconclusive

Verify the lambda step sizes in HTI simulation settings.

The lambda step sizes in the HTI simulation settings should be carefully reviewed to ensure they provide sufficient sampling across phase space, especially in critical regions where the free energy changes rapidly.


🌐 Web query:

What are best practices for lambda step sizes in Hamiltonian thermodynamic integration?

💡 Result:

When determining optimal lambda (λ) step sizes for Hamiltonian thermodynamic integration (TI), key considerations include balancing computational efficiency with integration accuracy, particularly in regions where the free-energy gradient (⟨∂H/∂λ⟩) varies nonlinearly. Below are evidence-based best practices:

1. Adaptive λ Spacing Based on Curvature

  • High-curvature regions (e.g., where ⟨∂H/∂λ⟩ changes rapidly) require denser λ spacing to minimize quadrature errors[3][7][11]. For example:
    • Lennard-Jones (LJ) transformations (e.g., van der Waals interactions) often need smaller steps than electrostatic changes due to stronger nonlinearities[11][19].
    • Soft-core potentials (e.g., with α = 0.5 for LJ and β = 0.5 nm² for Coulomb interactions) smooth energy discontinuities, allowing larger steps in regions near λ = 0/1[1][7][8].

2. Integration Methods and Step Count

  • Trapezoidal rule is robust for numerical integration due to inherent statistical uncertainties, favoring moderate step counts (e.g., 10–20 λ windows)[9][11].
  • Extended TI predicts ⟨∂H/∂λ⟩ at nonsimulated λ points, reducing required simulations by ~50% while maintaining accuracy[3][7].
  • For multi-state systems (e.g., ligand binding), use Hamiltonian replica exchange (HREMD) to enhance phase-space overlap between λ windows, enabling sparser spacing[6][18].

3. System-Specific Recommendations

| System Type | λ-Step Strategy |
| --- | --- |
| Hydration free energy | 10–20 equidistant steps with soft-core potentials[7][19]. |
| Ligand binding | 20–30 steps, denser near λ=0/1 for LJ interactions; use replica exchange[6][12]. |
| Solid-state transitions | Avoid phase transitions; use intermediate λ states for structural stability[17]. |

4. Validation and Optimization

  • Curvature analysis: Compute the second derivative ∂²H/∂λ² to identify regions needing refinement[3][7].
  • Overlap metrics: Ensure adjacent λ windows have >10% phase-space overlap (via energy histogram overlap)[6][11].
  • Error estimation: Compare results from trapezoidal vs. Simpson’s rule; discrepancies >1 kBT indicate undersampling[9][11].

5. Enhanced Sampling Techniques

  • Smoothstep parameterization (e.g., S2(λ) = 6λ⁵ – 15λ⁴ + 10λ³) reduces curvature by smoothing λ-dependent transitions[8][19].
  • Alchemical pathways: Optimize λ-dependent Hamiltonians to minimize variance in ⟨∂H/∂λ⟩, enabling larger steps[7][12].

Practical Workflow

  1. Initial scan: Run coarse λ simulations (e.g., Δλ = 0.1) to identify high-curvature regions.
  2. Refinement: Add intermediate λ points (e.g., Δλ = 0.02–0.05) in problematic areas[3][11].
  3. Production: Use HREMD or MBAR to combine data across λ windows for final integration[6][7].

By combining adaptive spacing, enhanced sampling, and system-specific adjustments, TI calculations achieve reliable free-energy estimates with minimized computational cost[1][3][7].


Action: Verify and Validate Lambda Step Sizes in HTI Simulation Settings

Based on recent best practices for Hamiltonian thermodynamic integration, please review the current lambda spacing to ensure sufficient sampling in regions with rapidly changing free-energy gradients. Consider the following:

  • Adaptive Spacing:
    Confirm that each lambda sequence (for lambda_lj_on, lambda_deep_on, and lambda_spring_off) uses a denser window in high-curvature regions. For instance, while the early sections (e.g., ranges starting at 0.000) are finely spaced, verify that the larger steps later (such as the 0.200–1.000 segment in lambda_lj_on) do not miss critical nonlinear variations.

  • Integration Accuracy:
    Validate that the step sizes provide adequate phase-space overlap (ideally ≥10%) and consider performing a curvature or error analysis (e.g., comparing trapezoidal vs. Simpson’s rule) to assess the integration’s sensitivity.

  • System-Specific Considerations:
    Ensure that these choices align with your simulation objectives. For example, if using LJ transformations where the free-energy gradient may vary quickly, assess whether finer λ intervals are necessary throughout the entire range.

Please adjust or confirm the lambda settings accordingly to achieve robust free-energy estimates.
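
For reference, the trapezoidal-rule integration mentioned above is a one-liner in NumPy; a sketch with placeholder data (real ⟨∂H/∂λ⟩ averages come from the simulations):

```python
import numpy as np

# Lambda windows, denser near 0 where the gradient changes fastest.
lam = np.array([0.0, 0.01, 0.02, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0])
# Placeholder <dH/dlambda> averages at each window.
dh_dlam = np.array([5.2, 4.8, 4.1, 3.0, 2.2, 1.5, 0.9, 0.5, 0.2, 0.1])

delta_F = np.trapz(dh_dlam, lam)  # free-energy difference estimate
print(f"Delta F = {delta_F:.3f}")
```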

tests/workflows/simulations/test_hti_sim.py (1)

1-130: Test suite looks well structured and comprehensive

The test suite for the HTISimulation class follows good testing practices:

  • Proper use of mock objects to isolate the system under test
  • Well-defined setup method to prepare test fixtures
  • Clear test methods that focus on specific functionality
  • Good coverage of different aspects of the HTISimulation class
tests/workflows/conftest.py (1)

94-101: Good implementation of mock decorators

The mock decorator implementations for task and flow are clean and effective. This approach allows for easier testing of code that uses Prefect decorators without needing the actual Prefect runtime.

dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_long_liquid_1200K_10GPa_path_t_settings.yaml (4)

1-13: LGTM! Thermo input section is well structured.

The thermo input section provides clear parameters for the simulation, with appropriate type handling and logical grouping of related settings.


2-2: Verify the temperature consistency between target_temp and temp_seq.

Similar to the pressure settings, there's a discrepancy between the initial target temperature (1200K) and the temperature sequence for TI simulation (200:1200:20). Confirm that this temperature range is expected for this specific simulation.

Also applies to: 96-96


46-84: Lambda sequences for HTI simulation look well designed.

The lambda sequences for soft-on, deep-on, and soft-off transitions use appropriate non-linear spacing with finer resolution in critical regions, which should lead to better numerical accuracy in the thermodynamic integration.


3-3:

❓ Verification inconclusive

Verify the pressure consistency between target_pres and pres_seq.

There appears to be a discrepancy between the initial target pressure (0) and the pressure sequence used in TI simulation (0:100000:1000). This might be intentional for the TI path, but it's worth confirming that this large pressure range is expected.

Also applies to: 97-97


🏁 Script executed:

#!/bin/bash
# Check if there are other similar configuration files with pressure sequences
grep -r "target_pres\|pres_seq" --include="*.yaml" dpti/workflows/examples/

Length of output: 759


Action: Verify pressure configuration consistency across TI simulation settings

Please confirm that setting target_pres: 0 in this file is intentional. In this configuration, the pressure sequence is defined as ["0:100000:1000", "100000"], meaning the simulation starts at 0 Pa and ramps up, whereas a similar TI configuration in ti_line_beta144xy.yaml uses target_pres: 50000 with a slightly different sequence. Also, note that the issue appears again at line 97. Verify that the large pressure range and differing target pressure values are expected for the distinct simulation paths.

dpti/workflows/flows/base_flow.py (2)

103-147: LGTM! The flow_start_check method is well-implemented.

The method properly checks if the flow trigger directory exists and sets up the flow running directory using appropriate error handling.


163-166: The _execute_impl abstract method ensures proper workflow implementation.

Using an abstract method here enforces that subclasses must implement their own execution logic, which is good design for a base class.

tests/workflows/flows/test_flow_instance_method.py (1)

25-46: Well-designed workflow function creation utility.

The create_workflow_flow function is a good abstraction for creating flow functions from workflow class methods. It properly handles arguments and preserves the method functionality.

tests/workflows/flows/test_prefect_decorator.py (2)

49-63: Good implementation of class method decoration with proper closure handling.

The implementation correctly handles the decoration of class methods, using closures to avoid variable scope issues. This approach ensures that each method gets its own separate configuration.


127-136: Well-structured workflow definition.

The workflow function properly orchestrates the execution of tasks, leveraging both the decorated function and class methods in a clear, sequential manner.

dpti/workflows/analyze/compare_ti_plot.py (1)

186-280: Potential memory concern with multiple JSON loads.

The function compare_multiple_runs loads multiple JSON files into memory. While likely acceptable here, ensure that large or numerous files won't exhaust available memory as usage scales.
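One way to bound memory is to parse one file at a time with a generator (a sketch; compare_multiple_runs itself may still need all results resident at once for plotting):

import json
from collections.abc import Iterator

def iter_runs(paths: list[str]) -> Iterator[tuple[str, dict]]:
    # Yield one parsed result at a time so only a single file is in memory.
    for path in paths:
        with open(path) as f:
            yield path, json.load(f)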

dpti/workflows/simulations/ti_sim.py (2)

276-292: Ensure asynchronous tasks are used consistently.

The _run method uses asyncio.get_event_loop() but the rest of the code does not show further concurrency usage. Verify that the event loop does not conflict with other asynchronous tasks in your system.
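If the surrounding call sites are synchronous, asyncio.run is usually a safer entry point than get_event_loop (a sketch; _submit_jobs is a hypothetical stand-in for the awaited work):

import asyncio

async def _submit_jobs() -> str:
    await asyncio.sleep(0)  # stand-in for the actual awaited submission
    return "done"

# asyncio.run creates and closes its own event loop, so it cannot clash with
# a loop owned by another framework; asyncio.get_event_loop() is deprecated
# outside a running loop since Python 3.10.
result = asyncio.run(_submit_jobs())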


294-385: Return type mismatches can cause confusion.

Currently, _extract returns TISimulationResultData for a normal flow but also references a fallback ti_info dictionary. Confirm that your return types are consistent or update docstrings accordingly so consumers know exactly what to expect.

dpti/workflows/simulations/hti_sim.py (3)

476-594: Validate Python version compatibility for match statements.

The match blocks in your code require Python 3.10+. Confirm that your runtime environment meets this requirement, or use alternative control flows if broader compatibility is desired.

🧰 Tools
🪛 Ruff (0.8.2)

490-490: Use type instead of Type for type annotation

Replace with type

(UP006)


493-493: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


493-493: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


507-507: Use dict instead of Dict for type annotation

Replace with dict

(UP006)
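If compatibility below Python 3.10 is ever needed, the same dispatch a match block performs can be expressed with a plain mapping. A sketch (the step names follow this file's conventions, the numeric factors are invented placeholders):

def step_factor(step: str) -> float:
    # match statements require Python 3.10+
    match step:
        case "lj_on":
            return 1.0
        case "deep_on":
            return 0.5
        case _:
            raise ValueError(f"unknown step: {step}")

def step_factor_compat(step: str) -> float:
    # Equivalent dispatch that also runs on Python < 3.10.
    table = {"lj_on": 1.0, "deep_on": 0.5}
    try:
        return table[step]
    except KeyError:
        raise ValueError(f"unknown step: {step}") from None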


753-830: Consistent naming of integration steps.

Your steps are labeled "lj_on", "deep_on", "spring_off". Validate that these step patterns align with any external scripts or naming conventions to avoid confusion.

🧰 Tools
🪛 Ruff (0.8.2)

772-772: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


784-784: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


793-793: Use list instead of List for type annotation

Replace with list

(UP006)


836-911: Ensure ice-specific logic in IceHTI is tested thoroughly.

The handle_compute call in IceHTI references disorder_corr and partial_disorder. Make sure tests cover these fields, especially for advanced or corner cases.

🧰 Tools
🪛 Ruff (0.8.2)

855-855: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


867-867: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


884-884: Use list instead of List for type annotation

Replace with list

(UP006)

dpti/workflows/simulations/equi_sim.py (3)

100-107: Ensure initialization covers all required services.

NPTEquiSimulation depends on BasicWorkflowServices set later via _init_workflow_services_from_injector. Consider verifying that workflow_services is bound before calling any methods that rely on it, especially if future refactors might invoke _run or _extract before _prepare.

Are you certain all calls to methods that need workflow_services only occur after _init_workflow_services_from_injector? Double-check any external usage.

🧰 Tools
🪛 Ruff (0.8.2)

105-105: Use dict instead of Dict for type annotation

Replace with dict

(UP006)
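A lightweight guard can make a premature call fail loudly instead of silently (a sketch; the attribute and method names mirror the comment above but are otherwise illustrative):

class ServicesGuardMixin:
    _workflow_services = None

    @property
    def workflow_services(self):
        # Raise immediately if services are accessed before injection.
        if self._workflow_services is None:
            raise RuntimeError(
                "workflow_services is not set; call "
                "_init_workflow_services_from_injector() first"
            )
        return self._workflow_services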


174-191: Enhance error handling in _extract.

If equi.post_task fails or returns incomplete data, subsequent steps may break silently. Consider verifying essential fields (e.g., info_dict) or capturing exceptions to reduce debugging overhead.

🧰 Tools
🪛 Ruff (0.8.2)

174-174: Use dict instead of Dict for type annotation

Replace with dict

(UP006)
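A defensive wrapper along these lines would surface such failures early (a sketch; the info_dict field follows the comment above, not a confirmed dpti API):

def extract_info(result: dict, job_dir: str) -> dict:
    # Validate post_task output before any downstream access.
    info_dict = result.get("info_dict") if isinstance(result, dict) else None
    if not info_dict:
        raise RuntimeError(f"post_task produced no info_dict for {job_dir}")
    return info_dict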


296-334: Clarify logic for using upstream data.

In _prepare, the fallback to self.node_settings.equi_conf occurs if there's no prev_results. Confirm this is always the intended default behavior. A misconfigured pipeline could skip important data from an upstream step.

🧰 Tools
🪛 Ruff (0.8.2)

296-296: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/analyze/build_gibbs_surface.py (1)

421-447: Ensure phase_data structure is well-defined before saving.

In save_data, each phase is appended but not necessarily validated for format. If any sub-dictionary is missing required keys like "temps", "press", or "fe", the code might break or produce partial JSON.
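A small validation helper could fail fast before json.dump writes a partial file (a sketch; the key names are taken from the comment above):

REQUIRED_KEYS = ("temps", "press", "fe")

def validate_phase(phase: dict, name: str = "phase") -> None:
    missing = [k for k in REQUIRED_KEYS if k not in phase]
    if missing:
        raise ValueError(f"{name} is missing required keys: {missing}")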

dpti/workflows/service/workflow_service_module.py (2)

67-70: Validate directory arguments in module initialization.

While flow_trigger_dir and flow_running_dirname are set, consider verifying their existence on disk or providing user-friendly warnings if they do not exist. This helps catch filesystem-related errors early.


94-109: Ensure correct injection of dependencies.

provide_basic_workflow_services calls rely on the provider pattern. If the binding order or injection is misconfigured, it might fail silently. A small test might confirm the correct interplay between io_handler, job_executor, and report_generator.
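Such a smoke test can be tiny with the injector library (a sketch with stand-in classes; the real module binds io_handler, job_executor, and report_generator):

from injector import Injector, Module, provider

class IOHandler:  # stand-in for the real service class
    pass

class Services:
    def __init__(self, io_handler: IOHandler):
        self.io_handler = io_handler

class ServiceModule(Module):
    @provider
    def provide_services(self, io_handler: IOHandler) -> Services:
        return Services(io_handler)

def test_injection_wiring():
    services = Injector([ServiceModule()]).get(Services)
    assert isinstance(services.io_handler, IOHandler)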

Comment on lines +81 to +93
if __name__ == "__main__":
    # Deploy the workflow
    free_energy_workflow.serve(
        name="test-basemodel-deployment",
        tags=["onboarding"],
        parameters={
            "thermo_input": {
                "temperature": 298.15,
                "pressure": 1.0,
                "simulation_steps": 5000,
            }
        },
    )

🛠️ Refactor suggestion

Add test cases to this test file.

This file is in the tests directory but doesn't contain actual test cases using a testing framework like pytest.

Since this is a test file (based on the filename and location), consider adding actual test cases that verify the functionality of the SimulationConfig model and the free_energy_workflow function. For example:

import pytest

def test_simulation_config_default_values():
    config = SimulationConfig(temperature=300, pressure=1.0)
    assert config.simulation_steps == 1000
    assert config.temp_str == "300K"
    assert config.yaml_dict == {}

def test_yaml_dict_parsing():
    # Test dictionary input
    config = SimulationConfig(
        temperature=300, 
        pressure=1.0,
        some_yaml={"key": "value"}
    )
    assert config.yaml_dict == {"key": "value"}
    
    # Test YAML string input
    config = SimulationConfig(
        temperature=300, 
        pressure=1.0,
        some_yaml="key: value"
    )
    assert config.yaml_dict == {"key": "value"}

def test_free_energy_workflow():
    config = SimulationConfig(temperature=100, pressure=10)
    result = free_energy_workflow(config)
    assert result["result"] == 10.0
    assert result["thermo_input"]["temperature"] == 100

This would ensure that the functionality works as expected and serves as documentation for others.

Comment on lines +32 to +61
    @computed_field
    @cached_property
    def yaml_dict(self) -> Dict[str, Any]:
        """解析some_yaml为字典,无论它是何种形式."""
        # 已经是字典
        print(f"to get yaml_dict: {self.some_yaml=} to be parsed")
        yaml_dict = {}
        if isinstance(self.some_yaml, dict):
            yaml_dict = self.some_yaml

        # 处理字符串情况
        if isinstance(self.some_yaml, str):
            # 文件路径
            if os.path.exists(self.some_yaml) and self.some_yaml.endswith(
                (".yaml", ".yml")
            ):
                with open(self.some_yaml) as f:
                    yaml_dict = yaml.safe_load(f) or {}

            # YAML内容字符串
            try:
                parsed = yaml.safe_load(self.some_yaml)
                if isinstance(parsed, dict):
                    yaml_dict = parsed
            except Exception:
                pass  # 解析失败,返回空字典

        print(f"parse finished, yaml_dict: {yaml_dict=}")
        # 不可解析情况返回空字典
        return yaml_dict

🛠️ Refactor suggestion

Fix internationalization and improve error handling in yaml_dict method.

The yaml_dict method has several issues that should be addressed:

  1. Replace fullwidth commas with standard commas in docstrings and comments
  2. Update type annotation from Dict to dict
  3. Improve error handling and logging during YAML parsing
  4. Consider adding more specific exception types
    @computed_field
    @cached_property
-    def yaml_dict(self) -> Dict[str, Any]:
-        """解析some_yaml为字典,无论它是何种形式."""
+    def yaml_dict(self) -> dict[str, Any]:
+        """Parse some_yaml into a dictionary, regardless of its form."""
        # 已经是字典
        print(f"to get yaml_dict: {self.some_yaml=} to be parsed")
        yaml_dict = {}
        if isinstance(self.some_yaml, dict):
            yaml_dict = self.some_yaml

        # 处理字符串情况
        if isinstance(self.some_yaml, str):
            # 文件路径
            if os.path.exists(self.some_yaml) and self.some_yaml.endswith(
                (".yaml", ".yml")
            ):
                with open(self.some_yaml) as f:
-                    yaml_dict = yaml.safe_load(f) or {}
+                    try:
+                        yaml_dict = yaml.safe_load(f) or {}
+                    except yaml.YAMLError as e:
+                        print(f"Error loading YAML file: {e}")
+                        # Return empty dict on error

            # YAML内容字符串
            try:
                parsed = yaml.safe_load(self.some_yaml)
                if isinstance(parsed, dict):
                    yaml_dict = parsed
-            except Exception:
-                pass  # 解析失败,返回空字典
+            except yaml.YAMLError as e:
+                print(f"Error parsing YAML string: {e}")
+                # Parsing failed, returning empty dict

        print(f"parse finished, yaml_dict: {yaml_dict=}")
        # 不可解析情况返回空字典
        return yaml_dict

Additionally, consider using a proper logging system instead of print statements for better debugging capabilities.

🧰 Tools
🪛 Ruff (0.8.2)

34-34: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


35-35: Docstring contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


57-57: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

Comment on lines +65 to +80
        file_paths = ["test_file.txt"]
<<<<<<< HEAD
        rel_symlinks = self.file_handler.upload_files(file_paths, self.flow_trigger_dir)
        self.assertEqual(len(rel_symlinks), 1)
        # Build the full path to verify the symlink
        abs_symlink_path = os.path.join(self.flow_running_dir, rel_symlinks[0])
        self.assertTrue(os.path.islink(abs_symlink_path))
        self.assertIn(rel_symlinks[0], self.file_handler.current_produced_paths)
=======
        produced_symlinks = self.file_handler.upload_files(
            file_paths, self.flow_trigger_dir
        )
        self.assertEqual(len(produced_symlinks), 1)
        self.assertTrue(os.path.islink(produced_symlinks[0]))
        self.assertIn(produced_symlinks[0], self.file_handler.current_produced_paths)
>>>>>>> b644dedbcc6d0cb76d7cb250440013f6122c4228

⚠️ Potential issue

Resolve merge conflict in test_upload_files method

There's an unresolved merge conflict in this file. You need to decide which implementation to keep.

Resolve the conflict by removing the conflict markers and keeping either the HEAD version or the merged version. Based on the context, the newer implementation (lines 74-79) seems more appropriate as it uses proper relative paths.

-<<<<<<< HEAD
-        rel_symlinks = self.file_handler.upload_files(file_paths, self.flow_trigger_dir)
-        self.assertEqual(len(rel_symlinks), 1)
-        # Build the full path to verify the symlink
-        abs_symlink_path = os.path.join(self.flow_running_dir, rel_symlinks[0])
-        self.assertTrue(os.path.islink(abs_symlink_path))
-        self.assertIn(rel_symlinks[0], self.file_handler.current_produced_paths)
-=======
        produced_symlinks = self.file_handler.upload_files(
            file_paths, self.flow_trigger_dir
        )
        self.assertEqual(len(produced_symlinks), 1)
        self.assertTrue(os.path.islink(produced_symlinks[0]))
        self.assertIn(produced_symlinks[0], self.file_handler.current_produced_paths)
->>>>>>> b644dedbcc6d0cb76d7cb250440013f6122c4228
🧰 Tools
🪛 Ruff (0.8.2)

66-66: SyntaxError: Expected a statement


66-66: SyntaxError: Expected a statement


66-66: SyntaxError: Expected a statement


66-66: SyntaxError: Expected a statement


67-67: SyntaxError: Unexpected indentation


73-73: SyntaxError: Expected a statement


73-73: SyntaxError: Expected a statement


73-73: SyntaxError: Expected a statement


73-73: SyntaxError: Expected a statement


73-74: SyntaxError: Expected a statement


74-74: SyntaxError: Unexpected indentation


80-80: SyntaxError: Expected a statement


80-80: SyntaxError: Expected a statement


80-80: SyntaxError: Expected a statement


80-80: SyntaxError: Expected a statement

Comment on lines +138 to +170
<<<<<<< HEAD
        rel_symlinks = self.file_handler.link_files(["test_file.txt"], self.flow_trigger_dir)
        self.assertEqual(len(rel_symlinks), 1)
        # Build the full path to verify the symlink
        abs_symlink_path = os.path.join(self.flow_running_dir, rel_symlinks[0])
        self.assertTrue(os.path.islink(abs_symlink_path))
        self.assertEqual(os.path.basename(abs_symlink_path), "test_file.txt")

        link_target = os.readlink(abs_symlink_path)
        abs_target = os.path.join(os.path.dirname(abs_symlink_path), link_target)
        self.assertTrue(os.path.samefile(abs_target, test_file_path))

        with self.assertRaises(RuntimeError):
            self.file_handler.link_files(["non_existent_file.txt"], self.flow_trigger_dir)
=======
        produced_symlinks = self.file_handler.link_files(
            ["test_file.txt"], self.flow_trigger_dir
        )
        self.assertEqual(len(produced_symlinks), 1)
        self.assertTrue(os.path.islink(produced_symlinks[0]))
        self.assertEqual(os.path.basename(produced_symlinks[0]), "test_file.txt")

        link_target = os.readlink(produced_symlinks[0])
        self.assertEqual(
            link_target,
            os.path.relpath(test_file_path, start=self.file_handler.job_dir),
        )

        with self.assertRaises(RuntimeError):
            self.file_handler.link_files(
                ["non_existent_file.txt"], self.flow_trigger_dir
            )
>>>>>>> b644dedbcc6d0cb76d7cb250440013f6122c4228

⚠️ Potential issue

Resolve merge conflict in test_link_files method

There's another unresolved merge conflict in this file. You need to decide which implementation to keep.

Similar to the previous conflict, resolve this one by removing the conflict markers and keeping either the HEAD version or the merged version. The newer implementation (lines 153-169) appears to be more thorough in its validation.

-<<<<<<< HEAD
-        rel_symlinks = self.file_handler.link_files(["test_file.txt"], self.flow_trigger_dir)
-        self.assertEqual(len(rel_symlinks), 1)
-        # Build the full path to verify the symlink
-        abs_symlink_path = os.path.join(self.flow_running_dir, rel_symlinks[0])
-        self.assertTrue(os.path.islink(abs_symlink_path))
-        self.assertEqual(os.path.basename(abs_symlink_path), "test_file.txt")
-
-        link_target = os.readlink(abs_symlink_path)
-        abs_target = os.path.join(os.path.dirname(abs_symlink_path), link_target)
-        self.assertTrue(os.path.samefile(abs_target, test_file_path))
-
-        with self.assertRaises(RuntimeError):
-            self.file_handler.link_files(["non_existent_file.txt"], self.flow_trigger_dir)
-=======
        produced_symlinks = self.file_handler.link_files(
            ["test_file.txt"], self.flow_trigger_dir
        )
        self.assertEqual(len(produced_symlinks), 1)
        self.assertTrue(os.path.islink(produced_symlinks[0]))
        self.assertEqual(os.path.basename(produced_symlinks[0]), "test_file.txt")

        link_target = os.readlink(produced_symlinks[0])
        self.assertEqual(
            link_target,
            os.path.relpath(test_file_path, start=self.file_handler.job_dir),
        )

        with self.assertRaises(RuntimeError):
            self.file_handler.link_files(
                ["non_existent_file.txt"], self.flow_trigger_dir
            )
->>>>>>> b644dedbcc6d0cb76d7cb250440013f6122c4228
🧰 Tools
🪛 Ruff (0.8.2)

138-138: SyntaxError: Expected a statement


138-138: SyntaxError: Expected a statement


138-138: SyntaxError: Expected a statement


138-138: SyntaxError: Expected a statement


139-139: SyntaxError: Unexpected indentation


152-152: SyntaxError: Expected a statement


152-152: SyntaxError: Expected a statement


152-152: SyntaxError: Expected a statement


152-152: SyntaxError: Expected a statement


152-153: SyntaxError: Expected a statement


153-153: SyntaxError: Unexpected indentation


170-170: SyntaxError: Expected a statement


170-170: SyntaxError: Expected a statement


170-170: SyntaxError: Expected a statement


170-170: SyntaxError: Expected a statement

Comment on lines +73 to +75
    workflow.execute_greeting.serve(
        name="greeting-workflow", parameters={"message": "world"}
    )

⚠️ Potential issue

Move workflow.execute_greeting.serve() call inside a conditional block.

The serve() call in the main block starts a long-running deployment whenever this file is executed directly, which could be unintended. Consider commenting it out or guarding it behind an explicit deployment flag.

    workflow = GreetingWorkflow()
-   workflow.execute_greeting.serve(
-       name="greeting-workflow", parameters={"message": "world"}
-   )
+   # Uncomment to deploy to server when explicitly needed
+   # workflow.execute_greeting.serve(
+   #     name="greeting-workflow", parameters={"message": "world"}
+   # )

Comment on lines +161 to +163
    process_workflow.serve(
        name="data-processor", parameters={"input_data": {"default": "data"}}
    )

⚠️ Potential issue

Move process_workflow.serve() call inside a conditional block or function.

The serve() call starts a blocking deployment whenever this file is executed directly, which could disrupt test runs. Consider moving it to a separate function that's explicitly called when deployment is needed.

if __name__ == "__main__":
    test_basics()
-   process_workflow.serve(
-       name="data-processor", parameters={"input_data": {"default": "data"}}
-   )
+   # Call this function when deployment is needed
+   def deploy_workflow():
+       process_workflow.serve(
+           name="data-processor", parameters={"input_data": {"default": "data"}}
+       )
+   
+   # Uncomment the next line to deploy
+   # deploy_workflow()

Comment on lines +428 to +489
class TIIntegraionPath:
    _instances = {}

    def __init__(
        self,
        path: str,
        point_field_name: str,
        path_field_name: str,
        const_thermo_name: str,
        job_dirname: str,
    ):
        self.path = path  # 't' or 'p'
        self.point_field_name = point_field_name
        self.path_field_name = path_field_name
        self.const_thermo_name = const_thermo_name
        self.job_dirname = job_dirname
        self.is_parsed = False
        self.thermo_seq = []
        TIIntegraionPath._instances[path] = self

    def __repr__(self):
        r = (
            f"working for {self.path=}"
            f" {self.path_field_name=}"
            f" {self.job_dirname=}"
            f" {self.thermo_seq=}"
        )
        return r

    @classmethod
    def get_instance(
        cls: Type[TIIntegrationPathType], path: str
    ) -> TIIntegrationPathType:
        instance = cls._instances[path]
        return instance

    def parse_seq_list(self, thermo_path_seq):
        self.thermo_path_seq = thermo_path_seq
        self.thermo_points_list = list(
            parse_seq(self.thermo_path_seq, protect_eps=None)
        )
        self.is_parsed = True
        return self.thermo_points_list
        # temp_list = parse_seq(temp_seq)


t_ti_path = TIIntegraionPath(
    path="t",
    point_field_name="temp",
    path_field_name="temp_seq",
    const_thermo_name="pres",
    job_dirname="TI_t_sim",
)
p_ti_path = TIIntegraionPath(
    path="p",
    point_field_name="pres",
    path_field_name="pres_seq",
    const_thermo_name="temp",
    job_dirname="TI_p_sim",
)



🛠️ Refactor suggestion

Remove or unify duplicated imports from .base and .result_base.

Lines 36-43 seem to re-import the same entities already imported at lines 28-35. Consolidating these can reduce confusion and potential overshadowing of definitions.

-from .base import (
-    FreeEnergyValuePoint,
-    SettingsBase,
-    SimulationBase,
-    transfer_matching_fields,
-)
-from .result_base import ArtifactInfo, ResultDataBase
🧰 Tools
🪛 Ruff (0.8.2)

429-429: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


459-459: Use type instead of Type for type annotation

Replace with type

(UP006)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 29

♻️ Duplicate comments (2)
dpti/workflows/prefect/prefect_task_hash.py (1)

3-3: 🛠️ Refactor suggestion

Use built-in dict instead of typing.Dict.

Static analysis hints suggest replacing Dict[str, Any] with dict[str, Any], as typing.Dict is deprecated in newer Python versions.

-from typing import Any, Dict, Optional, Union
+from typing import Any, Optional, Union

-def task_input_json_hash(context: object, arguments: Dict[str, Any]) -> Optional[str]:
+def task_input_json_hash(context: object, arguments: dict[str, Any]) -> Optional[str]:
🧰 Tools
🪛 Ruff (0.8.2)

3-3: typing.Dict is deprecated, use dict instead

(UP035)

dpti/workflows/analyze/compare_ti_plot.py (1)

117-119: ⚠️ Potential issue

Replace bare except: with except Exception:.
Line 117 has a bare except:. This broad handling can improperly hide errors. If you must capture all possible exceptions, use except Exception: or specific exceptions to avoid silent failures.

-    except:
+    except Exception:
         print("No intersection point found in the overlapping temperature range")
🧰 Tools
🪛 Ruff (0.8.2)

117-117: Do not use bare except

(E722)

🧹 Nitpick comments (133)
docs/make.bat (2)

23-24: Consider updating HTTP to HTTPS for security

The URL reference to Sphinx documentation uses HTTP. While this is likely copied from a standard Sphinx template, it's generally better to use HTTPS for security.

-	echo.http://sphinx-doc.org/
+	echo.https://sphinx-doc.org/

36-36: Remove empty line at end of file

There appears to be an empty line (line 36) at the end of the file that could be removed for cleanliness.

dpti/__init__.py (1)

1-16: Sorting of the __all__ list

The __all__ list is not in strict alphabetical order. Sorting it (for example, to
["dpti", "equi", "gdi", "hti", "hti_ice", "hti_liq", "hti_water", "lib", "ti", "ti_water"]) will improve readability and consistency when importing.

🧰 Tools
🪛 Ruff (0.8.2)

4-15: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

conda/construct.yaml (1)

1-21: Usage of Jinja Templating in YAML

This configuration file uses Jinja templating to dynamically set the version (via a Git command). Keep in mind that standard YAML linters (such as YAMLlint) may flag the % character since they are unaware of the templating context. Ensure that your build process pre-renders this file before YAML validation.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

.github/workflows/release.yml (1)

1-20: Release Workflow for PyPI

The workflow correctly triggers on push and pull_request events and conditions the publish step on tags. A couple of suggestions:

  • Instead of using actions/checkout@master, consider switching to a fixed version (e.g. actions/checkout@v3) for better stability.
  • Otherwise, the steps for setting up Python and publishing the package look solid.
conda/dpti/meta.yaml (2)

2-2: Handle absence of Git tags.
The version is derived from the command git describe --tags --abbrev=0. If no valid tags exist in the repository, this might fail or produce an empty version string. Consider adding a fallback or verifying that a tag always exists.
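On the Python side, such a fallback could look like this (a sketch; the conda recipe itself would need an equivalent Jinja-level default):

import subprocess

def describe_version(default: str = "0.0.0") -> str:
    # Return the latest git tag, or a default when none exists.
    try:
        out = subprocess.run(
            ["git", "describe", "--tags", "--abbrev=0"],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return default
    return out.stdout.strip() or default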


37-47: Add extended tests or smoke tests.
The config only checks that dpti can be imported. You might consider more thorough tests in the conda recipe's test section (e.g., running a minimal subcommand or checking functionality) to detect runtime errors.

dpti/einstein.py (2)

38-72: Edge-case handling in ideal_gas_fe.
The formula for the ideal gas free energy portion looks consistent for single-component systems. Consider adding checks for corner cases (e.g., zero or extremely small volume, zero temperature, or missing mass_map keys) to avoid runtime errors or physico-chemical inconsistencies.
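Such checks could be collected in a small guard (a sketch; argument names are illustrative, and mass_map is treated as a generic container):

def check_ideal_gas_inputs(vol: float, temp: float, mass_map) -> None:
    # Reject inputs that make the free-energy expression ill-defined.
    if vol <= 0:
        raise ValueError(f"volume must be positive, got {vol}")
    if temp <= 0:
        raise ValueError(f"temperature must be positive, got {temp}")
    if not mass_map:
        raise ValueError("mass_map must not be empty")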


204-219: Add command-line argument validation in _main.
Currently, _main assumes a valid JSON parameter file. Add error handling for missing files or invalid JSON structures. Also, consider providing an option to choose among ideal_gas_fe, free_energy, or frenkel from the CLI for more flexibility.

dpti/lib/dump.py (3)

13-25: Safeguard _get_block index searches.
The logic breaks once we find "ITEM: " + key, then attempts to find the next "ITEM: ". If the key never appears, or if it appears near EOF, the idx or idx_e might reference non-existent lines. Consider adding boundary checks for these scenarios.
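A bounds-checked variant of the search might look like this (a sketch that mirrors the described logic, not the file's exact implementation):

def get_block_checked(lines: list[str], key: str) -> list[str]:
    marker = "ITEM: " + key
    idx = next((i for i, line in enumerate(lines) if marker in line), None)
    if idx is None:
        raise RuntimeError(f"cannot find '{marker}' in dump")
    # Stop at the next ITEM header, or at EOF if none follows.
    idx_e = next(
        (i for i in range(idx + 1, len(lines)) if lines[i].startswith("ITEM: ")),
        len(lines),
    )
    return lines[idx + 1 : idx_e]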


60-82: get_posi assumes x, y, z columns exist.
You rely on the presence of x, y, z in head.split(). This can cause index errors if the dump format changes or if any custom column naming is used. A safer approach is to detect or parse the available columns and handle absent columns gracefully.
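Detecting the columns from the header avoids the fixed-position assumption (a sketch; scaled coordinates such as xs/ys/zs would need the same treatment):

head = "ITEM: ATOMS id type x y z"   # example header line
cols = head.split()[2:]              # column names after "ITEM: ATOMS"
try:
    ix, iy, iz = (cols.index(c) for c in ("x", "y", "z"))
except ValueError as err:
    raise RuntimeError(f"missing coordinate column in dump header: {err}") from err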


125-138: Naming the “atom_names” for each type.
Here, each type is labeled TYPE_0, TYPE_1, etc. Be mindful that your code uses these names in downstream modules or logs. If your simulation engine or user expects more descriptive labels (e.g., “O”, “H” in water), consider a more flexible naming strategy.

🧰 Tools
🪛 Ruff (0.8.2)

130-130: Use format specifiers instead of percent format

(UP031)

dpti/lib/RemoteJob.py (7)

60-67: Connection security.
paramiko defaults to standard SSH keys or password-based authentication. Ensure that ssh_client.connect() uses appropriate known-hosts checks or key-based authentication for production security.


79-89: Potential race condition in remote directory creation.
You unconditionally create the remote_root directory with sftp.mkdir. If multiple processes or the same job UUID is used twice, or if the directory already exists, you could get an unexpected error. Consider catching or handling that scenario.
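One tolerant approach is to probe before creating (a sketch; note a TOCTOU race is still possible between the two calls):

def ensure_remote_dir(sftp, path: str) -> None:
    try:
        sftp.stat(path)           # directory already exists: nothing to do
    except FileNotFoundError:     # paramiko maps ENOENT to this in Python 3
        sftp.mkdir(path)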


115-125: Enhance error message clarity.
When block_checkcall fails, the % placeholders in the RuntimeError message are never expanded, because the values are passed as a separate tuple argument instead of being interpolated. Use an f-string (or .format) so the exit code and job ID actually appear in the message.

- raise RuntimeError("Get error code %d ... %s ", (exit_status, self.job_uuid))
+ raise RuntimeError(f"Get error code {exit_status} in calling through ssh. Job ID: {self.job_uuid}")

139-152: Verbose removal logs.
_rmtree accepts a verbose flag but selectively prints only at the final removal. If you actually want step-by-step logs (removing directories, removing files), consider expanding the logging to show each removal step.


263-265: Use modern string formatting for mpirun command.
In _make_script, you’re applying %d format specifiers. Switch to f-strings or .format for clarity and consistency, especially within these HPC scripts.

🧰 Tools
🪛 Ruff (0.8.2)

263-263: Use format specifiers instead of percent format

(UP031)


301-303: Robust error handling in Slurm job state checks.
When squeue fails, you raise a RuntimeError with partial info. It may be helpful to include the entire stderr output or job ID for easier debugging.

🧰 Tools
🪛 Ruff (0.8.2)

301-301: Use format specifiers instead of percent format

(UP031)


440-442: Avoid partial string interpolation.
Same string-formatting note as in Slurm job (Ruff’s “UP031” warning). Using modern formatting in logs or error messages helps clarity and consistency.

🧰 Tools
🪛 Ruff (0.8.2)

440-440: Use format specifiers instead of percent format

(UP031)

dpti/lib/water.py (7)

3-9: Ensure consistent import grouping

The imports are separated with an empty line between os, sys and numpy, which is good for readability. However, you should consider grouping standard library imports (os, sys) and third-party imports (numpy) more explicitly, following PEP 8 guidelines.

import os
import sys

import numpy as np

95-104: Modernize string formatting with f-strings

Current code uses old-style percent formatting. Consider using f-strings for better readability and maintainability.

-            assert len(bonds[ii]) == 2, "ill defined O atom %d has H %s" % (
-                ii,
-                bonds[ii],
-            )
+            assert len(bonds[ii]) == 2, f"ill defined O atom {ii} has H {bonds[ii]}"

Similarly update the assert message for hydrogen atoms below.

🧰 Tools
🪛 Ruff (0.8.2)

96-96: Use format specifiers instead of percent format

(UP031)


101-101: Use format specifiers instead of percent format

(UP031)


129-132: Modernize string formatting with f-strings

Replace old-style percent formatting with f-strings for better readability.

-            ret_bd.append("%d 1 %d %d" % (idx, aidx[ii], aidx[bonds[ii][0]]))
+            ret_bd.append(f"{idx} 1 {aidx[ii]} {aidx[bonds[ii][0]]}")
-            ret_bd.append("%d 1 %d %d" % (idx, aidx[ii], aidx[bonds[ii][1]]))
+            ret_bd.append(f"{idx} 1 {aidx[ii]} {aidx[bonds[ii][1]]}")
🧰 Tools
🪛 Ruff (0.8.2)

129-129: Use format specifiers instead of percent format

(UP031)


131-131: Use format specifiers instead of percent format

(UP031)


138-141: Modernize string formatting with f-strings

Replace old-style percent formatting with f-strings for better readability.

-            ret_ang.append(
-                "%d 1 %d %d %d"
-                % (idx, aidx[bonds[ii][0]], aidx[ii], aidx[bonds[ii][1]])
-            )
+            ret_ang.append(
+                f"{idx} 1 {aidx[bonds[ii][0]]} {aidx[ii]} {aidx[bonds[ii][1]]}"
+            )
🧰 Tools
🪛 Ruff (0.8.2)

139-139: Use format specifiers instead of percent format

(UP031)


159-160: Modernize string formatting with f-strings

Replace old-style percent formatting with f-strings for better readability.

-    lines.insert(atoms_idx + 1, "%d angles" % nangles)
-    lines.insert(atoms_idx + 1, "%d bonds" % nbonds)
+    lines.insert(atoms_idx + 1, f"{nangles} angles")
+    lines.insert(atoms_idx + 1, f"{nbonds} bonds")
🧰 Tools
🪛 Ruff (0.8.2)

159-159: Use format specifiers instead of percent format

(UP031)


160-160: Use format specifiers instead of percent format

(UP031)


164-165: Modernize string formatting with f-strings

Replace old-style percent formatting with f-strings for better readability.

-    lines.insert(atoms_types_idx + 1, "%d angle types" % 1)
-    lines.insert(atoms_types_idx + 1, "%d bond types" % 1)
+    lines.insert(atoms_types_idx + 1, "1 angle types")
+    lines.insert(atoms_types_idx + 1, "1 bond types")
🧰 Tools
🪛 Ruff (0.8.2)

164-164: Use format specifiers instead of percent format

(UP031)


165-165: Use format specifiers instead of percent format

(UP031)


182-183: Modernize string formatting with f-strings

Replace old-style percent formatting with f-strings for better readability.

-        words.insert(1, "%d" % mole_idx[cc])
+        words.insert(1, f"{mole_idx[cc]}")
🧰 Tools
🪛 Ruff (0.8.2)

182-182: Use format specifiers instead of percent format

(UP031)

docs/install.md (1)

1-8: Fix grammatical error in installation instruction

There's a grammatical error in the installation instructions.

-DPTI can installed by `pip`:
+DPTI can be installed by `pip`:
🧰 Tools
🪛 LanguageTool

[grammar] ~4-~4: Did you mean “be installed”?
Context: ...all DPTI ==================== DPTI can installed by pip: bash pip install dpti ...

(WILL_BASED_ON)

.travis.yml (2)

2-4: Consider upgrading Python versions.

Python 3.6 reached end-of-life in December 2021. Consider upgrading to more recent versions like 3.8, 3.9, or 3.10 to ensure better compatibility and security.


8-9: Add package dependency installation.

The install section only checks the pip version without installing any dependencies. Consider adding the installation of required packages here.

install:
  - pip --version
+  - pip install -e .
docker/docker.sh (3)

1-3: Add shebang line and structure as executable script.

The script is missing a shebang line and is structured more as a reference than an executable script. Consider adding the shebang and making it executable if it's meant to be run directly.

-# useful docker scripts
-# build image: run in dpti/docker/ dir
-docker build .  --tag deepmodeling/dpti:latest
+#!/bin/bash
+# useful docker scripts
+
+# Function to build the docker image
+build_image() {
+  echo "Building docker image..."
+  docker build .  --tag deepmodeling/dpti:latest
+}
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


5-6: Add container existence check.

Consider adding a check to see if a container with the name "dpti" already exists before attempting to create it.

-# start container called dpti
-docker run --name dpti -p 9999:8080 -it deepmodeling/dpti:latest /bin/bash
+# Function to start the container
+start_container() {
+  if [ ! "$(docker ps -a | grep dpti)" ]; then
+    echo "Starting new container..."
+    docker run --name dpti -p 9999:8080 -it deepmodeling/dpti:latest /bin/bash
+  else
+    echo "Container already exists. Use 'docker start dpti' to start it."
+    echo "To remove existing container: 'docker rm dpti'"
+  fi
+}

8-9: Add functions for better usability.

Structure the script with functions for each operation and add a simple CLI interface.

-# enter this container
-docker exec -it dpti /bin/bash
+# Function to enter the container
+enter_container() {
+  if [ "$(docker ps | grep dpti)" ]; then
+    echo "Entering container..."
+    docker exec -it dpti /bin/bash
+  else
+    echo "Container is not running. Start it first with 'docker start dpti'"
+  fi
+}
+
+# Simple CLI
+case "$1" in
+  build)
+    build_image
+    ;;
+  start)
+    start_container
+    ;;
+  enter)
+    enter_container
+    ;;
+  *)
+    echo "Usage: $0 {build|start|enter}"
+    exit 1
+    ;;
+esac
docker/entry.sh (2)

2-3: Add service start verification.

The script starts PostgreSQL but doesn't verify if it started successfully. Add error handling to ensure services start properly.

service postgresql start
-service postgresql status
+if ! service postgresql status > /dev/null; then
+  echo "Failed to start PostgreSQL service"
+  exit 1
+fi
+echo "PostgreSQL service started successfully"

12-15: Remove commented-out code.

There are several commented-out lines that appear to be remnants of previous implementations. Consider removing these if they're no longer needed to improve code clarity.

exec gosu airflow "$@"
-# su - airflow
-# whoami
-# echo "$@"
-# exec "$@"
dpti/dags/utils.py (2)

8-10: Remove commented-out code.

The commented-out line (# context = get_current_context()) should be removed if it's no longer needed.

def get_empty_submission(job_work_dir, context):
-    # context = get_current_context()
    dag_run = context["params"]
    work_base_dir = dag_run["work_base_dir"]

18-23: Remove unnecessary variable initialization.

The submission variable is initialized to None and then immediately assigned a new value, which is unnecessary.

-    submission = None
     submission = Submission(
         work_base=job_work_dir,
         resources=resources,
         machine=machine,
     )
docker/README.md (2)

1-42: Improve Markdown formatting for better readability

The document would benefit from proper Markdown formatting to enhance readability and structure.

Consider these improvements:

  1. Use proper Markdown headings (e.g., # Installation instead of ## docker installation:)
  2. Add syntax highlighting to code blocks (e.g., bash instead of just )
  3. Standardize section spacing
  4. Format the "note:" text as blockquotes using >

Example:

-## docker installation:
-see: https://docs.docker.com/engine/install/
+# Docker Installation
+See: https://docs.docker.com/engine/install/
🧰 Tools
🪛 LanguageTool

[grammar] ~17-~17: After the number ‘8080’, use a plural noun. Did you mean “ports”?
Context: ...s 'dpti' note: 9999:8000 means map 8080 port in the docker container to the 9999 por...

(CD_NNU)


33-37: Provide context for Airflow commands

The commands for starting Airflow services would benefit from additional context about their purpose and when they should be used.

Consider adding a brief explanation:

 ## command related to airflow

+# Start the Airflow web interface in the background
airflow webserver --hostname 0.0.0.0 --port 8080 &
+# Start the Airflow scheduler in the background
airflow scheduler &

.github/workflows/unittest.yml (3)

21-21: Enhance dependency installation

The current dependency installation might miss required packages for testing.

Consider expanding the dependencies installation to include test requirements:

-      run: pip install . coverage
+      run: |
+        python -m pip install --upgrade pip
+        pip install . coverage pytest pytest-cov
+        if [ -f requirements-dev.txt ]; then pip install -r requirements-dev.txt; fi

23-23: Improve test command readability and reliability

The current test command is complex and combines multiple operations in a single line, making it harder to debug if any step fails.

Consider breaking it into separate steps:

-      run: cd tests && coverage run --source=../dpti -m unittest && cd .. && coverage combine tests/.coverage && coverage report
+      run: |
+        cd tests
+        coverage run --source=../dpti -m unittest
+        cd ..
+        coverage combine tests/.coverage
+        coverage report

14-20: Add dependency caching

Adding caching for Python dependencies would speed up workflow runs.

Insert a caching step:

    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}
+    - name: Cache pip dependencies
+      uses: actions/cache@v3
+      with:
+        path: ~/.cache/pip
+        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', 'setup.py') }}
+        restore-keys: |
+          ${{ runner.os }}-pip-
    - name: Install dependencies
.cursor/rules/dpti-default-ai-rules.mdc (3)

15-15: Fix typo in line 15

There's an accidental # character in the middle of a word.

Apply this correction:

-你的目标是以用户容易理解的方式帮助他们完成分子动力学自由能计算应用的设计和开发工作,确保计算结果准确可靠、性能高#效、用户使用便捷。
+你的目标是以用户容易理解的方式帮助他们完成分子动力学自由能计算应用的设计和开发工作,确保计算结果准确可靠、性能高效、用户使用便捷。

61-61: Fix numbering sequence

The numbering sequence in the methodology section is incorrect.

-2.回答问题时,也给出关于设计模式上的思考,提示用户相关联的最佳实践(如果有)。
+4.回答问题时,也给出关于设计模式上的思考,提示用户相关联的最佳实践(如果有)。

7-9: Remove commented hints

These commented lines appear to be leftover hints for document formatting.

-<!-- - You can @ files here
-- You can use markdown but dont have to -->
dpti/main.py (4)

5-5: Remove redundant commented import

The gdi module is already imported in line 3 but appears again as a commented import in line 5.

-# from . import gdi

8-29: Add docstring to create_parser function

Adding a docstring would improve code documentation and help developers understand the purpose and usage of this function.

 def create_parser():
+    """
+    Create and configure the command-line argument parser for DPTI.
+    
+    Returns:
+        argparse.ArgumentParser: Configured parser with all module subparsers.
+    """
     parser = argparse.ArgumentParser(
         description="DPTI: An Automatic Workflow Software for Thermodynamic Integration Calculations"
     )

32-38: Add error handling to main function

The function execution in line 36 lacks error handling, which could result in unhandled exceptions and poor user experience.

 def main():
+    """
+    Main entry point for the DPTI command-line interface.
+    Parses arguments and executes the appropriate function.
+    """
     parser = create_parser()
     args = parser.parse_args()
     if hasattr(args, "func"):
-        args.func(args)
+        try:
+            args.func(args)
+        except Exception as e:
+            print(f"Error executing command: {e}")
+            import sys
+            sys.exit(1)
     else:
         parser.print_help()

35-36: Consider adding a verbose mode flag

Adding a verbose mode would help with debugging and provide users with more information about what's happening.

Add a global verbose flag to the parser and use it when executing functions:

 def create_parser():
     parser = argparse.ArgumentParser(
         description="DPTI: An Automatic Workflow Software for Thermodynamic Integration Calculations"
     )
+    parser.add_argument('-v', '--verbose', action='store_true', help='Enable verbose output')
     main_subparsers = parser.add_subparsers(

Then in the main function:

     if hasattr(args, "func"):
-        args.func(args)
+        try:
+            if args.verbose:
+                print(f"Executing {args.module} command...")
+            args.func(args)
+        except Exception as e:
+            print(f"Error executing command: {e}")
+            import sys
+            sys.exit(1)

dpti/lib/ovito_file_convert.py (1)

24-39: Output format inference and validation are well-handled.

Raising a RuntimeError for unsupported formats is appropriate, preventing silent failures. Consider providing actionable error messages or usage hints if you want to be more user-friendly in the future.

dpti/main.tex (1)

34-36: Refine the document title for clarity.

Currently the title lines read "Write down" which seems like a placeholder. Consider providing a more descriptive title that reflects the content of this paper.

-\title{
-Write down
-}
+\title{
The Ideal Einstein Crystal with COM Constraints
}
dpti/dags/dp_ti_gdi.py (1)

199-226: Triggering sub-DAG runs is well done, but you may want to handle concurrency or parallel triggers.

Currently, if multiple submissions share the same submission_hash, collisions could occur. Consider adding logic to handle or differentiate repeated triggers.

dpti/lib/lammps.py (2)

6-13: Consider using a more efficient file reading approach

This function loads the entire file into memory at once, which could be problematic for large LAMMPS files. Consider reading the file line by line until the needed information is found.

def get_natoms(filename):
-    with open(filename) as fp:
-        lines = fp.read().split("\n")
-    for ii in lines:
-        if "atoms" in ii:
-            natoms = int(ii.split()[0])
-            return natoms
+    with open(filename) as fp:
+        for line in fp:
+            if "atoms" in line:
+                natoms = int(line.split()[0])
+                return natoms
    raise RuntimeError("cannot find key word 'atoms' in " + filename)

63-73: Optimize the search for the last timestep

This function inefficiently searches for the last occurrence of "ITEM: TIMESTEP" by iterating through the entire file. Consider using a more efficient approach.

def get_last_dump(dump):
    with open(dump) as fp:
        lines = fp.read().split("\n")
    lines = lines[:-1]
-    step_idx = -1
-    for idx, ii in enumerate(lines):
-        if "ITEM: TIMESTEP" in ii:
-            step_idx = idx
+    # Reverse search for the last occurrence of "ITEM: TIMESTEP"
+    step_idx = -1
+    for idx in range(len(lines) - 1, -1, -1):
+        if "ITEM: TIMESTEP" in lines[idx]:
+            step_idx = idx
+            break
    if step_idx == -1:
        raise RuntimeError("cannot find timestep in lammps dump, something wrong")
    return "\n".join(lines[step_idx:])
docs/conf.py (1)

22-24: Use f-strings instead of percent formatting

The static analysis tool suggests using format specifiers instead of percent formatting. Consider using f-strings for a more modern Python approach.

project = "DPTI"
-copyright = "2021-%d, Deep Modeling" % date.today().year
+copyright = f"2021-{date.today().year}, Deep Modeling"
author = "DeepModeling"
🧰 Tools
🪛 Ruff (0.8.2)

23-23: Use format specifiers instead of percent format

(UP031)

.github/workflows/release_conda_and_github.yml (2)

25-27: Consider consolidating Conda commands

These commands could be consolidated to reduce complexity and improve readability of the workflow.

-    - run: conda install constructor jinja2 conda-build -c deepmodeling -c conda-forge -y
-    - run: cd ./conda/dpti/ && conda build . -c deepmodeling -c conda-forge && cd ../../
-    - run: constructor ./conda/
+    - name: Build and construct Conda package
+      run: |
+        conda install constructor jinja2 conda-build -c deepmodeling -c conda-forge -y
+        cd ./conda/dpti/ && conda build . -c deepmodeling -c conda-forge && cd ../../
+        constructor ./conda/

28-34: Consider using github.repository context variable

Instead of using ${{ env.GITHUB_REPOSITORY }}, consider using the github.repository context variable which is more standard in GitHub Actions workflows.

    - name: Upload to release
      uses: softprops/action-gh-release@master
      with:
        files: dpti-*.sh
-        repository: ${{ env.GITHUB_REPOSITORY }}
+        repository: ${{ github.repository }}
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
README.md (3)

54-127: Remove trailing punctuation in headings for better Markdown style.

Multiple headings in this README end with punctuation (e.g., colons, periods). Markdown lint tools typically recommend removing punctuation at the end of headings to enhance readability and adhere to common conventions.

Example fixes in headings:

-## for CLI tools:
+## for CLI tools

-### website to manage and monitor these jobs.
+### website to manage and monitor these jobs

-# Installation:
+# Installation

Also applies to: 143-155, 176-176, 191-202, 218-218, 237-237, 329-329, 341-350, 374-374, 389-397, 639-639, 702-721

🧰 Tools
🪛 LanguageTool

[uncategorized] ~82-~82: The preposition ‘of’ seems more likely in this position.
Context: ...ion files ### HTI This is an example for HTI three-step simulations. ``` cd exa...

(AI_HYDRA_LEO_REPLACE_FOR_OF)


[style] ~92-~92: Consider a shorter alternative to avoid wordiness.
Context: ...step ``` ### TI For temperature 200K, in order to generate integration path pressure from...

(IN_ORDER_TO_PREMIUM)


[style] ~107-~107: Consider a shorter alternative to avoid wordiness.
Context: ...examples/ti/ dpti ti gen ti.t.json ``` In order to generate TI path changing temperature, ...

(IN_ORDER_TO_PREMIUM)

🪛 markdownlint-cli2 (0.17.2)

54-54: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


58-58: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


63-63: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


71-71: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


84-84: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


103-103: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


109-109: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


121-121: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


127-127: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


82-82: Simplify “in order to” usage.

Multiple occurrences of “in order to” can be shortened to “to” for concise writing.

For instance, line 92 says:

-For temperature 200K, in order to generate integration path pressure...
+For temperature 200K, to generate integration path pressure...

Also applies to: 92-92, 107-107

🧰 Tools
🪛 LanguageTool

[uncategorized] ~82-~82: The preposition ‘of’ seems more likely in this position.
Context: ...ion files ### HTI This is an example for HTI three-step simulations. ``` cd exa...

(AI_HYDRA_LEO_REPLACE_FOR_OF)


29-29: Recheck indefinite article usage with uncountable nouns.

“a Deep Potential Water” is uncountable usage; consider omitting the article or rewording.

Example:

-... Phase Diagram of a Deep Potential Water Model ...
+... Phase Diagram of Deep Potential Water Model ...
🧰 Tools
🪛 LanguageTool

[grammar] ~29-~29: Uncountable nouns are usually not used with an indefinite article. Use simply “Deep Potential Water”.
Context: ...ter and ice see: [PRL:Phase Diagram of a Deep Potential Water Model](https://journals.aps.org/prl/abs...

(A_UNCOUNTABLE)

dpti/workflows/flows/flow_settings.py (3)

11-11: Use type instead of typing.Type for type annotations.

The modern recommendation is to replace typing.Type[...] with the built-in type[...]. This aligns with updated typing conventions and avoids deprecation warnings.

-from typing import Any, Type, TypeVar
+from typing import Any, TypeVar
 ...
-def parse_yaml_like_input(value: Any, model_class: Type[unvalid_BaseModel_T]) -> unvalid_BaseModel_T:
+def parse_yaml_like_input(value: Any, model_class: type[unvalid_BaseModel_T]) -> unvalid_BaseModel_T:

Also applies to: 23-23

🧰 Tools
🪛 Ruff (0.8.2)

11-11: typing.Type is deprecated, use type instead

(UP035)


83-83: Use an identity comparison instead of != for types.

Avoid direct type comparisons such as != dict; use is / is not for identity checks, or isinstance(...) when a subclass-aware check is actually intended. This is the more Pythonic practice and what the E721 rule recommends.

- and getattr(field_type, "__origin__", None) != dict
+ and getattr(field_type, "__origin__", None) is not dict

Or use an explicit isinstance(field_value, dict) check if that clarifies the intent.

🧰 Tools
🪛 Ruff (0.8.2)

83-83: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)


91-91: Replace the ambiguous fullwidth comma with a regular comma.

Use ASCII commas in comments to avoid potential encoding or parsing issues (the comment reads "unknown field, keep it as-is"):

-            # 未知字段，直接保留
+            # 未知字段, 直接保留
🧰 Tools
🪛 Ruff (0.8.2)

91-91: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

dpti/workflows/service/di.py (1)

10-10: Annotate mutable class attributes with typing.ClassVar.

_contexts is a mutable class attribute that should be annotated with ClassVar for clarity and to avoid unintended instance-level modifications.

-class InjectionContext:
-    _contexts = []
+from typing import ClassVar
+
+class InjectionContext:
+    _contexts: ClassVar[list["InjectionContext"]] = []

(Using the built-in list here also keeps the annotation consistent with the UP006 fixes suggested elsewhere in this review.)
🧰 Tools
🪛 Ruff (0.8.2)

10-10: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

dpti/workflows/prefect/prefect_task_hash.py (2)

16-16: Consider removing the type hint from *args if it is not strictly necessary.

Union[str, bytes] for *args can be flexible, but if you genuinely need both types, make sure the docstrings and call sites reflect that. Otherwise, accept any type and handle it gracefully.


35-35: Remove or refactor debugging print statements.

Large amounts of debugging prints (like print(f"stable_hash: end: {hash=} {args=}")) can clutter logs. Consider using a logger with a debug level.

dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_long_liquid_1200K_10GPa_path_t_settings.yaml (1)

14-45: Consider clarifying null parameters.

Parameters like temp: null or if_meam: null might be placeholders that rely on an upstream default. Ensure this is documented or validated to avoid runtime confusion.

dpti/workflows/flows/base_flow.py (4)

1-1: Remove commented-out magic comment.

# %% is a Jupyter or Editor marker. If not in use, consider removing to keep the file cleaner.


8-8: Avoid committing commented-out sys.path inserts.

There is another sys.path.insert call below; remove the commented-out lines to avoid confusion about which path is actually in effect.


20-22: Refactor path join usage to avoid string concatenation.

Use os.path.join(os.path.dirname(__file__), "..", "..", "..") rather than ("../" "../", "../").


131-131: Check concurrency with the flow_start_check method.

If multiple flows run concurrently, confirm no collisions in shared directories. Possibly add concurrency control or locking.
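
A minimal sketch of one possible guard, using an atomic lock file (the helper name and the .flow.lock filename are ours, not from the codebase):

import os

def acquire_dir_lock(run_dir):
    # O_CREAT | O_EXCL is atomic: only one process can create the file,
    # so concurrent flows targeting the same directory fail fast.
    lock_path = os.path.join(run_dir, ".flow.lock")
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another flow appears to be using {run_dir}")
    os.write(fd, str(os.getpid()).encode())
    os.close(fd)
    return lock_path  # the caller removes this file when the flow finishes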

dpti/workflows/service/job_executor.py (1)

37-47: Improve the error messages.

When raising errors for a missing job_dir, spell out the next step, e.g. "Please specify an existing directory for job_dir", or log the instructions for users.

dpti/gdi.py (1)

244-245: Use f-strings instead of older %-style formatting.

Lines flagged by static analysis suggest modernizing string formatting for readability and maintainability. Consider switching to f-strings throughout these lines.

- print("# dpdt: do not find any matched record, run new task from %d "
-       % counter)
+ print(f"# dpdt: do not find any matched record, run new task from {counter} ")

- print("# dpdt: cannot find any MD record, start from scrtach")
+ print(f"# dpdt: cannot find any MD record, start from scratch")

- print("# dpdt: found MD records, search if any record matches")
+ print(f"# dpdt: found MD records, searching for matches")

Also applies to: 254-254, 261-261

🧰 Tools
🪛 Ruff (0.8.2)

244-244: Use format specifiers instead of percent format

(UP031)


245-245: Use format specifiers instead of percent format

(UP031)

dpti/relax.py (2)

15-16: Convert %-style formatting to f-strings.

To comply with static analysis recommendations, replace older %-style string formatting with f-strings for enhanced clarity and consistency.

- ret += "variable        THERMO_FREQ     equal %d\n" % thermo_freq
- ret += "variable        DUMP_FREQ       equal %d\n" % dump_freq
- ret += "mass            %d %f\n" % (jj + 1, mass_map[jj])
+ ret += f"variable        THERMO_FREQ     equal {thermo_freq}\n"
+ ret += f"variable        DUMP_FREQ       equal {dump_freq}\n"
+ ret += f"mass            {jj + 1} {mass_map[jj]}\n"

Also applies to: 27-27

🧰 Tools
🪛 Ruff (0.8.2)

15-15: Use format specifiers instead of percent format

(UP031)


16-16: Use format specifiers instead of percent format

(UP031)


127-129: Use a proper exit call.

A bare exit without parentheses is just a reference to the builtin and does nothing. To actually stop execution, replace it with sys.exit(0) (preferred in scripts) or exit().

if args.command is None:
    parser.print_help()
-    exit
+    exit()
dpti/hti_water.py (8)

29-55: Evaluate the naming consistency in _ff_angle_on.
The function _ff_angle_on is well-defined, but consider making the function name more descriptive, such as _ff_enable_angle, aligning with _ff_deep_on and _ff_bond_angle_off to keep naming patterns consistent.


58-85: Add docstring or comments explaining _ff_deep_on.
While _ff_deep_on appears to overlay a Deep Potential with LJ interactions, it's not immediately clear how each parameter (model, bond constants, etc.) influences the final pipeline. A short docstring clarifying the function's purpose and parameter usage will improve readability.


143-145: Use f-strings or .format() instead of old-style % formatting.
According to static analysis hints, lines 143, 144, 145, 159, 162, 190, and 256 each use % for string formatting. Modern formatting is often more readable (e.g., f"variable NSTEPS equal {nsteps}").

Apply a diff similar to:

- ret += "variable        NSTEPS          equal %d\n" % nsteps
+ ret += f"variable        NSTEPS          equal {nsteps}"

Also applies to: 159-159, 162-162, 190-190, 256-256

🧰 Tools
🪛 Ruff (0.8.2)

143-143: Use format specifiers instead of percent format

(UP031)


144-144: Use format specifiers instead of percent format

(UP031)


145-145: Use format specifiers instead of percent format

(UP031)


201-285: Handle missing keys in _make_tasks.
Currently, _make_tasks uses items from jdata (like "lambda_angle_on", "protect_eps", etc.) without fallback handling. If the JSON lacks these keys, it raises a KeyError. Consider adding robust error handling or default values to prevent unexpected crashes.
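
One possible shape for such a guard (the helper name and the 1e-6 default are illustrative, not values from the project):

def get_setting(jdata, key, default=None, required=False):
    # Fail with an actionable message instead of a bare KeyError.
    if key in jdata:
        return jdata[key]
    if required:
        raise KeyError(f"missing required setting '{key}' in the input JSON")
    return default

# inside _make_tasks (illustrative):
# lambda_angle_on = get_setting(jdata, "lambda_angle_on", required=True)
# protect_eps = get_setting(jdata, "protect_eps", default=1e-6)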

🧰 Tools
🪛 Ruff (0.8.2)

256-256: Use format specifiers instead of percent format

(UP031)


287-355: Validate saved data in _refine_tasks.
When copying files (e.g., data, log.lammps) from the original tasks to refined tasks, ensure each target file is valid. If a file is missing or empty (for instance, after a crashed run), this routine could silently produce incomplete tasks. A more explicit check or a fallback behavior might reduce debugging time.


413-435: Consider returning structured results in _compute_thermo.
Right now _compute_thermo returns a dict of thermodynamic averages and errors. This structure is good, but it might be clearer to define a small data class or NamedTuple for strongly typed usage across the codebase, improving type safety and clarity.
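
For illustration, a NamedTuple with hypothetical field names (the actual keys produced by _compute_thermo may differ):

from typing import NamedTuple

class ThermoResult(NamedTuple):
    etot: float        # average total energy
    etot_err: float    # statistical error of that average
    enthalpy: float
    enthalpy_err: float

# Callers then get attribute access plus static type checking:
# res.etot instead of res["e"], with typos caught by the type checker.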


536-599: Ensure pymbar.MBAR usage is tested for corner cases in _post_tasks_mbar.
After collecting potential energies (ukn) in the loop, we rely on pymbar.MBAR(ukn, nk). If nk has all zeros or if the temperature is out of range, MBAR might fail. Suggest adding validation checks or error handling to clarify potential dimension mismatches or empty data arrays.
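
A sketch of the kind of pre-flight check meant here; pymbar.MBAR expects u_kn of shape (K, N_total) with N_k summing to N_total, and the helper name is ours:

import numpy as np

def check_mbar_inputs(ukn, nk):
    ukn, nk = np.asarray(ukn), np.asarray(nk)
    if ukn.ndim != 2:
        raise ValueError(f"ukn must be 2-D (K states x N samples), got shape {ukn.shape}")
    if nk.sum() == 0:
        raise ValueError("nk sums to zero samples; were the task outputs parsed?")
    if ukn.shape[1] != nk.sum():
        raise ValueError(f"ukn has {ukn.shape[1]} samples but nk sums to {nk.sum()}")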


703-717: Improve user feedback in _main.
If the user doesn’t specify a valid subcommand, the code calls parser.print_help(), but we might also want to log or display a custom message clarifying the error for new users.

dpti/hti_ice.py (4)

14-28: Add docstring to _main explaining the "ice" scope.
The _main function sets up an argparse flow for Hamiltonian TI on ice. Adding a short docstring clarifies that these subcommands focus on structural/phase transitions specifically relevant to ice-based studies.


30-35: Revisit exec_args fallback logic.
Currently, if args lacks func, the script prints help. Consider elaborating whether that scenario is truly unexpected or if it’s valid to have subcommands that do not define a func. An explicit message or an exception might help clarify usage.


50-70: Explain "switch" parameter in add_subparsers.
The --switch argument offers three modes for the switching approach, but the help message is somewhat compressed. Consider clarifying the differences (one-step vs. two-step vs. three-step) with bullet points or additional lines for better readability.


181-278: Document partial disorder corrections inside handle_compute.
When args.disorder_corr is true, partial or full disorder corrections (ice3, ice5, or default) are applied. However, this logic is easy to miss if a user is not aware. A docstring or inline comment can help connect the physical meaning of these corrections to the code.

dpti/workflows/analyze/compare_ti_plot.py (4)

22-39: Clarify dictionary structure for phase1_result and phase2_result.
plot_free_energy_comparison expects nested dictionaries with keys like ["data"]["all_temps"]. If the structure differs or keys are missing, the function breaks. Consider adding parameter checks or a more descriptive docstring.
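
For example, the docstring could pin down the minimal expected shape (values below are illustrative; only the "data"/"all_temps" nesting is taken from the code):

phase1_result = {
    "data": {
        "all_temps": [200.0, 250.0, 300.0],  # temperatures along the TI path
        # ... plus the free-energy arrays the plotting code reads
    },
}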


332-333: Migrate from typing.Dict/List to builtin dict/list.
Lines 332 and beyond define type hints with Dict and List. The static analysis suggests using native dict and list from Python 3.9+ rather than the legacy typing imports.

Also applies to: 335-335, 336-337, 340-340

🧰 Tools
🪛 Ruff (0.8.2)

332-332: typing.Dict is deprecated, use dict instead

(UP035)


332-332: typing.List is deprecated, use list instead

(UP035)


357-358: Enhance maintainability by grouping imports.
Currently, the function plot_multiple_phase_comparison re-imports modules (e.g., matplotlib.pyplot and numpy) that are already imported globally. Dropping the local imports in favor of the single top-level import location improves clarity and consistency.


399-399: Replace fullwidth parentheses and commas with standard ASCII symbols.
Lines 399, 416, 434, 465, and 473 contain fullwidth parentheses/comma which can cause confusion in certain editors or on certain locales:

(  )  ,

Switching them to standard ASCII parentheses/comma ensures greater compatibility.

Also applies to: 416-416, 434-434, 465-473

🧰 Tools
🪛 Ruff (0.8.2)

399-399: Comment contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


399-399: Comment contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)

dpti/workflows/flows/free_energy_ti_line.py (2)

6-11: Migrate from typing.Dict to Python's built-in dict.

Certain lines use typing.Dict, which is flagged as deprecated in favor of the built-in dict. Updating these types promotes modern Python practices and ensures forward compatibility.

Example diff for lines referencing Dict:

-from typing import (
-    Any,
-    Dict,
-    Optional,
-    Union,
-)
+from typing import (
+    Any,
+    Optional,
+    Union,
+)

And similarly, adjust return types:

-def load_default_yaml_like_asdict() -> Dict[str, Any]:
+def load_default_yaml_like_asdict() -> dict[str, Any]:

Also applies to: 77-77, 97-97, 165-165

🧰 Tools
🪛 Ruff (0.8.2)

6-11: typing.Dict is deprecated, use dict instead

(UP035)


88-88: Remove or replace direct debug print statement.

A raw print() can clutter logs. Consider using a configurable logging mechanism or removing it when running in production to keep logs clean.

dpti/mti.py (2)

53-56: Use modern string formatting methods.

Many lines use % for string formatting. Updating to f-strings or .format() can enhance readability and align with contemporary Python code style.

Example f-string usage:

-ret += "variable        ibead           uloop %d pad\n" % (power - 1)
+ret += f"variable        ibead           uloop {power - 1} pad\n"

Also applies to: 70-70, 72-72, 197-197, 210-210, 215-215, 266-266

🧰 Tools
🪛 Ruff (0.8.2)

53-53: Use format specifiers instead of percent format

(UP031)


54-54: Use format specifiers instead of percent format

(UP031)


55-55: Use format specifiers instead of percent format

(UP031)


56-56: Use format specifiers instead of percent format

(UP031)


328-328: Avoid shell-specific symlink commands for cross-platform support.

Using "ln -s" is Unix-specific. To ensure portability, consider Python’s built-in os.symlink (with an appropriate fallback for non-Unix environments).

dpti/hti_liq.py (1)

28-28: Prefer f-strings or str.format() over % formatting.

This codebase frequently uses % for formatting, which can be replaced with more modern, readable f-strings.

Example:

- ptr += "# E (err)  [eV]:  {:20.8f} {:20.8f}\n".format(info["e"], info["e_err"])
+ ptr += f"# E (err)  [eV]:  {info['e']:20.8f} {info['e_err']:20.8f}\n"

Also applies to: 179-181, 196-196, 199-199, 231-231, 313-313, 516-516

🧰 Tools
🪛 Ruff (0.8.2)

28-28: Use format specifiers instead of percent format

(UP031)

dpti/hti.py (2)

71-121: Consider modularizing repeated code in _ff_deep_on.
This function shares several steps with _ff_lj_on and _ff_lj_off, such as the computation of sigma_key_index and setting up pair coefficients. Extracting these shared steps into a helper function can reduce duplication and improve maintainability.


902-924: Use f-strings instead of percent formatting.
Static analysis flagged multiple lines across the file for using % formatting (e.g., lines 346, 347, etc.). Updating these to f-strings or .format will improve readability and conform to the suggested style. A typical conversion (exact variables depend on the flagged line):

- ret += "variable        NSTEPS          equal %d\n" % nsteps
+ ret += f"variable        NSTEPS          equal {nsteps}\n"

Note that literals without placeholders, such as the quoted thermo_style line, should stay plain strings; adding an f prefix there would itself be flagged (F541).
dpti/lib/lmp.py (2)

55-57: Handle missing “Atoms” section gracefully in get_atoms.
If a malformed file or incomplete lines are provided, _get_block may return None. Consider returning an empty list instead of None to avoid errors in downstream processing.


172-199: Fix static analysis warnings for format strings.
Lines like 177, 178, and 191 use percent formatting. Updating them to f-strings or .format would adhere to the style suggested by the linting tool.

- ret += "%d atoms\n" % natoms
+ ret += f"{natoms} atoms\n"
🧰 Tools
🪛 Ruff (0.8.2)

177-177: Use format specifiers instead of percent format

(UP031)


178-178: Use format specifiers instead of percent format

(UP031)


191-191: Use format specifiers instead of percent format

(UP031)

dpti/workflows/simulations/equi_sim.py (3)

20-20: Avoid redefinition warnings for imported entities.
SettingsBase, SimulationBase, and workflow_task are re-imported (line 20) after being imported on line 16. Safe removal or consolidation of these imports can resolve warnings (F811) from static analysis.

🧰 Tools
🪛 Ruff (0.8.2)

20-20: Redefinition of unused SettingsBase from line 16

(F811)


20-20: Redefinition of unused SimulationBase from line 16

(F811)


20-20: Redefinition of unused workflow_task from line 16

(F811)


88-89: Adopt modern type hints.
You are using Dict[str, Any] for node_upstream_data and node_result. Current Python style suggests using dict[str, Any] instead. This aligns with the static analysis advice and future-proofs the hints.

- node_upstream_data: Optional[Dict[str, Any]] = None
- node_result: Dict[str, Any] = Field(default_factory=dict)
+ node_upstream_data: Optional[dict[str, Any]] = None
+ node_result: dict[str, Any] = Field(default_factory=dict)
🧰 Tools
🪛 Ruff (0.8.2)

88-88: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


89-89: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


297-326: Double-check that NVT is enforced in _prepare.
The code checks ens == "nvt" before generating LAMMPS input. Consider guiding the user with a more descriptive error or fallback if the ensemble setting is incorrect, rather than only raising a ValueError.

dpti/ti.py (8)

32-32: Switch to an f-string for consistency and readability.
The use of old-style string formatting with "%04d" % iter_index can be replaced by a more modern and readable f-string.

-def make_iter_name(iter_index):
-    return "task_ti." + ("%04d" % iter_index)
+def make_iter_name(iter_index):
+    return f"task_ti.{iter_index:04d}"
🧰 Tools
🪛 Ruff (0.8.2)

32-32: Use format specifiers instead of percent format

(UP031)


68-70: Use f-strings to replace %d for consistent string formatting.
Various lines here rely on % formatting. Adopting f-strings can improve readability and consistency throughout the script.

- ret += "variable        NSTEPS          equal %d\n" % nsteps
- ret += "variable        THERMO_FREQ     equal %d\n" % thermo_freq
- ret += "variable        DUMP_FREQ       equal %d\n" % dump_freq
+ ret += f"variable        NSTEPS          equal {nsteps}\n"
+ ret += f"variable        THERMO_FREQ     equal {thermo_freq}\n"
+ ret += f"variable        DUMP_FREQ       equal {dump_freq}\n"
🧰 Tools
🪛 Ruff (0.8.2)

68-68: Use format specifiers instead of percent format

(UP031)


69-69: Use format specifiers instead of percent format

(UP031)


70-70: Use format specifiers instead of percent format

(UP031)


134-136: Convert the velocity creation seed to an f-string.
This avoids mixing lines of code for string formatting and enhances clarity.

-ret += "velocity        all create ${TEMP} %d\n" % (
-    np.random.default_rng().integers(1, 2**16)
-)
+random_seed = np.random.default_rng().integers(1, 2**16)
+ret += f"velocity        all create ${{TEMP}} {random_seed}\n"
🧰 Tools
🪛 Ruff (0.8.2)

134-134: Use format specifiers instead of percent format

(UP031)


245-245: Replace old-style format with an f-string for clarity.

-task_dir = os.path.join(job_abs_dir, "task.%06d" % ii)
+task_dir = os.path.join(job_abs_dir, f"task.{ii:06d}")
🧰 Tools
🪛 Ruff (0.8.2)

245-245: Use format specifiers instead of percent format

(UP031)


494-494: Favor f-string over old-style formatting in the print statement.

-print("# natoms: %d" % natoms)
+print(f"# natoms: {natoms}")
🧰 Tools
🪛 Ruff (0.8.2)

494-494: Use format specifiers instead of percent format

(UP031)


630-633: Use f-strings when printing to maintain consistency and readability.
These print statements and result accumulations use old-style formatting with several placeholders. Switching to f-strings simplifies them.

- print("#%8s  %20s  %9s  %9s  %9s" % ("T(ctrl)", "F", "stat_err", "inte_err", "err"))
- result += "#%8s  %20s  %9s  %9s  %9s\n" % ("T(ctrl)", "F", "stat_err", "inte_err", "err")
- print("#%8s  %15s  %20s  %9s  %9s  %9s" % ("T(ctrl)", "P(ctrl)", "F", "stat_err", "inte_err", "err"))
- ...
- print("# natoms: %d" % natoms)  # line 717
- print("#%8s  %15s  %9s  %9s" % ("T(ctrl)", "F", "stat_err", "inte_err"))
- ...
+ print(f"#{'T(ctrl)':>8}  {'F':>20}  {'stat_err':>9}  {'inte_err':>9}  {'err':>9}")
+ result += f"#{'T(ctrl)':>8}  {'F':>20}  {'stat_err':>9}  {'inte_err':>9}  {'err':>9}\n"
+ print(f"#{'T(ctrl)':>8}  {'P(ctrl)':>15}  {'F':>20}  {'stat_err':>9}  {'inte_err':>9}  {'err':>9}")
+ ...
+ print(f"# natoms: {natoms}")
+ print(f"#{'T(ctrl)':>8}  {'F':>15}  {'stat_err':>9}  {'inte_err':>9}")
+ ...

Also applies to: 647-650, 717-717, 801-801, 808-808

🧰 Tools
🪛 Ruff (0.8.2)

630-630: Use format specifiers instead of percent format

(UP031)


633-633: Use format specifiers instead of percent format

(UP031)


341-363: Extract and reuse thermodynamic computations to avoid duplicate logic.
You have _compute_thermo in this file and a similar function in old_equi.py. Consider creating a shared utility module to streamline maintenance and testing.

Also applies to: 285-304


449-685: Consider splitting post_tasks and post_tasks_mbar into smaller, more focused functions.
Both functions handle multiple responsibilities (disk I/O, data parsing, block averaging, printing results, etc.) leading to lengthy code. Splitting them improves readability and maintainability.

🧰 Tools
🪛 Ruff (0.8.2)

494-494: Use format specifiers instead of percent format

(UP031)


630-630: Use format specifiers instead of percent format

(UP031)


633-633: Use format specifiers instead of percent format

(UP031)


647-647: Use format specifiers instead of percent format

(UP031)


650-650: Use format specifiers instead of percent format

(UP031)

dpti/old_equi.py (3)

46-48: Switch to f-strings in these lines.
The static analysis hints indicate old-style %d or %f usage. While some lines may be false positives, confirming any existing % formatting is replaced with f-strings ensures consistency across the codebase.

Also applies to: 64-64, 106-106, 264-264, 281-281, 346-346

🧰 Tools
🪛 Ruff (0.8.2)

46-46: Use format specifiers instead of percent format

(UP031)


47-47: Use format specifiers instead of percent format

(UP031)


48-48: Use format specifiers instead of percent format

(UP031)


285-343: Consolidate _compute_thermo logic.
As noted in ti.py, _compute_thermo also appears here. Extracting it into a shared utility can reduce duplication and avoid potential drift between two implementations.


373-394: Enhance error handling for missing keys.
When you read j_file and look for "copies" or "is_water", consider graceful fallbacks or clearer error messages if they are absent. This can reduce potential crashes if the user forgets to define them.

dpti/workflows/analyze/build_gibbs_surface.py (4)

6-6: Use standard Python 3 built-ins for type hints.
The hints suggest switching from typing.Dict, typing.List, typing.Tuple to dict, list, and tuple to align with Python’s updated typing conventions.

-from typing import Dict, List, Optional, Tuple
+from typing import Optional
 ...
-def load_result_file(self, file_path: str, phase_name: Optional[str] = None) -> Dict:
+def load_result_file(self, file_path: str, phase_name: Optional[str] = None) -> dict:
 ...

Also applies to: 26-26, 82-82, 107-110, 272-273, 450-451

🧰 Tools
🪛 Ruff (0.8.2)

6-6: typing.Dict is deprecated, use dict instead

(UP035)


6-6: typing.List is deprecated, use list instead

(UP035)


6-6: typing.Tuple is deprecated, use tuple instead

(UP035)


456-456: Replace fullwidth punctuation with standard ASCII equivalents.
Some docstrings or comments use fullwidth characters like （, ）, ，, ：. Converting them to (, ), ,, : avoids potential parsing or rendering issues.

Also applies to: 533-533, 540-540, 557-559, 760-760

🧰 Tools
🪛 Ruff (0.8.2)

456-456: Docstring contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


456-456: Docstring contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


268-420: Consider performance fallback or error handling for griddata(…, method="cubic").
Cubic interpolation can fail or be slow if data points are sparse or in degenerate configurations. Adding a try-fallback to method="linear" or "nearest" can enhance robustness.
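
A possible fallback wrapper (argument names mirror scipy.interpolate.griddata; treating all-NaN output as a failure is our assumption):

import numpy as np
from scipy.interpolate import griddata

def robust_griddata(points, values, grid_x, grid_y):
    try:
        z = griddata(points, values, (grid_x, grid_y), method="cubic")
    except Exception:
        z = None
    if z is None or np.isnan(z).all():
        # degenerate triangulation or too-sparse data: degrade gracefully
        z = griddata(points, values, (grid_x, grid_y), method="linear")
    return z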

🧰 Tools
🪛 Ruff (0.8.2)

272-272: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


273-273: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


455-527: Avoid in-code plt.show() for library usage patterns.
Line 520 calls plt.show(), which is typically better suited for interactive usage. Libraries or production code often skip direct plt.show() to avoid unexpected blocking in scripts.
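
One common pattern is to return the figure and let the caller decide (out_png is a placeholder parameter, not an existing option):

import matplotlib.pyplot as plt

def plot_surface(data, out_png=None):
    fig, ax = plt.subplots()
    ax.plot(data)
    if out_png:
        fig.savefig(out_png, dpi=300, bbox_inches="tight")
    return fig  # interactive callers can still run plt.show() themselves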

🧰 Tools
🪛 Ruff (0.8.2)

456-456: Docstring contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


456-456: Docstring contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)

dpti/workflows/simulations/base.py (2)

40-41: Adopt modern built-in generics for type hints.

Python 3.9+ encourages using built-in types like list, dict, and type instead of List, Dict, and Type. This helps keep the codebase aligned with modern standards and avoids potential deprecation issues.

Below is an example diff to replace some occurrences (non-exhaustive):

- from typing import Dict, List, Optional, Type, Union
+ from typing import Optional, Union
...
- def some_function(inputs: List[int]) -> Dict[str, Any]:
+ def some_function(inputs: list[int]) -> dict[str, Any]:

- nodedata_type: Type[NodeDataType]
+ nodedata_type: type[NodeDataType]

- skip_steps: List[str]
+ skip_steps: list[str]

- prev_results: Optional[Dict[str, Any]] = None
+ prev_results: Optional[dict[str, Any]] = None

Also applies to: 66-67, 73-73, 190-191, 199-199, 201-201, 207-208, 282-282, 297-298, 316-318, 373-373

🧰 Tools
🪛 Ruff (0.8.2)

40-40: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


154-155: Replace prints with structured logging.

Throughout the file, multiple print statements report the simulation’s progress. For production workflows—especially those involving Prefect—using a logger (e.g. the Prefect logger or logging library) offers more flexible and standardized output, making it easier to filter, format, or redirect logs.

You could do something like:

from prefect import get_run_logger

logger = get_run_logger()
logger.info("note: is going to execute job: %s", self)

Also applies to: 299-312

dpti/workflows/service/workflow_service_module.py (1)

71-77: Optional: Add docstrings for provider methods.

Defining clear docstrings on the provide_* methods clarifies their purpose and usage within dependency injection. This reduces onboarding time for new contributors and makes the codebase more maintainable.

You could briefly describe each returned object or highlight any conditional logic used.

Also applies to: 94-114

dpti/workflows/service/workflow_decorator.py (1)

66-66: Use built-in type hints instead of Dict.

For consistency with Python 3.9+ style, replace Dict[str, Any] with dict[str, Any]. This modern style is recommended by Python’s typing guidelines.

- def task_input_json_hash(context: object, arguments: Dict[str, Any]) -> Optional[str]:
+ def task_input_json_hash(context: object, arguments: dict[str, Any]) -> Optional[str]:
🧰 Tools
🪛 Ruff (0.8.2)

66-66: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/service/file_handler.py (5)

7-7: Prefer importing Iterator from collections.abc
For Python 3.9+ and above, it’s recommended to use from collections.abc import Iterator instead of from typing import Iterator.

🧰 Tools
🪛 Ruff (0.8.2)

7-7: Import from collections.abc instead: Iterator

Import from collections.abc

(UP035)


7-7: typing.List is deprecated, use list instead

(UP035)


24-25: Adopt modern type annotations over typing.List
For Python 3.9 and later, you can use native type hints like list[str], list[int], etc., thus reducing reliance on typing.List.

Also applies to: 38-38, 80-80, 147-147

🧰 Tools
🪛 Ruff (0.8.2)

24-24: Use list instead of List for type annotation

Replace with list

(UP006)


25-25: Use list instead of List for type annotation

Replace with list

(UP006)


69-89: Consider calling setup_flow_running_dir in the constructor
Currently, setup_flow_running_dir is commented out, requiring extra calls before this handler becomes fully operational. Executing it within __init__ might prevent potential usage errors.

🧰 Tools
🪛 Ruff (0.8.2)

80-80: Use list instead of List for type annotation

Replace with list

(UP006)


137-145: Add error handling for existing symlinks in upload_file
If the target symlink already exists, os.symlink may raise a FileExistsError. Consider verifying or removing old symlinks before creating a new one.
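
A sketch of the check (names are ours; it assumes the link slot holds a file or symlink, not a directory):

import os

def force_symlink(target, link_path):
    # os.path.islink also catches broken symlinks, which os.path.exists misses
    if os.path.islink(link_path) or os.path.exists(link_path):
        os.remove(link_path)
    os.symlink(target, link_path)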


251-251: Use f-strings for clearer formatting
Improve readability by replacing percent-formatting:

- bk_dirname = dirname + ".bk%03d" % counter
+ bk_dirname = f"{dirname}.bk{counter:03d}"
🧰 Tools
🪛 Ruff (0.8.2)

251-251: Use format specifiers instead of percent format

(UP031)

dpti/lib/utils.py (4)

59-59: Favor f-strings over legacy % formatting
In Python 3+, you can upgrade the line below:

- bk_dirname = dirname + ".bk%03d" % counter
+ bk_dirname = f"{dirname}.bk{counter:03d}"
🧰 Tools
🪛 Ruff (0.8.2)

59-59: Use format specifiers instead of percent format

(UP031)


107-134: Potential memory overhead in block_avg
When inp is large, slicing into data chunks can temporarily use more memory. Consider a streaming solution if data sizes become substantial.
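
A streaming variant for reference (function name is ours; any trailing partial block is dropped, matching the usual block-average convention):

import numpy as np

def streaming_block_avg(samples, block_size):
    # Accumulate block means one value at a time instead of slicing arrays.
    block_means, acc, n = [], 0.0, 0
    for x in samples:
        acc += x
        n += 1
        if n == block_size:
            block_means.append(acc / block_size)
            acc, n = 0.0, 0
    means = np.asarray(block_means)
    return means.mean(), means.std(ddof=1) / np.sqrt(len(means))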


136-145: Add error handling for external converter
sp.check_call(cmd_line, shell=True) raises an exception if ovito_file_convert.py is missing or fails. Capture errors and provide a more descriptive message to users.
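
For example, wrapping the existing call (cmd_line stands for whatever command the caller builds today):

import subprocess as sp

cmd_line = "ovito_file_convert.py in.dump out.vasp"  # placeholder
try:
    sp.check_call(cmd_line, shell=True)
except sp.CalledProcessError as e:
    # with shell=True a missing script surfaces as exit code 127
    raise RuntimeError(
        f"ovito_file_convert.py failed with exit code {e.returncode}; "
        f"command was: {cmd_line}"
    ) from e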


423-452: Improve documentation for compute_nrefine
Add a docstring clarifying the shapes and requirements for all_t, integrand, and how the intervals are computed. This helps maintainability and reduces confusion.

dpti/equi.py (3)

52-54: Replace percent-formatting with f-strings
Improve clarity by switching to f-strings in lines where %d is used:

- ret += "variable        NSTEPS          equal %d\n" % nsteps
+ ret += f"variable        NSTEPS          equal {nsteps}"

Also applies to: 74-74, 153-153, 479-479, 497-497

🧰 Tools
🪛 Ruff (0.8.2)

52-52: Use format specifiers instead of percent format

(UP031)


53-53: Use format specifiers instead of percent format

(UP031)


54-54: Use format specifiers instead of percent format

(UP031)


153-155: Ensure reproducible random seeds
Seeding with np.random.default_rng().integers(1, 2**16) can vary each run. Consider allowing users to fix the seed for consistent simulation results.
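
A minimal sketch (the seed parameter is hypothetical; passing None preserves today's behavior):

import numpy as np

def velocity_seed(seed=None):
    # A fixed seed makes the generated LAMMPS velocity seed reproducible.
    return int(np.random.default_rng(seed).integers(1, 2**16))

# velocity_seed(12345) returns the same value on every run.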

🧰 Tools
🪛 Ruff (0.8.2)

153-153: Use format specifiers instead of percent format

(UP031)


305-334: Add docstrings or comments for npt_equi_conf
Document how this function calculates and applies block-averaged box dimensions. Clear guidance helps maintainers understand the approach to resizing the simulation cell.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0dac766 and 34b86be.

⛔ Files ignored due to path filters (11)
  • tests/benchmark_gdi/deepmd/0/thermo.out is excluded by !**/*.out
  • tests/benchmark_hti/one_step/task.000006/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/three_step/00.lj_on/task.000004/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/three_step/01.deep_on/task.000005/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/three_step/02.spring_off/task.000006/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/three_step_meam/00.lj_on/task.000004/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/three_step_meam/01.deep_on/task.000005/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/three_step_meam/02.spring_off/task.000006/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/two_step/00.deep_on/task.000005/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/two_step/00.deep_on/task.000006/lambda.out is excluded by !**/*.out
  • tests/benchmark_hti/two_step/01.spring_off/task.000006/lambda.out is excluded by !**/*.out
📒 Files selected for processing (85)
  • .cursor/rules/dpti-default-ai-rules.mdc (1 hunks)
  • .git_archival.txt (1 hunks)
  • .gitattributes (1 hunks)
  • .github/dependabot.yml (1 hunks)
  • .github/workflows/mirror_gitee.yml (1 hunks)
  • .github/workflows/release.yml (1 hunks)
  • .github/workflows/release_conda_and_github.yml (1 hunks)
  • .github/workflows/unittest.yml (1 hunks)
  • .gitignore (1 hunks)
  • .pre-commit-config.yaml (1 hunks)
  • .readthedocs.yaml (1 hunks)
  • .travis.yml (1 hunks)
  • .vscode/settings.json (1 hunks)
  • LICENSE (1 hunks)
  • README.md (1 hunks)
  • codecov.yml (1 hunks)
  • conda/construct.yaml (1 hunks)
  • conda/dpti/conda_build_config.yaml (1 hunks)
  • conda/dpti/meta.yaml (1 hunks)
  • docker/Dockerfile (1 hunks)
  • docker/README.md (1 hunks)
  • docker/docker.sh (1 hunks)
  • docker/entry.sh (1 hunks)
  • docs/.gitignore (1 hunks)
  • docs/Makefile (1 hunks)
  • docs/cli.rst (1 hunks)
  • docs/conf.py (1 hunks)
  • docs/credits.rst (1 hunks)
  • docs/examples/water.md (1 hunks)
  • docs/getting-started.md (1 hunks)
  • docs/index.rst (1 hunks)
  • docs/install.md (1 hunks)
  • docs/main.tex (12 hunks)
  • docs/make.bat (1 hunks)
  • docs/requirements.txt (1 hunks)
  • dpti/__init__.py (1 hunks)
  • dpti/dags/dp_ti_gdi.py (1 hunks)
  • dpti/dags/utils.py (1 hunks)
  • dpti/einstein.py (1 hunks)
  • dpti/equi.py (1 hunks)
  • dpti/gdi.py (1 hunks)
  • dpti/hti.py (1 hunks)
  • dpti/hti_ice.py (1 hunks)
  • dpti/hti_liq.py (1 hunks)
  • dpti/hti_water.py (1 hunks)
  • dpti/lib/RemoteJob.py (1 hunks)
  • dpti/lib/dump.py (1 hunks)
  • dpti/lib/lammps.py (1 hunks)
  • dpti/lib/lmp.py (1 hunks)
  • dpti/lib/ovito_file_convert.py (1 hunks)
  • dpti/lib/utils.py (1 hunks)
  • dpti/lib/vasp.py (1 hunks)
  • dpti/lib/water.py (2 hunks)
  • dpti/main.py (1 hunks)
  • dpti/main.tex (1 hunks)
  • dpti/mti.py (1 hunks)
  • dpti/old_equi.py (1 hunks)
  • dpti/relax.py (1 hunks)
  • dpti/ti.py (1 hunks)
  • dpti/ti_water.py (1 hunks)
  • dpti/workflows/__init__.py (1 hunks)
  • dpti/workflows/analyze/build_gibbs_surface.py (1 hunks)
  • dpti/workflows/analyze/compare_ti_plot.py (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/beta144xy.lmp (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/liquid128.lmp (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_beta144xy.yaml (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_long_liquid_1200K_10GPa_path_t_settings.yaml (1 hunks)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/wget_graph.pb.sh (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ice04.lmp (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/liq_water.lmp (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ti_ice04_150K_10000bar_settings.yaml (1 hunks)
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/wget_graph.pb.sh (1 hunks)
  • dpti/workflows/flows/base_flow.py (1 hunks)
  • dpti/workflows/flows/flow_settings.py (1 hunks)
  • dpti/workflows/flows/free_energy_ti_line.py (1 hunks)
  • dpti/workflows/prefect/prefect_task_hash.py (1 hunks)
  • dpti/workflows/service/di.py (1 hunks)
  • dpti/workflows/service/dpdispatcher_configs.py (1 hunks)
  • dpti/workflows/service/file_handler.py (1 hunks)
  • dpti/workflows/service/job_executor.py (1 hunks)
  • dpti/workflows/service/report_generator.py (1 hunks)
  • dpti/workflows/service/workflow_decorator.py (1 hunks)
  • dpti/workflows/service/workflow_service_module.py (1 hunks)
  • dpti/workflows/simulations/base.py (1 hunks)
  • dpti/workflows/simulations/equi_sim.py (1 hunks)
✅ Files skipped from review due to trivial changes (18)
  • docs/credits.rst
  • codecov.yml
  • docs/getting-started.md
  • docs/requirements.txt
  • conda/dpti/conda_build_config.yaml
  • .gitattributes
  • docs/cli.rst
  • .readthedocs.yaml
  • docs/examples/water.md
  • .github/dependabot.yml
  • .git_archival.txt
  • docs/Makefile
  • docs/index.rst
  • docs/.gitignore
  • .vscode/settings.json
  • LICENSE
  • .pre-commit-config.yaml
  • docs/main.tex
🚧 Files skipped from review as they are similar to previous changes (9)
  • dpti/workflows/examples/Sn_SCAN_high_pressure/liquid128.lmp
  • dpti/workflows/service/dpdispatcher_configs.py
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ice04.lmp
  • dpti/workflows/__init__.py
  • dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_beta144xy.yaml
  • dpti/workflows/examples/Sn_SCAN_high_pressure/beta144xy.lmp
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/ti_ice04_150K_10000bar_settings.yaml
  • dpti/workflows/examples/Water_Ice_Phase_Diagram/liq_water.lmp
  • dpti/workflows/service/report_generator.py
🧰 Additional context used
🪛 Shellcheck (0.10.0)
docker/entry.sh

[error] 4-4: Argument mixes string and array. Use * or separate argument.

(SC2145)

docker/docker.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

dpti/workflows/examples/Sn_SCAN_high_pressure/wget_graph.pb.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

dpti/workflows/examples/Water_Ice_Phase_Diagram/wget_graph.pb.sh

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

🪛 YAMLlint (1.35.1)
conda/construct.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

conda/dpti/meta.yaml

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)

🪛 Ruff (0.8.2)
dpti/workflows/service/di.py

10-10: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

dpti/workflows/flows/flow_settings.py

11-11: typing.Type is deprecated, use type instead

(UP035)


23-23: Use type instead of Type for type annotation

Replace with type

(UP006)


83-83: Use is and is not for type comparisons, or isinstance() for isinstance checks

(E721)


91-91: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

dpti/dags/dp_ti_gdi.py

5-5: typing.Dict is deprecated, use dict instead

(UP035)


33-33: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


38-38: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/lib/dump.py

130-130: Use format specifiers instead of percent format

(UP031)

dpti/hti_water.py

143-143: Use format specifiers instead of percent format

(UP031)


144-144: Use format specifiers instead of percent format

(UP031)


145-145: Use format specifiers instead of percent format

(UP031)


159-159: Use format specifiers instead of percent format

(UP031)


162-162: Use format specifiers instead of percent format

(UP031)


190-190: Use format specifiers instead of percent format

(UP031)


256-256: Use format specifiers instead of percent format

(UP031)


846-846: Use format specifiers instead of percent format

(UP031)


847-847: Use format specifiers instead of percent format

(UP031)

dpti/gdi.py

244-244: Use format specifiers instead of percent format

(UP031)


245-245: Use format specifiers instead of percent format

(UP031)


254-254: Use format specifiers instead of percent format

(UP031)


261-261: Use format specifiers instead of percent format

(UP031)

dpti/workflows/flows/free_energy_ti_line.py

6-11: typing.Dict is deprecated, use dict instead

(UP035)


77-77: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


97-97: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


165-165: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/lib/RemoteJob.py

263-263: Use format specifiers instead of percent format

(UP031)


301-301: Use format specifiers instead of percent format

(UP031)


352-352: Use format specifiers instead of percent format

(UP031)


353-353: Use format specifiers instead of percent format

(UP031)


356-356: Use format specifiers instead of percent format

(UP031)


364-364: Use format specifiers instead of percent format

(UP031)


440-440: Use format specifiers instead of percent format

(UP031)


482-482: Use format specifiers instead of percent format

(UP031)


487-487: Use format specifiers instead of percent format

(UP031)


522-522: Use format specifiers instead of percent format

(UP031)

dpti/workflows/analyze/build_gibbs_surface.py

6-6: typing.Dict is deprecated, use dict instead

(UP035)


6-6: typing.List is deprecated, use list instead

(UP035)


6-6: typing.Tuple is deprecated, use tuple instead

(UP035)


26-26: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


82-82: Use list instead of List for type annotation

Replace with list

(UP006)


82-82: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


107-107: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


108-108: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


110-110: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


272-272: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


273-273: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


450-450: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


451-451: Use tuple instead of Tuple for type annotation

Replace with tuple

(UP006)


456-456: Docstring contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


456-456: Docstring contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


533-533: Docstring contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF002)


540-540: String contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF001)


557-557: Comment contains ambiguous ： (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


557-557: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


558-558: Comment contains ambiguous ： (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


558-558: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


559-559: Comment contains ambiguous ： (FULLWIDTH COLON). Did you mean : (COLON)?

(RUF003)


559-559: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


760-760: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)

dpti/mti.py

53-53: Use format specifiers instead of percent format

(UP031)


54-54: Use format specifiers instead of percent format

(UP031)


55-55: Use format specifiers instead of percent format

(UP031)


56-56: Use format specifiers instead of percent format

(UP031)


70-70: Use format specifiers instead of percent format

(UP031)


72-72: Use format specifiers instead of percent format

(UP031)


197-197: Use format specifiers instead of percent format

(UP031)


210-210: Use format specifiers instead of percent format

(UP031)


215-215: Use format specifiers instead of percent format

(UP031)


266-266: Use format specifiers instead of percent format

(UP031)

dpti/hti_liq.py

28-28: Use format specifiers instead of percent format

(UP031)


179-179: Use format specifiers instead of percent format

(UP031)


180-180: Use format specifiers instead of percent format

(UP031)


181-181: Use format specifiers instead of percent format

(UP031)


196-196: Use format specifiers instead of percent format

(UP031)


199-199: Use format specifiers instead of percent format

(UP031)


231-231: Use format specifiers instead of percent format

(UP031)


313-313: Use format specifiers instead of percent format

(UP031)


516-516: Use format specifiers instead of percent format

(UP031)

dpti/relax.py

15-15: Use format specifiers instead of percent format

(UP031)


16-16: Use format specifiers instead of percent format

(UP031)


27-27: Use format specifiers instead of percent format

(UP031)

dpti/lib/lmp.py

157-157: Use format specifiers instead of percent format

(UP031)


177-177: Use format specifiers instead of percent format

(UP031)


178-178: Use format specifiers instead of percent format

(UP031)


191-191: Use format specifiers instead of percent format

(UP031)

dpti/workflows/simulations/base.py

10-21: typing.Dict is deprecated, use dict instead

(UP035)


10-21: typing.List is deprecated, use list instead

(UP035)


10-21: typing.Type is deprecated, use type instead

(UP035)


40-40: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


73-73: Use list instead of List for type annotation

Replace with list

(UP006)


190-190: Use list instead of List for type annotation

Replace with list

(UP006)


191-191: Use list instead of List for type annotation

Replace with list

(UP006)


199-199: Use list instead of List for type annotation

Replace with list

(UP006)


199-199: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


201-201: Use type instead of Type for type annotation

Replace with type

(UP006)


207-207: Use type instead of Type for type annotation

Replace with type

(UP006)


208-208: Use type instead of Type for type annotation

Replace with type

(UP006)


282-282: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


297-297: Use list instead of List for type annotation

Replace with list

(UP006)


316-316: Use list instead of List for type annotation

Replace with list

(UP006)


317-317: Use list instead of List for type annotation

Replace with list

(UP006)


318-318: Use list instead of List for type annotation

Replace with list

(UP006)


373-373: Use type instead of Type for type annotation

Replace with type

(UP006)


373-373: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/simulations/equi_sim.py

2-2: typing.Dict is deprecated, use dict instead

(UP035)


2-2: typing.Type is deprecated, use type instead

(UP035)


20-20: Redefinition of unused SettingsBase from line 16

(F811)


20-20: Redefinition of unused SimulationBase from line 16

(F811)


20-20: Redefinition of unused workflow_task from line 16

(F811)


88-88: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


89-89: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


105-105: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


110-110: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


112-112: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


115-115: Use type instead of Type for type annotation

Replace with type

(UP006)


137-137: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


174-174: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


192-192: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


268-268: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


276-276: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


280-280: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


282-282: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


284-284: Use type instead of Type for type annotation

Replace with type

(UP006)


296-296: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


347-347: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/old_equi.py

46-46: Use format specifiers instead of percent format

(UP031)


47-47: Use format specifiers instead of percent format

(UP031)


48-48: Use format specifiers instead of percent format

(UP031)


64-64: Use format specifiers instead of percent format

(UP031)


106-106: Use format specifiers instead of percent format

(UP031)


264-264: Use format specifiers instead of percent format

(UP031)


281-281: Use format specifiers instead of percent format

(UP031)


346-346: Use format specifiers instead of percent format

(UP031)

dpti/lib/water.py

96-96: Use format specifiers instead of percent format

(UP031)


101-101: Use format specifiers instead of percent format

(UP031)


129-129: Use format specifiers instead of percent format

(UP031)


131-131: Use format specifiers instead of percent format

(UP031)


139-139: Use format specifiers instead of percent format

(UP031)


159-159: Use format specifiers instead of percent format

(UP031)


160-160: Use format specifiers instead of percent format

(UP031)


164-164: Use format specifiers instead of percent format

(UP031)


165-165: Use format specifiers instead of percent format

(UP031)


182-182: Use format specifiers instead of percent format

(UP031)

dpti/workflows/service/workflow_service_module.py

16-16: Redefinition of unused IOHandler from line 9

(F811)


16-16: Redefinition of unused LocalFileHandler from line 9

(F811)


17-17: Redefinition of unused DpdispatcherExecutor from line 10

(F811)


17-17: Redefinition of unused JobExecutor from line 10

(F811)


18-18: Redefinition of unused PrefectReportGenerator from line 12

(F811)


18-18: Redefinition of unused ReportGenerator from line 13

(F811)

dpti/workflows/service/workflow_decorator.py

5-5: typing.Dict is deprecated, use dict instead

(UP035)


66-66: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/service/file_handler.py

7-7: Import from collections.abc instead: Iterator

Import from collections.abc

(UP035)


7-7: typing.List is deprecated, use list instead

(UP035)


24-24: Use list instead of List for type annotation

Replace with list

(UP006)


25-25: Use list instead of List for type annotation

Replace with list

(UP006)


38-38: Use list instead of List for type annotation

Replace with list

(UP006)


38-38: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


80-80: Use list instead of List for type annotation

Replace with list

(UP006)


147-147: Use list instead of List for type annotation

Replace with list

(UP006)


147-147: Use list instead of List for type annotation

Replace with list

(UP006)


251-251: Use format specifiers instead of percent format

(UP031)

dpti/ti.py

32-32: Use format specifiers instead of percent format

(UP031)


68-68: Use format specifiers instead of percent format

(UP031)


69-69: Use format specifiers instead of percent format

(UP031)


70-70: Use format specifiers instead of percent format

(UP031)


86-86: Use format specifiers instead of percent format

(UP031)


89-89: Use format specifiers instead of percent format

(UP031)


134-134: Use format specifiers instead of percent format

(UP031)


245-245: Use format specifiers instead of percent format

(UP031)


494-494: Use format specifiers instead of percent format

(UP031)


630-630: Use format specifiers instead of percent format

(UP031)


633-633: Use format specifiers instead of percent format

(UP031)


647-647: Use format specifiers instead of percent format

(UP031)


650-650: Use format specifiers instead of percent format

(UP031)


717-717: Use format specifiers instead of percent format

(UP031)


801-801: Use format specifiers instead of percent format

(UP031)


808-808: Use format specifiers instead of percent format

(UP031)

dpti/equi.py

52-52: Use format specifiers instead of percent format

(UP031)


53-53: Use format specifiers instead of percent format

(UP031)


54-54: Use format specifiers instead of percent format

(UP031)


74-74: Use format specifiers instead of percent format

(UP031)


153-153: Use format specifiers instead of percent format

(UP031)


479-479: Use format specifiers instead of percent format

(UP031)


497-497: Use format specifiers instead of percent format

(UP031)

dpti/hti.py

32-32: Use format specifiers instead of percent format

(UP031)


346-346: Use format specifiers instead of percent format

(UP031)


347-347: Use format specifiers instead of percent format

(UP031)


348-348: Use format specifiers instead of percent format

(UP031)


366-366: Use format specifiers instead of percent format

(UP031)


369-369: Use format specifiers instead of percent format

(UP031)


416-416: Use format specifiers instead of percent format

(UP031)


430-430: Use format specifiers instead of percent format

(UP031)


765-765: Use format specifiers instead of percent format

(UP031)


1197-1197: Use format specifiers instead of percent format

(UP031)

dpti/workflows/analyze/compare_ti_plot.py

117-117: Do not use bare except

(E722)


332-332: typing.Dict is deprecated, use dict instead

(UP035)


332-332: typing.List is deprecated, use list instead

(UP035)


336-336: Use list instead of List for type annotation

Replace with list

(UP006)


336-336: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


337-337: Use list instead of List for type annotation

Replace with list

(UP006)


337-337: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


340-340: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


399-399: Comment contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


399-399: Comment contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


434-434: Docstring contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


434-434: Docstring contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


465-465: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous ， (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous （ (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


473-473: Comment contains ambiguous ） (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)

dpti/lib/utils.py

59-59: Use format specifiers instead of percent format

(UP031)

docs/conf.py

23-23: Use format specifiers instead of percent format

(UP031)

dpti/__init__.py

4-15: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

dpti/workflows/prefect/prefect_task_hash.py

3-3: typing.Dict is deprecated, use dict instead

(UP035)


60-60: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

🪛 LanguageTool
README.md

[grammar] ~29-~29: Uncountable nouns are usually not used with an indefinite article. Use simply “Deep Potential Water”.
Context: ...ter and ice see: [PRL:Phase Diagram of a Deep Potential Water Model](https://journals.aps.org/prl/abs...

(A_UNCOUNTABLE)


[uncategorized] ~82-~82: The preposition ‘of’ seems more likely in this position.
Context: ...ion files ### HTI This is an example for HTI three-step simulations. ``` cd exa...

(AI_HYDRA_LEO_REPLACE_FOR_OF)


[style] ~92-~92: Consider a shorter alternative to avoid wordiness.
Context: ...step ``` ### TI For temperature 200K, in order to generate integration path pressure from...

(IN_ORDER_TO_PREMIUM)


[style] ~107-~107: Consider a shorter alternative to avoid wordiness.
Context: ...examples/ti/ dpti ti gen ti.t.json ``` In order to generate TI path changing temperature, ...

(IN_ORDER_TO_PREMIUM)


[grammar] ~135-~135: The singular determiner ‘this’ may not agree with the plural noun ‘docs’. Did you mean “these”?
Context: ...-airflow workflow framework. >we refer this docs [airflow official docs](https://ai...

(THIS_NNS)


[style] ~167-~167: Write the name of the item after ‘and’ or use only “etc”.
Context: ... force field,Lammps simulation ensemble and etc. we implement a workflow called TI_tas...

(AND_ETC)


[grammar] ~219-~219: There seems to be a noun/verb agreement error. Did you mean “clones” or “cloned”?
Context: ... ### install dpti and dpdispatcher. git clone the following packages and install.<br ...

(SINGULAR_NOUN_VERB_AGREEMENT)


[grammar] ~360-~360: The usual collocation for technology is “on”, not “in”.
Context: ...ork well, we could type the ip and prot in the web Browser and use the web service to moni...

(IN_THE_INTERNET)


[style] ~389-~389: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ... metal can be various. For example, we want to calculate the free energy of metal Sn o...

(REP_WANT_TO_VB)


[uncategorized] ~389-~389: Possible missing comma found.
Context: ...y of metal Sn of bcc structure at 200 K and 50000 bar (5GPa). In order to caculate ...

(AI_HYDRA_LEO_MISSING_COMMA)


[uncategorized] ~389-~389: Possible missing comma found.
Context: ...200 K and 50000 bar (5GPa). In order to caculate the per atom free energy of metal Sn. F...

(AI_HYDRA_LEO_MISSING_COMMA)


[typographical] ~389-~389: Consider adding a comma after ‘Usually’ for more clarity.
Context: ...ir "ensemble" for lammps MD simulation. Usually the ensemble shoule be consistent with ...

(RB_LY_COMMA)


[uncategorized] ~391-~391: “ti” (solmization) seems less likely than “to
Context: ...ltaneously in x, y, z directions. #### ti-path json Modify the ti.t.json or ti.p....

(AI_HYDRA_LEO_CPT_TI_TO)


[uncategorized] ~392-~392: “ti” (solmization) seems less likely than “to
Context: ...rections. #### ti-path json Modify the ti.t.json or ti.p.json, and change the key...

(AI_HYDRA_LEO_CPT_TI_TO)


[grammar] ~395-~395: Consider using either the past participle “written” or the present participle “writing” here.
Context: ...rflow/stable/concepts.html).This dag is wrote and maintained by the dpti software dev...

(BEEN_PART_AGREEMENT)


[typographical] ~396-~396: Consider adding a comma after ‘Usually’ for more clarity.
Context: .... 2. Wait until the calculation finish. Usually the whole procedure continues for about...

(RB_LY_COMMA)


[uncategorized] ~409-~409: Possible missing comma found.
Context: ...find the crossing point.
For HTI simulation the result will locate at `/home/fengbo...

(AI_HYDRA_LEO_MISSING_COMMA)


[grammar] ~421-~421: This phrase is duplicated. You should probably use “thermodynamic integration” only once.
Context: ...ion 1. NVT MD simulation 1. Hamiltonian thermodynamic integration 4. thermodynamic integration <a name="63776a50b2f742b9a3074845752...

(PHRASE_REPETITION)


[uncategorized] ~428-~428: Use a comma before ‘and’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... NPT simulation Run a long MD simulation and we will get the lattice constant and th...

(COMMA_COMPOUND_SENTENCE)


[uncategorized] ~459-~459: “ti” (solmization) seems less likely than “to
Context: ..._Sn" | see note1. work  directory.  | | ti_path | "t" or "p" | "t" | thermodynamic...

(AI_HYDRA_LEO_CPT_TI_TO)


[uncategorized] ~460-~460: It seems likely that a singular genitive (’s) apostrophe is missing.
Context: ...p | string | "bct.lmp" | see note1. the materials structure to be calculated | | ens | st...

(AI_HYDRA_LEO_APOSTROPHE_S_XS)


[grammar] ~530-~530: It seems that a pronoun is missing.
Context: ...n method | | langevin |  bool |  true | whether use langevin  thermostat  | | nsteps | ...

(IF_VB)


[grammar] ~561-~561: It seems that a pronoun is missing.
Context: ...n method | | langevin |  bool |  true | whether use langevin  thermostat  | | nsteps | ...

(IF_VB)


[grammar] ~644-~644: Did you mean “be treated”?
Context: ... the 100 point value in each chunk will treated as chunk value. And the 490 chunk value...

(WILL_BASED_ON)


[uncategorized] ~706-~706: Possible missing article found.
Context: ... changing from 0 to 1, the structure of system should not change dramatically. (like m...

(AI_HYDRA_LEO_MISSING_THE)


[uncategorized] ~708-~708: Possible missing article found.
Context: ...ue of this system could also be used as indicator. ### integration path: For thermodyna...

(AI_HYDRA_LEO_MISSING_AN)


[uncategorized] ~719-~719: Possible missing comma found.
Context: ...d check the result consistence. If the result errors lie in about 1meV/per atom. (may...

(AI_HYDRA_LEO_MISSING_COMMA)

docker/README.md

[grammar] ~17-~17: After the number ‘8080’, use a plural noun. Did you mean “ports”?
Context: ...s 'dpti' note: 9999:8000 means map 8080 port in the docker container to the 9999 por...

(CD_NNU)

docs/install.md

[grammar] ~4-~4: Did you mean “be installed”?
Context: ...all DPTI ==================== DPTI can installed by pip: bash pip install dpti ...

(WILL_BASED_ON)

🪛 markdownlint-cli2 (0.17.2)
README.md

54-54: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


58-58: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


63-63: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


71-71: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


84-84: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


103-103: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


109-109: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


121-121: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


127-127: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


143-143: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


149-149: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


155-155: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


176-176: Trailing punctuation in heading
Punctuation: '.'

(MD026, no-trailing-punctuation)


191-191: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


201-201: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


202-202: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


218-218: Trailing punctuation in heading
Punctuation: '.'

(MD026, no-trailing-punctuation)


231-231: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


237-237: Multiple spaces after hash on atx style heading
null

(MD019, no-multiple-space-atx)


237-237: Trailing punctuation in heading
Punctuation: '.'

(MD026, no-trailing-punctuation)


329-329: Multiple spaces after hash on atx style heading
null

(MD019, no-multiple-space-atx)


341-341: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


350-350: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)


374-374: Spaces inside code span elements
null

(MD038, no-space-in-code)


389-389: Link fragments should be valid
null

(MD051, link-fragments)


389-389: Link fragments should be valid
null

(MD051, link-fragments)


395-395: Link fragments should be valid
null

(MD051, link-fragments)


397-397: Link fragments should be valid
null

(MD051, link-fragments)


639-639: Multiple spaces after hash on atx style heading
null

(MD019, no-multiple-space-atx)


639-639: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


702-702: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


704-704: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


710-710: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


721-721: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)

🪛 Hadolint (2.12.0)
docker/Dockerfile

[error] 7-7: Do not use sudo as it leads to unpredictable behavior. Use a tool like gosu to enforce root

(DL3004)


[error] 13-13: Do not use sudo as it leads to unpredictable behavior. Use a tool like gosu to enforce root

(DL3004)


[error] 36-36: Do not use sudo as it leads to unpredictable behavior. Use a tool like gosu to enforce root

(DL3004)


[error] 47-47: Do not use sudo as it leads to unpredictable behavior. Use a tool like gosu to enforce root

(DL3004)


[error] 50-50: Do not use sudo as it leads to unpredictable behavior. Use a tool like gosu to enforce root

(DL3004)

🔇 Additional comments (51)
docs/make.bat (1)

1-35: LGTM: Standard Sphinx documentation build script implementation

The batch script follows the standard Sphinx template for building documentation on Windows. It properly handles directory navigation, environment variable checks, error handling, and provides helpful error messages. This is a good addition that complements the documentation efforts mentioned in the summary.

.github/workflows/mirror_gitee.yml (1)

1-14: Mirror to Gitee Workflow Configuration

This workflow is clearly defined with appropriate event triggers (push, delete, and create) and uses concurrency grouping (group: git-mirror) to avoid overlapping tasks. The secret SYNC_GITEE_PRIVATE_KEY is properly referenced. No changes are needed here.

.gitignore (1)

1-24: .gitignore Enhancements

The updated .gitignore covers many build artifacts, temporary files, and log files, and it deliberately “unignores” specific benchmark task files. Please double-check that the negation patterns (using “!”) capture all intended benchmark files.

conda/dpti/meta.yaml (2)

1-3: Confirm Jinja templating compatibility with conda-build.
The Jinja templating syntax is valid for conda-build recipes, but some linters (like yamllint) may flag the % character as an error. Verify that you are using conda-build or mambabuild for rendering and not a plain YAML parser.

Would you like me to run a build check script to confirm that the templating logic is recognized and error-free under conda-build?

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 1-1: syntax error: found character '%' that cannot start any token

(syntax)


16-26: Dependency pinning awareness.
You're pinning a variety of python libraries (e.g., dpdata, dpdispatcher). Verify that these library versions are compatible with each other, and confirm they have no known conflicts or vulnerabilities.

dpti/einstein.py (4)

17-29: Validate physical constants usage in compute_lambda.
The function correctly calculates the thermal de Broglie wavelength, but confirm that the chosen units (Boltzmann constant in J/K, Planck in J·s, etc.) align with your mass units (in kg/mol?). There's a mix of Avogadro scaling and SI conversions. Carefully verify all unit conversions to ensure accurate results.
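As a quick cross-check, the wavelength can be recomputed independently from scipy.constants. A minimal sketch, assuming the mass map is given in g/mol (adjust to the module's actual convention):

import numpy as np
import scipy.constants as pc

def thermal_de_broglie(mass_g_mol: float, temp: float) -> float:
    """Return the thermal de Broglie wavelength in meters for a per-particle mass in g/mol."""
    m_kg = mass_g_mol * 1e-3 / pc.Avogadro  # g/mol -> kg per particle
    return pc.h / np.sqrt(2.0 * np.pi * m_kg * pc.Boltzmann * temp)

# Sanity check: argon (~39.95 g/mol) at 300 K gives roughly 0.16 angstrom.
print(thermal_de_broglie(39.95, 300.0) / pc.angstrom)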


31-36: Check dimension consistency in compute_spring.
You're scaling spring_k by pc.electron_volt / (pc.angstrom^2) and factoring in temperature. Ensure that the final dimension across these multiplications results in a dimensionally correct value for the spring constant in this context.


74-138: Critical validation of spring constants in free_energy.
You transform a single scalar spring_k into a list m_spring_k by multiplying each mass element. This is plausible if each atom type has a separate spring constant scaled by its mass, but confirm this is scientifically intended. Also, ensure that the chosen pivot atom (first_type) is correct for partial contributions.


140-202: Check for correct Frenkel model assumptions.
Frenkel free energy calculations often depend on assumptions about reference states, the Einstein crystal approach, and how multi-atomic or multi-type systems contribute. Double-check that subtracting 3.0 * np.log(s_Lambda_s) and 1.5 * np.log(sum_m) is valid for your system’s physical conditions.

dpti/lib/dump.py (1)

84-94: Zero-element tilt vector.
By default, you initialize tilt to zero. Confirm that reading from blk[dd] always provides three components for tilt; if the dump is non-triclinic, typically the third offset might be zero or unused. This might cause inconsistent box calculations if the file format is partial.

dpti/lib/RemoteJob.py (4)

21-45: Default resource values.
The _set_default_resource function sets typical HPC constraints, but you store negative memory as a sentinel (-1) and keep empty strings for partition, account, etc. Double-check that HPC environments interpret these defaults in a benign manner.


356-370: Confirm HPC job constraints.
The script emits multiple directive lines for constraints, licenses, and excludes. Be sure these features match the target Slurm environment; misusing constraints or licenses can block job submissions.

🧰 Tools
🪛 Ruff (0.8.2)

356-356: Use format specifiers instead of percent format

(UP031)


364-364: Use format specifiers instead of percent format

(UP031)


482-491: PBS resource specification.
Here you conditionally add ngpus to the select line. Validate that your PBS environment recognizes ngpus. Some HPC centers use different syntax (e.g., gpu_type).

🧰 Tools
🪛 Ruff (0.8.2)

482-482: Use format specifiers instead of percent format

(UP031)


487-487: Use format specifiers instead of percent format

(UP031)


522-526: MPI command for PBS.
The script launches mpirun -machinefile $PBS_NODEFILE. This is typical, but some clusters prefer an alternative launcher such as aprun or mpiexec. Confirm that $PBS_NODEFILE is the correct approach for your environment.

🧰 Tools
🪛 Ruff (0.8.2)

522-522: Use format specifiers instead of percent format

(UP031)

dpti/lib/vasp.py (2)

105-119: LGTM! Good use of f-string formatting

The updated code uses modern f-string formatting for numeric values, which is a good practice for readability and maintainability.


137-137: LGTM! Proper use of f-string

The error message has been updated to use f-strings, which is a good improvement for code readability.

.cursor/rules/dpti-default-ai-rules.mdc (1)

12-62: Consider language consistency

The file mixes Chinese and English content, which may create challenges for team members who don't read both languages.

Consider making this document available in a consistent language or providing translations for better accessibility across the team.

dpti/lib/ovito_file_convert.py (4)

1-5: Good use of a descriptive shebang and docstring.

The introductory comments clearly establish the script’s origin and purpose, which helps maintain readability. No actionable concerns here.


14-23: Argparse usage is clear and straightforward.

Argument names are descriptive, and the help text is sufficient. This ensures users can effectively parse input and output parameters.


41-49: Defining columns only for the LAMMPS dump format is clean and modular.

This selective approach prevents extraneous columns from being included for other formats. Great job keeping the logic concise.


51-55: Exporting the file is straightforward.

The code properly handles the conditional presence of columns. The overall flow from import to export is well-structured, with minimal overhead.

dpti/main.tex (2)

1-5: Latex preamble is well-structured and includes necessary packages.

No obvious issues or missing dependencies. Everything needed for a complex mathematical document appears to be included.


40-428: Document structure, derivations, and references appear consistent.

The mathematical details for partition functions, free energies, and constraints are systematically organized. The theoretical framework is coherent for an Einstein crystal and extended use cases (e.g., water systems). Nicely done.

dpti/dags/dp_ti_gdi.py (4)

5-5: Consider removing Dict import in favor of dict.

In newer Python versions, typing.Dict is deprecated. Replace it with built-in generic syntax.

Apply this diff to modernize the import:

-from typing import ClassVar, Dict
+from typing import ClassVar

Then adjust references accordingly.

🧰 Tools
🪛 Ruff (0.8.2)

5-5: typing.Dict is deprecated, use dict instead

(UP035)


66-99: Robust error handling for main loop execution.

Good use of retry logic in combination with reading JSON configurations and verifying data. This helps ensure resilience to transient issues when reading or parsing.


131-154: Well-structured approach to task creation and submission handling.

The code loads Machine, Resources, and Submission from dictionaries, ensuring a clean separation of concerns. No major issues here.


170-198: DAG run state monitoring is useful.

The wait_until_end method systematically checks for completion or failure states, providing timely feedback on DAG status. This is a neat approach to synchronous DAG tracking.
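For reference, the core of such synchronous tracking is a bounded polling loop. A minimal sketch, assuming a hypothetical client.get_state(run_id) accessor rather than the actual Airflow client API:

import time

TERMINAL_STATES = {"success", "failed"}

def wait_until_end(client, run_id: str, interval: float = 30.0, timeout: float = 86400.0) -> str:
    """Poll the run state until it is terminal or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = client.get_state(run_id)  # hypothetical accessor
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish within {timeout} s")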

docs/conf.py (3)

71-90: LGTM: Good implementation of API documentation generation

The run_apidoc function properly sets up the paths and options for generating API documentation using Sphinx's apidoc extension.


92-93: LGTM: Good connection to Sphinx build process

The setup function correctly connects the run_apidoc function to the Sphinx "builder-inited" event, ensuring API documentation is generated during the build process.


96-99: LGTM: Well-configured intersphinx mapping

The intersphinx mapping is correctly set up to link to Python and dargs documentation.

.github/workflows/release_conda_and_github.yml (1)

3-13: LGTM: Well-structured workflow triggers

The workflow is properly configured to be triggered both manually and on tag pushes, which is ideal for a release workflow.

docker/Dockerfile (1)

67-70:

✅ Verification successful

Verify entry.sh file is included in the build context

The Dockerfile copies an entry.sh script that should be in the build context. Ensure this file exists and is correctly set up to start PostgreSQL before running Airflow commands.


🏁 Script executed:

#!/bin/bash
# Check if entry.sh exists in the repository
if [ -f "docker/entry.sh" ]; then
  echo "entry.sh found in docker directory"
  cat docker/entry.sh
else
  echo "entry.sh not found in docker directory"
  # Search for it in the repository
  find . -name "entry.sh" -type f
fi

Length of output: 554


Verified: entry.sh exists in the proper build context and contains the correct startup logic.

After running the verification script, we confirmed that docker/entry.sh is present and includes the commands to start PostgreSQL (via service postgresql start and service postgresql status) before launching the Airflow scheduler and webserver using gosu. Please ensure that the Docker build command is executed with a context that correctly resolves this file (typically the docker directory if using this Dockerfile as-is) and that the file has executable permissions.

README.md (1)

219-219: Address grammar and verb agreement.

Line 219: “git clone the following packages and install.” → Replace with “Clone the following packages with git and install them.”
Line 395: “This dag is wrote and maintained ...” → Replace with “This DAG is written and maintained ...”

Please confirm you’ve updated the wording in these lines to maintain proper tense and clarity.

Also applies to: 395-395

🧰 Tools
🪛 LanguageTool

[grammar] ~219-~219: There seems to be a noun/verb agreement error. Did you mean “clones” or “cloned”?
Context: ... ### install dpti and dpdispatcher. git clone the following packages and install.<br ...

(SINGULAR_NOUN_VERB_AGREEMENT)

dpti/workflows/prefect/prefect_task_hash.py (1)

47-51: Confirm no sensitive data is leaked in debug prints.

Since hash_objects prints entire args/kwargs, confirm that no sensitive PII or credentials could appear. If there's a risk, either sanitize the data or remove prints.
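If the prints are kept, a small recursive masker can scrub obviously sensitive keys before logging. A sketch with illustrative key names (not taken from dpti):

SENSITIVE_KEYS = {"password", "token", "private_key", "api_key"}

def sanitize(obj):
    """Recursively mask values whose keys look sensitive before printing."""
    if isinstance(obj, dict):
        return {
            k: "***" if k.lower() in SENSITIVE_KEYS else sanitize(v)
            for k, v in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return type(obj)(sanitize(v) for v in obj)
    return obj

print(sanitize({"user": "alice", "token": "abc123", "args": [1, 2]}))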

dpti/workflows/examples/Sn_SCAN_high_pressure/ti_line_long_liquid_1200K_10GPa_path_t_settings.yaml (3)

15-15: Verify if 1,000,000 steps is feasible.

One million steps at a 0.002 ps (2 fs) timestep can be very time-consuming in some environments. Ensure resources (e.g., HPC) can handle this run length or consider checkpoint controls.


70-75: Check the interplay of these HTI parameters for correctness.

sigma_0_0, epsilon, activation, etc., can drastically affect simulation. Confirm they are intended for the system at 1200K and 10GPa.


96-96: Validate the temperature/pressure sequences.

["200:1200:20", "1200"] for temp_seq and ["0:100000:1000", "100000"] in pres_seq must align with the simulation or post-processing steps. Verify correct usage upstream.

dpti/workflows/service/job_executor.py (1)

96-100: Confirm that all necessary files are included in the forward_files array.

You list in.lammps, *lmp, graph.pb. If you rely on additional scripts or data, add them here or handle them externally.

dpti/relax.py (1)

84-88: Handle missing “Loop time of” marker more robustly.

If the "Loop time of" marker does not appear in the log, idx remains undefined, leading to a potential error. Consider explicitly verifying the line was found before using idx.

for ii in range(len(lines)):
    if "Loop time of" in lines[ii]:
        idx = ii - 1
        break
+ else:
+     raise RuntimeError("Unable to find 'Loop time of' in log.lammps file.")
res = [float(jj) for jj in lines[idx].split()]
return res[1], res[2]
dpti/hti_water.py (2)

88-119: Double-check correctness of partial bond deactivation in _ff_bond_angle_off.
In _ff_bond_angle_off, bond and angle parameters are scaled by (1 - lamb), while the LJ epsilon uses the inverted factor. Confirm that these transformations reflect the intended stepwise “turn off” of bonds/angles while LJ remains partially scaled. An inline comment clarifying the logic would be beneficial.


437-496: Check division by zero or invalid array indexing in _post_tasks.
The lines that compute de = all_ag_a / all_lambda + all_dp_a and similarly for bond_angle_off rely on denominators like all_lambda or (1 - all_lambda). If any lambda is 0 or 1, this could cause divisions by zero. Verify that protective offsets (e.g., protect_eps) indeed prevent zero denominators.
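A minimal defensive sketch is to clip lambda away from the endpoints before dividing, assuming a small protect_eps like the one used elsewhere in the module:

import numpy as np

protect_eps = 1e-6  # assumed value; match the module's actual constant

all_lambda = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
safe_lambda = np.clip(all_lambda, protect_eps, 1.0 - protect_eps)
# de = all_ag_a / safe_lambda + all_dp_a  # no longer risks division by zero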

dpti/workflows/flows/free_energy_ti_line.py (2)

304-349: Refactor _execute_impl() for better clarity.

This method is orchestrating multiple simulation steps (NPT, NVT, HTI, and TI) in a single block. Breaking each phase into separate helper methods or tasks would improve readability and maintainability.


342-342: Verify dictionary access for free_energy_ref_value_point.

Accessing hti_r["free_energy_ref_value_point"] assumes this key always exists. Confirm that HTISimulation reliably provides such a key or add defensive checks to ensure safe dictionary operations.

dpti/hti.py (3)

160-207: Validate if_meam configurations in _ff_lj_off.
When if_meam=True, the code sets up a hybrid/overlay but does not verify the correctness of meam_model content (e.g., "library" key). Consider adding assertions or checks to avoid cryptic failures if meam_model is malformed.


525-631: Confirm file linking and duplication in make_tasks.
When switching from one step to another, the code copies or links models and configuration files. Verify that no extraneous copies linger and that the use of relative_link_file consistently yields correct file references in each subtask without duplication.


1289-1371: Evaluate integration edge cases in compute_task.
When manual_pv or npt is used, the final Gibbs free energy calculation merges multiple error terms. Confirm that partial or missing data (e.g., missing pv_err) doesn't produce an inaccurate final result.

dpti/workflows/simulations/equi_sim.py (1)

347-366: Expand error handling in _extract.
You already use equi.post_task to parse results, but if that call fails or yields partial data, later steps might break. A graceful fallback or a user-friendly error message would safeguard downstream tasks.

🧰 Tools
🪛 Ruff (0.8.2)

347-347: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/simulations/base.py (1)

199-199: Consider converting the class-level mutable list to an instance attribute.

Currently, all_prepared_paths: List[str] = [] is declared at the class level. This can lead to shared state across all instances, causing subtle bugs if multiple instances modify the list concurrently. If each instance requires its own list, move the declaration into the constructor or initialize it in _initialize().

Do you intend for all_prepared_paths to be shared among all instances, or should each instance maintain its own?
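Both options, sketched with placeholder names (a minimal illustration, not dpti's actual class):

from typing import ClassVar

class SimulationBase:
    # (a) intentionally shared across all instances -> annotate as ClassVar
    shared_paths: ClassVar[list[str]] = []

    def __init__(self) -> None:
        # (b) per-instance state -> create the list in the constructor
        self.all_prepared_paths: list[str] = []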

🧰 Tools
🪛 Ruff (0.8.2)

199-199: Use list instead of List for type annotation

Replace with list

(UP006)


199-199: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

dpti/workflows/service/workflow_decorator.py (1)

238-244: Mismatch between task_configs and method_settings.

This repeats a previously raised concern that the decorator only checks for "method_settings" at line 149, but is passed "task_configs" at lines 239–244. Consequently, the provided configuration dictionary is ignored for the methods.

Apply this fix to align the parameter name with the code’s expectation:

- @workflow_task_decorator(
-     task_configs={
-         "prepare": {"name": "process_setup"},
-         "run": {"retries": 1},
-         "extract": {"name": "process_results"},
-     }
- )
+ @workflow_task_decorator(
+     method_settings={
+         "prepare": {"name": "process_setup"},
+         "run": {"retries": 1},
+         "extract": {"name": "process_results"},
+     }
+ )
dpti/workflows/service/file_handler.py (1)

111-120: Verify directory backup approach
create_path_with_backup automatically renames the existing directory, which can be surprising or destructive if users expect the original folder to remain in place. Confirm if this is the intended behavior.

dpti/lib/utils.py (1)

53-67: Confirm the directory backup logic in create_path
The function renames an existing directory with a .bkNNN suffix. Ensure users are aware of this behavior or add extra warnings for safety.
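For reference, the rename-with-suffix behavior amounts to something like the sketch below (illustrative only; the real create_path may differ in details):

import os

def create_path_with_backup(path: str) -> None:
    """If path exists, rename it to path.bk000, path.bk001, ... then recreate it."""
    if os.path.isdir(path):
        counter = 0
        while os.path.isdir(f"{path}.bk{counter:03d}"):
            counter += 1
        os.rename(path, f"{path}.bk{counter:03d}")
    os.makedirs(path)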

🧰 Tools
🪛 Ruff (0.8.2)

59-59: Use format specifiers instead of percent format

(UP031)

dpti/lib/dump.py Outdated
Comment on lines 141 to 159
def split_traj(dump_lines):
marks = []
for idx, ii in enumerate(dump_lines):
if "ITEM: TIMESTEP" in ii:
marks.append(idx)
if len(marks) == 0:
return None
elif len(marks) == 1:
return [dump_lines]
else:
block_size = marks[1] - marks[0]
ret = []
for ii in marks:
ret.append(dump_lines[ii : ii + block_size])
# for ii in range(len(marks)-1):
# assert(marks[ii+1] - marks[ii] == block_size)
return ret
return None


🛠️ Refactor suggestion

split_traj assumes fixed block sizes between timesteps.
You deduce block_size from the first two marks. If future frames differ in size (e.g., optional data columns or irregular timesteps), the method might incorrectly parse or skip data. Consider verifying that each block has consistent or known structure.
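A hedged variant that validates the assumption instead of silently relying on it (a sketch, not a drop-in replacement):

def split_traj_checked(dump_lines):
    """Split a LAMMPS dump into frames, failing loudly on irregular frame sizes."""
    marks = [i for i, line in enumerate(dump_lines) if "ITEM: TIMESTEP" in line]
    if not marks:
        return None
    if len(marks) == 1:
        return [dump_lines]
    block_size = marks[1] - marks[0]
    for prev, nxt in zip(marks, marks[1:]):
        if nxt - prev != block_size:
            raise ValueError(f"inconsistent frame size: {nxt - prev} != {block_size}")
    return [dump_lines[i : i + block_size] for i in marks]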

@@ -0,0 +1 @@
wget https://huggingface.co/Felix5572/Sn-SCAN-Compressed/resolve/main/graph.pb

🛠️ Refactor suggestion

Add shebang and error handling to shell script

The script is missing a shebang line and error handling, which are important for script reliability.

-wget https://huggingface.co/Felix5572/Sn-SCAN-Compressed/resolve/main/graph.pb
+#!/bin/bash
+
+# Exit on error, print commands as they're executed
+set -e
+set -x
+
+# URL for the model file
+MODEL_URL="https://huggingface.co/Felix5572/Sn-SCAN-Compressed/resolve/main/graph.pb"
+OUTPUT_FILE="graph.pb"
+
+# Check if wget is available
+if ! command -v wget &> /dev/null; then
+    echo "Error: wget is not installed. Please install it and try again."
+    exit 1
+fi
+
+# Download the file
+echo "Downloading model from ${MODEL_URL}..."
+if wget -q "${MODEL_URL}" -O "${OUTPUT_FILE}"; then
+    echo "Download completed successfully: ${OUTPUT_FILE}"
+else
+    echo "Error: Download failed!"
+    exit 1
+fi
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)


💡 Verification agent

🧩 Analysis chain

Verify the integrity of downloaded file

Consider adding file integrity verification to ensure the downloaded file is correct. This is especially important for model files that will be used in simulations.

You could add a checksumming mechanism like:

# Expected checksum (you would need to provide the actual expected value)
EXPECTED_SHA256="expected_checksum_here"

# Verify downloaded file
ACTUAL_SHA256=$(sha256sum "${OUTPUT_FILE}" | awk '{print $1}')
if [ "${ACTUAL_SHA256}" = "${EXPECTED_SHA256}" ]; then
    echo "File integrity verified."
else
    echo "Error: File integrity check failed!"
    echo "Expected: ${EXPECTED_SHA256}"
    echo "Got: ${ACTUAL_SHA256}"
    exit 1
fi

Run this script to verify the current model file's hash:


🏁 Script executed:

#!/bin/bash
# Description: Get the hash of the graph.pb file for integrity checks

# Download the file if it doesn't exist
if [ ! -f "graph.pb" ]; then
  wget -q https://huggingface.co/Felix5572/Sn-SCAN-Compressed/resolve/main/graph.pb
fi

# Calculate and display the hash
echo "SHA256 hash of graph.pb:"
sha256sum graph.pb | awk '{print $1}'

Length of output: 404


Action: Enhance File Integrity Verification and Address Missing Commands

  • The review comment is still valid: the script should verify the integrity of the downloaded file using a checksum.

  • However, our test indicates that the current verification script fails because both wget and sha256sum are missing.

  • Recommendation:

    • Replace or supplement wget and sha256sum with alternatives available in your target environment (e.g., use curl -L -o graph.pb for downloading and shasum -a 256 for checksum computation) or ensure these utilities are installed.
    • Add a checksum verification snippet similar to the one below (updating with the actual expected checksum):
    # Download the file if it doesn't exist (using curl as an alternative)
    if [ ! -f "graph.pb" ]; then
      curl -L -o graph.pb https://huggingface.co/Felix5572/Sn-SCAN-Compressed/resolve/main/graph.pb
    fi
    
    # Expected checksum (replace with the actual checksum)
    EXPECTED_SHA256="expected_checksum_here"
    
    # Calculate the actual checksum (using shasum as an alternative)
    ACTUAL_SHA256=$(shasum -a 256 graph.pb | awk '{print $1}')
    
    if [ "${ACTUAL_SHA256}" = "${EXPECTED_SHA256}" ]; then
        echo "File integrity verified."
    else
        echo "ERROR: File integrity check failed!"
        echo "Expected: ${EXPECTED_SHA256}"
        echo "Got: ${ACTUAL_SHA256}"
        exit 1
    fi
  • Ensure that the environment where this script runs has the required tools or that the script includes appropriate fallbacks.

🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

@@ -0,0 +1 @@
wget https://huggingface.co/Felix5572/Water-Ice-SCAN-Phase-Dirgram/resolve/main/graph.pb

🛠️ Refactor suggestion

Add a shebang line and fix the URL typo.

This script is missing a shebang line to specify the interpreter. Additionally, there's a typo in the URL: "Phase-Dirgram" should be "Phase-Diagram". Consider adding error handling to check if the download is successful.

-wget https://huggingface.co/Felix5572/Water-Ice-SCAN-Phase-Dirgram/resolve/main/graph.pb
+#!/bin/bash
+set -e
+
+if wget --spider https://huggingface.co/Felix5572/Water-Ice-SCAN-Phase-Diagram/resolve/main/graph.pb 2>/dev/null; then
+  wget https://huggingface.co/Felix5572/Water-Ice-SCAN-Phase-Diagram/resolve/main/graph.pb
+else
+  echo "Error: The file does not exist or is not accessible."
+  exit 1
+fi
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.

(SC2148)

.travis.yml Outdated
- pip --version
# command to run tests
script:
- cd tests && coverage run --source=../deepti -m unittest -v && coverage report

⚠️ Potential issue

Fix source path inconsistency.

The source path ../deepti doesn't match the repository name "dpti" as mentioned in the PR summary. This inconsistency could cause testing failures.

-  - cd tests && coverage run --source=../deepti -m unittest -v && coverage report
+  - cd tests && coverage run --source=../dpti -m unittest -v && coverage report

docker/entry.sh Outdated
#!/bin/bash
service postgresql start
service postgresql status
echo ">>>>>additional command: $@>>>>>"

🛠️ Refactor suggestion

Fix string and array mixing in echo command.

The echo command mixes a plain string with the $@ array, which ShellCheck flags (SC2145). Use $* when the arguments should be joined into a single string.

-echo ">>>>>additional command: $@>>>>>"
+echo ">>>>>additional command: $*>>>>>"
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 4-4: Argument mixes string and array. Use * or separate argument.

(SC2145)

dpti/hti_liq.py Outdated
Comment on lines 405 to 462
def _post_tasks(iter_name, step, natoms):
jdata = json.load(open(os.path.join(iter_name, "in.json")))
stat_skip = jdata["stat_skip"]
stat_bsize = jdata["stat_bsize"]
all_tasks = glob.glob(os.path.join(iter_name, "task.[0-9]*"))
all_tasks.sort()
ntasks = len(all_tasks)

all_lambda = []
all_dp_a = []
all_dp_e = []
all_msd_xyz = []

for ii in all_tasks:
log_name = os.path.join(ii, "log.lammps")
data = get_thermo(log_name)
np.savetxt(os.path.join(ii, "data"), data, fmt="%.6e")
dp_a, dp_e = block_avg(data[:, 8], skip=stat_skip, block_size=stat_bsize)
msd_xyz = data[-1, 12]
dp_a /= natoms
dp_e /= natoms
lmda_name = os.path.join(ii, "lambda.out")
ll = float(open(lmda_name).read())
all_lambda.append(ll)
all_dp_a.append(dp_a)
all_dp_e.append(dp_e)
all_msd_xyz.append(msd_xyz)

all_lambda = np.array(all_lambda)
all_dp_a = np.array(all_dp_a)
all_dp_e = np.array(all_dp_e)
all_msd_xyz = np.array(all_msd_xyz)
de = all_dp_a
all_err = all_dp_e

all_print = []
# all_print.append(np.arange(len(all_lambda)))
all_print.append(all_lambda)
all_print.append(de)
all_print.append(all_err)
all_print.append(all_msd_xyz)
all_print = np.array(all_print)
np.savetxt(
os.path.join(iter_name, "hti.out"),
all_print.T,
fmt="%.8e",
header="lmbda dU dU_err msd_xyz",
)

diff_e, err = integrate(all_lambda, de, all_err)
sys_err = integrate_sys_err(all_lambda, de)

thermo_info = _compute_thermo(
os.path.join(all_tasks[-1], "log.lammps"), natoms, stat_skip, stat_bsize
)

return diff_e, [err, sys_err], thermo_info

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor _post_tasks for cleaner separation of concerns.

The _post_tasks function mixes file I/O, data parsing, and final computations in a single routine. Splitting these concerns into smaller, well-defined functions would boost readability, maintainability, and test coverage.
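One possible decomposition, using hypothetical helper names and the get_thermo/block_avg helpers the function already calls; the signatures are a sketch and would need adapting to dpti/hti_liq.py:

import os
import numpy as np

def read_task_data(task_dir, stat_skip, stat_bsize, natoms):
    """Parse one task directory into (lambda, dU/atom, err/atom, msd_xyz)."""
    data = get_thermo(os.path.join(task_dir, "log.lammps"))  # existing dpti helper
    dp_a, dp_e = block_avg(data[:, 8], skip=stat_skip, block_size=stat_bsize)
    lamb = float(open(os.path.join(task_dir, "lambda.out")).read())
    return lamb, dp_a / natoms, dp_e / natoms, data[-1, 12]

def write_hti_out(iter_name, rows):
    """Persist the per-lambda summary table."""
    np.savetxt(os.path.join(iter_name, "hti.out"), np.array(rows),
               fmt="%.8e", header="lmbda dU dU_err msd_xyz")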

dpti/hti.py Outdated
Comment on lines 35 to 69
def _ff_lj_on(lamb, model, sparam):
nn = sparam["n"]
alpha_lj = sparam["alpha_lj"]
rcut = sparam["rcut"]
epsilon = sparam["epsilon"]
# sigma = sparam['sigma']
# sigma_oo = sparam['sigma_oo']
# sigma_oh = sparam['sigma_oh']
# sigma_hh = sparam['sigma_hh']
activation = sparam["activation"]
ret = ""
ret += f"variable EPSILON equal {epsilon:f}\n"
ret += f"pair_style lj/cut/soft {nn:f} {alpha_lj:f} {rcut:f}\n"

element_num = sparam.get("element_num", 1)
sigma_key_index = filter(
lambda t: t[0] <= t[1],
((i, j) for i in range(element_num) for j in range(element_num)),
)
for i, j in sigma_key_index:
ret += "pair_coeff {} {} ${{EPSILON}} {:f} {:f}\n".format(
i + 1,
j + 1,
sparam["sigma_" + str(i) + "_" + str(j)],
activation,
)

# ret += 'pair_coeff * * ${EPSILON} %f %f\n' % (sigma, activation)
# ret += 'pair_coeff 1 1 ${EPSILON} %f %f\n' % (sigma_oo, activation)
# ret += 'pair_coeff 1 2 ${EPSILON} %f %f\n' % (sigma_oh, activation)
# ret += 'pair_coeff 2 2 ${EPSILON} %f %f\n' % (sigma_hh, activation)
ret += "fix tot_pot all adapt/fep 0 pair lj/cut/soft epsilon * * v_LAMBDA scale yes\n"
ret += "compute e_diff all fep ${TEMP} pair lj/cut/soft epsilon * * v_EPSILON\n"
return ret


🛠️ Refactor suggestion

Clarify parameter assumptions in _ff_lj_on.
This function expects several parameters from sparam (n, alpha_lj, rcut, epsilon, activation, and the per-pair sigma entries) but does not validate their presence or ranges. Adding checks or default fallbacks for missing or invalid values would prevent cryptic runtime errors and improve robustness.
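A minimal validation sketch; the required keys mirror the ones _ff_lj_on reads, and the checks shown are illustrative:

REQUIRED_KEYS = ("n", "alpha_lj", "rcut", "epsilon", "activation")

def validate_sparam(sparam: dict) -> None:
    """Fail early with a clear message instead of a KeyError deep inside _ff_lj_on."""
    missing = [k for k in REQUIRED_KEYS if k not in sparam]
    if missing:
        raise KeyError(f"sparam is missing required keys: {missing}")
    if sparam["rcut"] <= 0:
        raise ValueError(f"rcut must be positive, got {sparam['rcut']}")
    element_num = sparam.get("element_num", 1)
    for i in range(element_num):
        for j in range(i, element_num):
            key = f"sigma_{i}_{j}"
            if key not in sparam:
                raise KeyError(f"sparam is missing pair entry: {key}")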

dpti/hti.py Outdated
Comment on lines 318 to 449
ret += f"variable PRES equal {pres:f}\n"
ret += f"variable TAU_T equal {tau_t:f}\n"
ret += f"variable TAU_P equal {tau_p:f}\n"
ret += f"variable LAMBDA equal {lamb:.10e}\n"
ret += "variable INV_LAMBDA equal %.10e\n" % (1 - lamb)
if custom_variables is not None:
for key, value in custom_variables.items():
ret += f"variable {key} equal {value}\n"
ret += "# ---------------------- INITIALIZAITION ------------------\n"
ret += "units metal\n"
ret += "boundary p p p\n"
ret += "atom_style atomic\n"
ret += "# --------------------- ATOM DEFINITION ------------------\n"
ret += "box tilt large\n"
ret += f"read_data {conf_file}\n"
if copies is not None:
ret += "replicate %d %d %d\n" % (copies[0], copies[1], copies[2])
ret += "change_box all triclinic\n"
for jj in range(len(mass_map)):
ret += "mass %d %f\n" % (jj + 1, mass_map[jj])

# force field setting
if switch == "one-step" or switch == "two-step":
ret += _ff_two_steps(lamb, model, m_spring_k, step, append)
elif switch == "three-step":
ret += _ff_soft_lj(
lamb,
model,
m_spring_k,
step,
sparam,
if_meam=if_meam,
meam_model=meam_model,
append=append,
)
else:
raise RuntimeError("unknow switch", switch)

ret += "# --------------------- MD SETTINGS ----------------------\n"
ret += "neighbor 1.0 bin\n"
ret += f"timestep {timestep}\n"
ret += "thermo ${THERMO_FREQ}\n"
ret += "compute allmsd all msd\n"
if 1 - lamb != 0:
if not isinstance(m_spring_k, list):
if switch == "three-step":
ret += "thermo_style custom step ke pe etotal enthalpy temp press vol f_l_spring c_e_diff[1] c_allmsd[*]\n"
else:
ret += "thermo_style custom step ke pe etotal enthalpy temp press vol f_l_spring c_e_deep c_allmsd[*]\n"
else:
if switch == "three-step":
ret += "thermo_style custom step ke pe etotal enthalpy temp press vol v_l_spring c_e_diff[1] c_allmsd[*]\n"
else:
ret += "thermo_style custom step ke pe etotal enthalpy temp press vol v_l_spring c_e_deep c_allmsd[*]\n"
else:
if switch == "three-step":
ret += "thermo_style custom step ke pe etotal enthalpy temp press vol c_e_diff[1] c_e_diff[1] c_allmsd[*]\n"
else:
ret += "thermo_style custom step ke pe etotal enthalpy temp press vol c_e_deep c_e_deep c_allmsd[*]\n"
ret += "thermo_modify format 9 %.16e\n"
ret += "thermo_modify format 10 %.16e\n"
ret += "dump 1 all custom ${DUMP_FREQ} dump.hti id type x y z vx vy vz\n"
if ens == "nvt":
ret += "fix 1 all nvt temp ${TEMP} ${TEMP} ${TAU_T}\n"
elif ens == "nvt-langevin":
ret += "fix 1 all nve\n"
ret += "fix 2 all langevin ${TEMP} ${TEMP} ${TAU_T} %d" % (
np.random.default_rng().integers(1, 2**16)
)
if crystal == "frenkel":
ret += " zero yes\n"
else:
ret += " zero no\n"
elif ens == "npt-iso" or ens == "npt":
ret += "fix 1 all npt temp ${TEMP} ${TEMP} ${TAU_T} iso ${PRES} ${PRES} ${TAU_P}\n"
elif ens == "nve":
ret += "fix 1 all nve\n"
else:
raise RuntimeError(f"unknow ensemble {ens}\n")
ret += "# --------------------- INITIALIZE -----------------------\n"
ret += "velocity all create ${TEMP} %d\n" % (
np.random.default_rng().integers(1, 2**16)
)
if crystal == "frenkel":
ret += "fix fc all recenter INIT INIT INIT\n"
ret += "fix fm all momentum 1 linear 1 1 1\n"
ret += "velocity all zero linear\n"
elif crystal == "vega":
ret += "group first id 1\n"
ret += "fix fc first recenter INIT INIT INIT\n"
ret += "fix fm first momentum 1 linear 1 1 1\n"
ret += "velocity first zero linear\n"
else:
raise RuntimeError("unknow crystal " + crystal)
ret += "# --------------------- RUN ------------------------------\n"
ret += "run ${NSTEPS}\n"
ret += "write_data out.lmp\n"

return ret


🛠️ Refactor suggestion

Enhance maintainability of _gen_lammps_input.
This function has many parameters and handles multiple condition branches (NVT, NPT, Langevin, etc.). Consider splitting it into smaller functions or dedicated blocks for each ensemble to improve readability and maintainability.
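One option is a dispatch table keyed by ensemble name. A simplified sketch (it omits the Langevin seed handling and is not a drop-in refactor):

def _fix_nvt() -> str:
    return "fix 1 all nvt temp ${TEMP} ${TEMP} ${TAU_T}\n"

def _fix_npt() -> str:
    return "fix 1 all npt temp ${TEMP} ${TEMP} ${TAU_T} iso ${PRES} ${PRES} ${TAU_P}\n"

def _fix_nve() -> str:
    return "fix 1 all nve\n"

ENSEMBLE_BLOCKS = {"nvt": _fix_nvt, "npt": _fix_npt, "npt-iso": _fix_npt, "nve": _fix_nve}

def ensemble_block(ens: str) -> str:
    try:
        return ENSEMBLE_BLOCKS[ens]()
    except KeyError:
        raise RuntimeError(f"unknown ensemble {ens}") from None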

🧰 Tools
🪛 Ruff (0.8.2)

346-346: Use format specifiers instead of percent format

(UP031)


347-347: Use format specifiers instead of percent format

(UP031)


348-348: Use format specifiers instead of percent format

(UP031)


366-366: Use format specifiers instead of percent format

(UP031)


369-369: Use format specifiers instead of percent format

(UP031)


416-416: Use format specifiers instead of percent format

(UP031)


430-430: Use format specifiers instead of percent format

(UP031)

dpti/lib/lmp.py Outdated
Comment on lines 24 to 36
def lmpbox2box(lohi, tilt):
xy = tilt[0]
xz = tilt[1]
yz = tilt[2]
orig = np.array([lohi[0][0], lohi[1][0], lohi[2][0]])
lens = []
for dd in range(3):
lens.append(lohi[dd][1] - lohi[dd][0])
xx = [lens[0], 0, 0]
yy = [xy, lens[1], 0]
zz = [xz, yz, lens[2]]
return orig, np.array([xx, yy, zz])


🛠️ Refactor suggestion

Validate box transformations in lmpbox2box and box2lmpbox.
Ensure the logic for computing orig and tilt stays consistent across the two transformations. For example, negative or very small tilt values might lead to incorrect box dimensions. Adding checks for extreme tilt factors or non-positive edge lengths (lens) might be helpful.
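A defensive variant that asserts positive edge lengths before building the cell matrix (a sketch for illustration):

import numpy as np

def lmpbox2box_checked(lohi, tilt):
    """Like lmpbox2box, but rejects non-positive edge lengths."""
    xy, xz, yz = tilt
    orig = np.array([lohi[0][0], lohi[1][0], lohi[2][0]])
    lens = [hi - lo for lo, hi in lohi]
    if any(ll <= 0 for ll in lens):
        raise ValueError(f"non-positive box edge lengths: {lens}")
    return orig, np.array([[lens[0], 0, 0], [xy, lens[1], 0], [xz, yz, lens[2]]])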

Comment on lines +16 to +18
from .file_handler import IOHandler, LocalFileHandler
from .job_executor import DpdispatcherExecutor, JobExecutor
from .report_generator import PrefectReportGenerator, ReportGenerator

🛠️ Refactor suggestion

Remove duplicate imports to prevent redefinition.

Lines 16–18 re-import IOHandler, LocalFileHandler, DpdispatcherExecutor, JobExecutor, PrefectReportGenerator, and ReportGenerator, which are already imported at lines 9–14. Maintaining duplicates may lead to confusion or conflicts.

Apply this diff to remove the duplicates:

- from .file_handler import IOHandler, LocalFileHandler
- from .job_executor import DpdispatcherExecutor, JobExecutor
- from .report_generator import PrefectReportGenerator, ReportGenerator
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .file_handler import IOHandler, LocalFileHandler
from .job_executor import DpdispatcherExecutor, JobExecutor
from .report_generator import PrefectReportGenerator, ReportGenerator
🧰 Tools
🪛 Ruff (0.8.2)

16-16: Redefinition of unused IOHandler from line 9

(F811)


16-16: Redefinition of unused LocalFileHandler from line 9

(F811)


17-17: Redefinition of unused DpdispatcherExecutor from line 10

(F811)


17-17: Redefinition of unused JobExecutor from line 10

(F811)


18-18: Redefinition of unused PrefectReportGenerator from line 12

(F811)


18-18: Redefinition of unused ReportGenerator from line 13

(F811)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
dpti/workflows/analyze/compare_ti_plot.py (1)

117-117: ⚠️ Potential issue

Fix the bare except clause.

Using a bare except block catches all exceptions (including system exits and keyboard interrupts), making debugging difficult and potentially hiding serious issues.

Apply the following change to make error handling more specific:

-    except:
+    except Exception:
         print("No intersection point found in the overlapping temperature range")
🧰 Tools
🪛 Ruff (0.8.2)

117-117: Do not use bare except

(E722)

🧹 Nitpick comments (24)
dpti/workflows/analyze/compare_ti_plot.py (9)

170-180: Add implementation or TODO to incomplete function.

The function analyze_align_different_temp_starting_points is currently a stub with a docstring but no implementation.

Consider either:

  1. Implementing the function
  2. Adding a TODO comment to track this for future implementation
  3. Raising a NotImplementedError
def analyze_align_different_temp_starting_points(result_info: dict) -> str:
    """Analyze and align different temperature starting points in a TI simulation.

    Args:
        result_info: Dictionary containing TI simulation results

    Returns
    -------
    base64 encoded string of the plot image
    """
-    pass
+    # TODO: Implement this function to analyze and align different temperature starting points
+    raise NotImplementedError("This function is not yet implemented")

332-337: Update deprecated type annotations.

The typing.Dict and typing.List annotations are deprecated in favor of using the built-in dict and list types directly.

Update the import and type annotations:

-from typing import Dict, List
+from typing import Optional

def plot_multiple_phase_comparison(
-    phase1_results: List[Dict],
-    phase2_results: List[Dict],
+    phase1_results: list[dict],
+    phase2_results: list[dict],
    phase1_label: str = "Phase 1",
    phase2_label: str = "Phase 2",
-) -> Dict[str, str]:
+) -> dict[str, str]:
🧰 Tools
🪛 Ruff (0.8.2)

332-332: typing.Dict is deprecated, use dict instead

(UP035)


332-332: typing.List is deprecated, use list instead

(UP035)


336-336: Use list instead of List for type annotation

Replace with list

(UP006)


336-336: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


337-337: Use list instead of List for type annotation

Replace with list

(UP006)


337-337: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


362-375: Replace Chinese comments with English comments.

The codebase should maintain consistent language usage throughout. There are several Chinese comments in this section.

Replace the Chinese comments with English equivalents:

-    # 存储所有比较的结果
+    # Store all comparison results
    all_comparisons = []
-    temps = phase1_results[0]["data"]["all_temps"]  # 假设温度点相同
+    temps = phase1_results[0]["data"]["all_temps"]  # Assume temperature points are the same

-    # 生成所有可能的组合比较
+    # Generate all possible combination comparisons
    for p1_result, p2_result in product(phase1_results, phase2_results):

399-405: Replace fullwidth characters in comments.

The comments contain fullwidth characters which should be replaced with their standard ASCII equivalents.

-    # 2. 分布图(在每个温度点)
+    # 2. Distribution plot (at each temperature point)
    plt.figure(figsize=(12, 6))
    plt.violinplot(all_comparisons, positions=temps)
    plt.plot(temps, mean_diff, "r-", linewidth=2, label="Mean")
🧰 Tools
🪛 Ruff (0.8.2)

399-399: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


399-399: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


416-424: Replace fullwidth characters in comments.

Similar to the previous issue, this comment contains fullwidth characters.

-    # 3. 平均值比较图(带误差棒)
+    # 3. Mean value comparison plot (with error bars)
    plt.figure(figsize=(10, 6))
    plt.errorbar(temps, mean_diff, yerr=std_diff, capsize=5)
🧰 Tools
🪛 Ruff (0.8.2)

416-416: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


433-449: Fix docstring and improve error handling in find_crossing_temp.

The function contains fullwidth characters in the docstring and lacks detailed error handling.

    def find_crossing_temp(fe_diff, temps):
-        """找到自由能差为0的温度点(相变点)."""
+        """Find the temperature point where free energy difference is zero (phase transition point)."""
        from scipy.interpolate import interp1d
        from scipy.optimize import root_scalar

        # 使用三次样条插值
        fe_interp = interp1d(temps, fe_diff, kind="cubic")

        try:
            # 在温度范围内寻找交点
            result = root_scalar(
                fe_interp, bracket=[temps[0], temps[-1]], method="brentq"
            )
            return result.root
-        except ValueError:
+        except ValueError as e:
+            print(f"Unable to find crossing temperature: {e}")
            return None  # 如果在范围内没有交点
🧰 Tools
🪛 Ruff (0.8.2)

434-434: Docstring contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF002)


434-434: Docstring contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF002)


465-477: Replace fullwidth characters in comments.

More fullwidth characters in comments that should be replaced with standard ASCII equivalents.

-    # 使用对称的颜色映射,以平均值为中心
+    # Use symmetric color mapping centered on the mean value
    vmin = mean_temp - 3 * std_temp
    vmax = mean_temp + 3 * std_temp

    sns.heatmap(
        crossing_temps_matrix,
-        annot=True,  # 显示具体数值
-        fmt=".1f",  # 数值格式保留一位小数
-        cmap="RdYlBu_r",  # 使用红-黄-蓝配色,_r表示反转(红色表示高温)
-        center=mean_temp,  # 将颜色中心设为平均值
+        annot=True,  # Show specific values
+        fmt=".1f",  # Number format with one decimal place
+        cmap="RdYlBu_r",  # Use red-yellow-blue color scheme, _r means reversed (red for high temperature)
+        center=mean_temp,  # Set color center to mean value
🧰 Tools
🪛 Ruff (0.8.2)

465-465: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH COMMA). Did you mean , (COMMA)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH LEFT PARENTHESIS). Did you mean ( (LEFT PARENTHESIS)?

(RUF003)


473-473: Comment contains ambiguous (FULLWIDTH RIGHT PARENTHESIS). Did you mean ) (RIGHT PARENTHESIS)?

(RUF003)


143-164: Consider removing or isolating example code.

The file contains substantial example code sections that load specific file paths and execute functions. This makes the file harder to maintain and less suitable as a general-purpose module.

Consider moving this example code to a separate script (e.g., examples/compare_ti_examples.py) and keep only the function definitions in this module. This would:

  1. Improve maintainability
  2. Avoid hardcoded paths
  3. Make the module more reusable
  4. Prevent accidental execution of example code when importing functions

Alternatively, wrap example code in if __name__ == "__main__": blocks if you want to keep it in the same file.

Also applies to: 282-328, 518-582


15-16: Use absolute imports instead of path manipulation.

Using sys.path.insert to modify the Python path is generally discouraged as it can lead to unexpected behavior and makes code less portable.

Consider restructuring the project to use proper Python packaging instead of path manipulation. This would allow you to use normal imports:

-sys.path.insert(0, os.path.join(os.path.dirname(__file__), "../../../"))
-from typing import Optional
-
-from dpti.workflows.simulations.ti_sim import plot_ti_result
+from typing import Optional
+
+from dpti.workflows.simulations.ti_sim import plot_ti_result

Make sure the package is properly installed or in your PYTHONPATH instead.

tests/workflows/flows/test_flow_instance_method.py (7)

12-16: Consider using consistent decorators.

The execute_greeting method has a commented-out @flow decorator. If this method is intended to be used directly as a flow, consider either uncommenting this decorator or removing it to maintain consistency with your final approach of using create_workflow_flow.


14-14: Use English docstrings for consistency.

The docstring "执行问候流程" is in Chinese, while the code itself uses English. Consider translating all docstrings to English for better maintainability and consistency across the codebase.

-    def execute_greeting(self, message: str) -> str:
-        """执行问候流程."""
+    def execute_greeting(self, message: str) -> str:
+        """Execute the greeting process."""

18-22: Use English docstrings for consistency.

The internal method docstring is also in Chinese. Consider translating all docstrings to English for consistency.

-    def _process_greeting(self, message: str) -> str:
-        """实际处理问候消息的内部方法."""
+    def _process_greeting(self, message: str) -> str:
+        """Internal method to process the greeting message."""

25-46: Standardize docstring format and language.

The function docstring mixes languages (Chinese text) and styles (mixing Google-style Args with NumPy-style Returns). Consider:

  1. Translating to English for consistency
  2. Adopting a single docstring style (either Google or NumPy) throughout the codebase
  3. Adding error handling for non-existent methods
 def create_workflow_flow(workflow_class, method_name, **flow_kwargs):
-    """将工作流类方法包装为顶层flow函数.
+    """Wrap a workflow class method as a top-level flow function.
 
     Args:
-        workflow_class: 工作流类
-        method_name: 要包装的方法名
-        flow_kwargs: 传递给flow装饰器的参数
+        workflow_class: The workflow class
+        method_name: The method name to wrap
+        flow_kwargs: Parameters to pass to the flow decorator
 
-    Returns
-    -------
-    装饰后的flow函数
+    Returns:
+        The decorated flow function
     """
 
     @flow(**flow_kwargs)
     def workflow_flow(*args, **kwargs):
-        """创建工作流实例并执行指定方法."""
+        """Create a workflow instance and execute the specified method."""
         instance = workflow_class()
+        if not hasattr(instance, method_name):
+            raise AttributeError(f"Method {method_name} not found in {workflow_class.__name__}")
         method = getattr(instance, method_name)
         return method(*args, **kwargs)

48-51: Translate comment to English for consistency.

The comment above the greeting_flow declaration is in Chinese. Consider translating it to English for consistency.

-# 创建顶层flow函数
+# Create top-level flow function

60-69: Translate comments to English for consistency.

The comments describing the methods are in Chinese. Consider translating them to English for consistency.

 if __name__ == "__main__":
-    # 方法1: 使用包装后的顶层函数
-    # 本地运行
+    # Method 1: Using the wrapped top-level function
+    # Local execution
     # print(greeting_flow(message="world"))
 
-    # 部署到服务器
+    # Deploy to server
     # greeting_flow.serve(
     #     name="greeting-workflow",
     #     parameters={"message": "world"}
     # )

71-71: Translate comment to English for consistency.

The comment is in Chinese. Consider translating it to English for consistency.

-    # 方法2: 直接使用实例的flow方法
+    # Method 2: Directly use the flow method of the instance
dpti/aitools/codebase/collect_py_files_context.py (8)

1-2: Remove Jupyter notebook cell marker

The file starts with a Jupyter notebook cell marker # %% which suggests this code was initially developed in a notebook environment. This marker should be removed for production code.

-# %%
-

6-6: Update type annotations to use dict instead of Dict

The static analysis tool indicates that typing.Dict is deprecated. Python's type annotation system now supports using built-in types directly.

-from typing import Dict
+from typing import Optional, Union  # Keep only if needed elsewhere
🧰 Tools
🪛 Ruff (0.8.2)

6-6: typing.Dict is deprecated, use dict instead

(UP035)


9-9: Update return type annotation to use dict instead of Dict

According to Python typing best practices, use built-in dict instead of Dict from typing module.

-def collect_python_files(directory: str) -> Dict[str, str]:
+def collect_python_files(directory: str) -> dict[str, str]:
🧰 Tools
🪛 Ruff (0.8.2)

9-9: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


30-38: Add error handling for file encoding issues

The code assumes all Python files are UTF-8 encoded, but this might not always be the case. Consider adding specific handling for encoding errors.

-            with open(file_path, encoding="utf-8") as f:
+            with open(file_path, encoding="utf-8", errors="replace") as f:
                 content = f.read()

45-47: Update type annotations to use dict instead of Dict

Replace deprecated Dict with built-in dict type for all type annotations.

-def group_by_directory(
-    files: Dict[str, str], max_files_per_group: int = 20
-) -> Dict[str, Dict[str, str]]:
+def group_by_directory(
+    files: dict[str, str], max_files_per_group: int = 20
+) -> dict[str, dict[str, str]]:
🧰 Tools
🪛 Ruff (0.8.2)

46-46: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


47-47: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


47-47: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


73-93: Consider handling large directories more robustly

The current implementation doesn't handle the case where a single directory contains more files than max_files_per_group. Consider splitting large directories into multiple groups.

 for dir_name, dir_files in dir_groups.items():
     # If adding this directory's files would exceed the limit
     if (
         current_group_size + len(dir_files) > max_files_per_group
         and current_group_size > 0
     ):
         # Save current group and start a new one
         final_groups[f"group_{group_index}"] = current_group
         current_group = {}
         current_group_size = 0
         group_index += 1

+    # Handle case where a single directory has more files than max_files_per_group
+    if len(dir_files) > max_files_per_group:
+        # Split the directory into multiple groups
+        files_list = list(dir_files.items())
+        for i in range(0, len(files_list), max_files_per_group):
+            chunk = dict(files_list[i:i + max_files_per_group])
+            final_groups[f"group_{group_index}"] = chunk
+            group_index += 1
+        continue
+
     # Add files to current group
     current_group.update(dir_files)
     current_group_size += len(dir_files)

96-96: Update type annotations to use dict instead of Dict

Replace deprecated Dict with built-in dict type.

-def save_grouped_json(groups: Dict[str, Dict[str, str]], output_base: str) -> None:
+def save_grouped_json(groups: dict[str, dict[str, str]], output_base: str) -> None:
🧰 Tools
🪛 Ruff (0.8.2)

96-96: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


96-96: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


133-138: Ensure output directory exists

The default output path "build/python_files_context.json" assumes the "build" directory exists. Consider creating the directory if it doesn't exist.

 parser.add_argument(
     "-o",
     "--output",
     default="build/python_files_context.json",
     help="Output JSON file base name (default: python_files_context.json)",
 )

Add this code before saving:

# Ensure output directory exists
output_dir = Path(args.output).parent
output_dir.mkdir(parents=True, exist_ok=True)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34b86be and 483544d.

📒 Files selected for processing (3)
  • dpti/aitools/codebase/collect_py_files_context.py (1 hunks)
  • dpti/workflows/analyze/compare_ti_plot.py (1 hunks)
  • tests/workflows/flows/test_flow_instance_method.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
dpti/workflows/analyze/compare_ti_plot.py (1)
dpti/workflows/simulations/ti_sim.py (1)
  • plot_ti_result (557-648)
🪛 Ruff (0.8.2)
dpti/workflows/analyze/compare_ti_plot.py

117-117: Do not use bare except

(E722)


332-332: typing.Dict is deprecated, use dict instead

(UP035)


332-332: typing.List is deprecated, use list instead

(UP035)


336-336: Use list instead of List for type annotation

Replace with list

(UP006)


336-336: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


337-337: Use list instead of List for type annotation

Replace with list

(UP006)


337-337: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


340-340: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


399-399: Comment contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF003)


399-399: Comment contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF003)


416-416: Comment contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF003)


434-434: Docstring contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF002)


434-434: Docstring contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF002)


465-465: Comment contains ambiguous `,` (FULLWIDTH COMMA). Did you mean `,` (COMMA)?

(RUF003)


473-473: Comment contains ambiguous `,` (FULLWIDTH COMMA). Did you mean `,` (COMMA)?

(RUF003)


473-473: Comment contains ambiguous `(` (FULLWIDTH LEFT PARENTHESIS). Did you mean `(` (LEFT PARENTHESIS)?

(RUF003)


473-473: Comment contains ambiguous `)` (FULLWIDTH RIGHT PARENTHESIS). Did you mean `)` (RIGHT PARENTHESIS)?

(RUF003)

dpti/aitools/codebase/collect_py_files_context.py

6-6: typing.Dict is deprecated, use dict instead

(UP035)


9-9: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


46-46: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


47-47: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


47-47: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


96-96: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


96-96: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

🔇 Additional comments (1)
tests/workflows/flows/test_flow_instance_method.py (1)

60-75: Move workflow.execute_greeting.serve() call inside a conditional block.

The serve() method call in the main block will execute whenever this file is imported or run, which could be unintended. Consider moving it inside a conditional block or commenting it out.

    workflow = GreetingWorkflow()
-   workflow.execute_greeting.serve(
-       name="greeting-workflow", parameters={"message": "world"}
-   )
+   # Uncomment to deploy to server when explicitly needed
+   # workflow.execute_greeting.serve(
+   #     name="greeting-workflow", parameters={"message": "world"}
+   # )

Comment on lines +54 to +58
@flow
def my_flow(a: str) -> str:
    workflow_1 = GreetingWorkflow()
    workflow_1.execute_greeting(a)


⚠️ Potential issue

Return value is missing in my_flow function.

The function my_flow is annotated to return a string (-> str) but doesn't actually return anything. This could cause type checking issues and is misleading.

 @flow
 def my_flow(a: str) -> str:
     workflow_1 = GreetingWorkflow()
-    workflow_1.execute_greeting(a)
+    return workflow_1.execute_greeting(a)

Comment on lines +127 to +132
parser.add_argument(
    "-d",
    "--directory",
    default="../../../dpti/",
    help="Directory to scan (default: current directory)",
)

🛠️ Refactor suggestion

Avoid hardcoded relative paths in default arguments

The default directory path "../../../dpti/" is brittle and may break depending on where the script is executed from. Consider using a more robust approach.

 parser.add_argument(
     "-d",
     "--directory",
-    default="../../../dpti/",
+    default=".",
     help="Directory to scan (default: current directory)",
 )

Comment on lines +148 to +159

# Collect Python files
python_files = collect_python_files(args.directory)

# Group files
groups = group_by_directory(python_files, args.num_files)

# Save grouped files
save_grouped_json(groups, args.output)

print(f"\nTotal files collected: {len(python_files)}")
print(f"Split into {len(groups)} groups")

🛠️ Refactor suggestion

Add validation and error handling for input directory

The code doesn't check if the specified directory exists before attempting to scan it. Consider adding directory validation.

 args = parser.parse_args()

+# Validate input directory
+input_dir = Path(args.directory)
+if not input_dir.exists() or not input_dir.is_dir():
+    print(f"Error: Directory '{args.directory}' does not exist or is not a directory")
+    exit(1)
+
 # Collect Python files
 python_files = collect_python_files(args.directory)

+# Check if any files were found
+if not python_files:
+    print(f"Warning: No Python files found in {args.directory}")
+
 # Group files
 groups = group_by_directory(python_files, args.num_files)

 # Save grouped files
 save_grouped_json(groups, args.output)

@felix5572 felix5572 closed this May 23, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 12

♻️ Duplicate comments (7)
dpti/aitools/codebase/collect_py_files_context.py (2)

127-132: Hardcoded relative path remains unaddressed.

This issue was previously identified but remains unresolved. The hardcoded relative path ../../../dpti/ is brittle and may break depending on execution context.

 parser.add_argument(
     "-d",
     "--directory",
-    default="../../../dpti/",
+    default=".",
     help="Directory to scan (default: current directory)",
 )

148-159: Input validation and error handling still missing.

The previously identified need for directory validation and error handling remains unaddressed.

 args = parser.parse_args()

+# Validate input directory
+input_dir = Path(args.directory)
+if not input_dir.exists() or not input_dir.is_dir():
+    print(f"Error: Directory '{args.directory}' does not exist or is not a directory")
+    exit(1)
+
 # Collect Python files
 python_files = collect_python_files(args.directory)

+# Check if any files were found
+if not python_files:
+    print(f"Warning: No Python files found in {args.directory}")
dpti/workflows/service/job_executor.py (2)

53-54: Check for concurrency issues in file I/O.

If multiple tasks or processes attempt to create the same directory, it might cause race conditions. Consider using a safer technique or verifying usage logs.
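
For reference, a race-safe variant (an illustrative sketch, not the PR's code):

import os


def ensure_dir(path: str) -> None:
    # exist_ok=True makes concurrent creation attempts safe: a process that
    # loses the race simply finds the directory already present instead of
    # raising FileExistsError.
    os.makedirs(path, exist_ok=True)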


66-83: ⚠️ Potential issue

Inconsistent handling of graph.pb file between tasks and common files.

The graph.pb file is included in forward_common_files (line 81) but not in the individual task's forward_files (line 71). This differs from the submit method where graph.pb is included in task forward files (line 107).

Consider whether graph.pb should be (see the sketch after this list):

  1. Only in common files (current approach in group_submit)
  2. In both common and task files
  3. Only in task files (as in submit method)
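
For instance, option 1 might look like the following dpdispatcher-style sketch; the variable names (task_dirs, work_base, machine, resources) are assumed to be defined by the caller and are not taken from the PR:

from dpdispatcher import Submission, Task

# Each task forwards only its own inputs; graph.pb travels once for the
# whole submission instead of once per task.
task_list = [
    Task(
        command="lmp -i in.lammps",
        task_work_path=task_dir,
        forward_files=["in.lammps", "conf.lmp"],
        backward_files=["log.lammps", "out.lmp"],
    )
    for task_dir in task_dirs
]

submission = Submission(
    work_base=work_base,
    machine=machine,
    resources=resources,
    task_list=task_list,
    forward_common_files=["graph.pb"],  # shared model uploaded once
)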
dpti/workflows/flows/free_energy_ti_line.py (2)

305-350: 🛠️ Refactor suggestion

Break down the _execute_impl method for clarity.

This method orchestrates multiple simulation steps and includes inline calls to NPT, NVT, HTI, and TI simulations. To improve readability and maintainability, consider extracting each simulation stage into separate helper methods or tasks.
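
A possible shape for that refactor; the helper names below are invented for illustration and are not the PR's API:

def _execute_impl(self):
    # Each stage gets its own helper, so failures and logs point at one step.
    npt_result = self._run_npt_stage()
    nvt_result = self._run_nvt_stage(npt_result)
    hti_result = self._run_hti_stage(nvt_result)
    return self._run_ti_stage(hti_result)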


305-350: ⚠️ Potential issue

Add robust error handling for sub-step calls.

If any sub-step (like NPTEquiSimulation or TISimulation) fails or returns unexpected data, the code proceeds without safeguards. Incorporating try-except blocks or checks for non-existent keys would help avoid runtime errors and provide clearer failure points.
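
A hedged sketch of such safeguards; the entry point and result key below are assumptions, not the simulations' actual names:

try:
    npt_result = npt_simulation.execute()  # assumed entry point
except Exception as err:
    raise RuntimeError("NPT equilibration step failed") from err

equi_volume = npt_result.get("volume")  # assumed result key
if equi_volume is None:
    raise KeyError("NPT result is missing the expected 'volume' field")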

dpti/workflows/service/workflow_decorator.py (1)

299-305: task_configs not recognized.

Your decorator implementation currently reads method_settings from the configuration, but the code passes task_configs. This likely causes missing or incorrect task configuration.
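
A minimal backward-compatible fix could accept either key (a sketch only; the surrounding config dict is assumed):

# Prefer the key callers actually pass, falling back to the older name.
method_settings = config.get("task_configs", config.get("method_settings", {}))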

🧹 Nitpick comments (11)
dpti/aitools/codebase/collect_py_files_context.py (2)

109-120: Fix metadata inconsistency and unusual file extension.

Two issues with the output format:

  1. The .json.txt extension is unconventional and may confuse tools expecting standard JSON files
  2. The base_directory metadata uses the current working directory name instead of the scanned directory name
-        output_file = parent_dir / f"{base_name}_{group_name}.json.txt"
+        output_file = parent_dir / f"{base_name}_{group_name}.json"
         wrapper = {
             "project_files": {
                 "description": f"Python source files collection - {group_name}",
-                "base_directory": Path().absolute().name,
+                "base_directory": Path(args.directory if 'args' in locals() else directory).name,
                 "files": group_files,
             }
         }

Note: Consider passing the original directory parameter to this function for accurate metadata.


67-93: Consider optimizing the grouping algorithm for better performance.

The current grouping logic processes directories sequentially, which may not produce optimal groups for large codebases. For better distribution, consider sorting directories by size first.

     # Merge small groups and split large groups
+    # Sort directories by file count for better distribution
+    sorted_dirs = sorted(dir_groups.items(), key=lambda x: len(x[1]), reverse=True)
+    
     final_groups = {}
     current_group = {}
     current_group_size = 0
     group_index = 1

-    for dir_name, dir_files in dir_groups.items():
+    for dir_name, dir_files in sorted_dirs:

This ensures larger directories are placed first, potentially leading to more balanced groups.

dpti/workflows/service/job_executor.py (1)

35-35: Remove unnecessary pass statement.

The pass statement is redundant when the method already has a body.

-        pass
dpti/gdi.py (1)

290-290: Replace debug print with proper logging.

Debug print statements should use the logging module with appropriate log levels instead of direct print statements.

-        print(f"make_dpdt: mdata: {mdata=}")
+        import logging
+        logging.debug(f"make_dpdt: mdata: {mdata}")
dpti/workflows/flows/free_energy_ti_line.py (1)

78-81: Consider making the default YAML path configurable.

The hardcoded path to the example file may not be portable across different environments or deployment scenarios.

Consider using environment variables or package resources:

import os
from importlib import resources

def load_default_yaml_like_asdict() -> dict[str, Any]:
    # Option 1: Environment variable with fallback
    yaml_like_file = os.environ.get('DPTI_DEFAULT_YAML', 
        os.path.join(os.path.dirname(__file__), 
                     "../examples/Sn_beta_quicktest/ti_line_beta144xy.yaml"))
    
    # Option 2: Use importlib.resources for package data
    # with resources.files('dpti.workflows.examples.Sn_beta_quicktest').joinpath('ti_line_beta144xy.yaml') as yaml_path:
    #     yaml_like_file = str(yaml_path)
dpti/workflows/simulations/base.py (2)

3-3: Remove commented imports

These commented imports are not used and should be removed to keep the code clean.

-# from typing_extensions import Type, TypedDict
 from abc import ABC, abstractmethod
 from dataclasses import dataclass

-# from builtins import AttributeError
 from datetime import datetime, timezone

Also applies to: 7-7


376-376: Remove debug print statements

Debug print statements should be removed from production code.

 def transfer_matching_fields(from_obj: BaseModel, to_type: Type[BaseModel]) -> Dict:
     to_fields = to_type.model_fields
-    # print(f"transfer_matching_fields: {to_fields=} {from_obj=} {to_type=}")
     model_fields_set = from_obj.model_fields_set
     if isinstance(
         model_fields_set, set
     ):  # pydantic BaseModel subclass constructor-built object
         from_data = from_obj.model_dump()
     elif isinstance(
         model_fields_set, dict
     ):  # pydantic BaseModel subclass method model_construct built object
         from_data = model_fields_set.copy()
     else:
         raise ValueError(
             f"must be a set or dict:{model_fields_set=}" f"debug:{from_obj=} {to_type=}"
         )
-    print(f"{from_data=}")

     # ... rest of the function ...
     
-    print(f"transfer_matching_fields:{value_dict=}")
     return value_dict

Also applies to: 390-390, 414-414

dpti/workflows/service/workflow_decorator.py (2)

42-42: Remove debug print statements from hashing functions

Debug print statements should be removed or replaced with logging at appropriate levels.

     hash = h.hexdigest()
-    print(f"stable_hash: end: {hash=} {args=}")
     return hash


 def hash_objects(*args, hash_algo=_md5, **kwargs) -> Optional[str]:
     """Attempt to hash objects by dumping to simplejson JSON."""
-    print(f"hash_objects begin: {args=} {kwargs=} {hash_algo=}")
     try:
         origin_hash = simplejson.dumps((args, kwargs), for_json=True, sort_keys=True)
     except TypeError as e:
         # print(f"hash_objects: error: {e}")
         print(
             f"hash_objects: (args, kwargs) is not serializable:\n"
             f"{args=}"
             f"{kwargs=}"
         )
         raise e

-    print(f"hash_objects:origin_hash: {origin_hash=}")
     hash = stable_hash(origin_hash, hash_algo=hash_algo)

-    print(f"hash_objects: end: {hash=} {args=} {kwargs=}")
     return hash

Also applies to: 54-54, 66-66, 73-73


155-155: Replace print statements with logging

Debug print statements should use proper logging instead.

+import logging
+
+logger = logging.getLogger(__name__)
 # ... other imports ...

             # Override defaults with custom task_kwargs
             default_task_kwargs.update(task_kwargs)
-            print(f"@workflow_task init: {default_task_kwargs=}")
+            logger.debug(f"@workflow_task init: {default_task_kwargs=}")

                 @functools.wraps(original_method)
                 def wrapped_method(self, *args, **kwargs):
-                    print(
-                        f"Running {method_name} with settings: {final_method_settings}"
-                    )
+                    logger.debug(f"Running {method_name} with settings: {final_method_settings}")
                     return original_method(self, *args, **kwargs)

         @functools.wraps(func)
         def wrapped_function(*args, **kwargs):
-            print(f"Running task: {final_function_settings['name']}")
+            logger.debug(f"Running task: {final_function_settings['name']}")
             return func(*args, **kwargs)

             # Override defaults with custom task_kwargs
             default_task_kwargs.update(task_kwargs)
-            print(f"@workflow_task init: {default_task_kwargs=}")
+            logger.debug(f"@workflow_task init: {default_task_kwargs=}")

Also applies to: 191-191, 233-233, 278-278

dpti/workflows/service/file_handler.py (2)

169-170: Use logging for exception messages

Exception messages should use proper logging instead of print statements.

+import logging
+
+logger = logging.getLogger(__name__)
 # ... other imports ...

         except Exception as e:
-            print(f"LocalFileHandler: An exception occurred in chdir_context: {e}")
+            logger.error(f"LocalFileHandler: An exception occurred in chdir_context: {e}")
             raise e

         except Exception as e:
-            print(f"LocalFileHandler: An exception occurred in job_dir_context: {e}")
+            logger.error(f"LocalFileHandler: An exception occurred in job_dir_context: {e}")
             raise e

         except Exception as e:
-            print(f"LocalFileHandler: An exception occurred in subdir_context: {e}")
+            logger.error(f"LocalFileHandler: An exception occurred in subdir_context: {e}")
             raise e

Also applies to: 195-196, 219-220


184-186: Consider using pathlib for better path handling

The code uses string concatenation for paths. Consider using pathlib for more robust path handling.

+from pathlib import Path
 # ... other imports ...

             # Update the job_dirname and job_dir
             self.job_dirname = job_dirname
-            self.job_dir = os.path.join(self.flow_running_dir, job_dirname)
-            print(
-                f"LocalFileHandler: Entering job context {self.flow_running_dir=} {job_dirname=} {self.job_dir=}"
-            )
+            self.job_dir = str(Path(self.flow_running_dir) / job_dirname)
+            logger.debug(
+                f"LocalFileHandler: Entering job context {self.flow_running_dir=} {job_dirname=} {self.job_dir=}"
+            )

             # Update only the job_dir
-            subjob_dir = os.path.join(ori_job_dir, subjob_dirname)
+            subjob_dir = str(Path(ori_job_dir) / subjob_dirname)
             self.job_dir = subjob_dir
-            print(
-                f"LocalFileHandler: Entering subjob_dir context {ori_job_dir=} {subjob_dirname=} {self.job_dir=}"
-            )
+            logger.debug(
+                f"LocalFileHandler: Entering subjob_dir context {ori_job_dir=} {subjob_dirname=} {self.job_dir=}"
+            )

Also applies to: 212-213

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 483544d and c05c9fd.

📒 Files selected for processing (8)
  • .gitignore (1 hunks)
  • dpti/aitools/codebase/collect_py_files_context.py (1 hunks)
  • dpti/gdi.py (5 hunks)
  • dpti/workflows/flows/free_energy_ti_line.py (1 hunks)
  • dpti/workflows/service/file_handler.py (1 hunks)
  • dpti/workflows/service/job_executor.py (1 hunks)
  • dpti/workflows/service/workflow_decorator.py (1 hunks)
  • dpti/workflows/simulations/base.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🧰 Additional context used
🧬 Code Graph Analysis (2)
dpti/workflows/service/file_handler.py (1)
dpti/workflows/flows/free_energy_ti_line.py (3)
  • flow_trigger_dir (113-115)
  • flow_running_dir (159-163)
  • flow_running_dirname (119-130)
dpti/workflows/simulations/base.py (5)
dpti/workflows/service/workflow_decorator.py (7)
  • task_input_json_hash (77-98)
  • workflow_task (242-288)
  • decorator (243-286)
  • wrapper (244-284)
  • task_wrapped (281-282)
  • run (314-317)
  • extract (319-322)
dpti/workflows/prefect/prefect_task_hash.py (1)
  • task_input_json_hash (60-81)
dpti/workflows/service/di.py (2)
  • context_inject (36-90)
  • wrapper (59-88)
dpti/workflows/service/workflow_service_module.py (2)
  • BasicWorkflowServices (33-36)
  • WorkflowServices (23-24)
dpti/workflows/service/file_handler.py (2)
  • upload_file (47-50)
  • upload_file (139-146)
🪛 Ruff (0.11.9)
dpti/gdi.py

246-246: Use format specifiers instead of percent format

(UP031)


247-247: Use format specifiers instead of percent format

(UP031)

dpti/workflows/flows/free_energy_ti_line.py

6-11: typing.Dict is deprecated, use dict instead

(UP035)


77-77: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


97-97: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


166-166: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/service/workflow_decorator.py

6-6: typing.Dict is deprecated, use dict instead

(UP035)


77-77: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/workflows/service/file_handler.py

7-7: Import from collections.abc instead: Iterator

Import from collections.abc

(UP035)


7-7: typing.List is deprecated, use list instead

(UP035)


24-24: Use list instead of List for type annotation

Replace with list

(UP006)


25-25: Use list instead of List for type annotation

Replace with list

(UP006)


38-38: Use list instead of List for type annotation

Replace with list

(UP006)


38-38: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


82-82: Use list instead of List for type annotation

Replace with list

(UP006)


149-149: Use list instead of List for type annotation

Replace with list

(UP006)


149-149: Use list instead of List for type annotation

Replace with list

(UP006)


266-266: Use format specifiers instead of percent format

(UP031)

dpti/workflows/simulations/base.py

10-21: typing.Dict is deprecated, use dict instead

(UP035)


10-21: typing.List is deprecated, use list instead

(UP035)


10-21: typing.Type is deprecated, use type instead

(UP035)


41-41: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


74-74: Use list instead of List for type annotation

Replace with list

(UP006)


119-119: Redefinition of unused workflow_task from line 28

(F811)


191-191: Use list instead of List for type annotation

Replace with list

(UP006)


192-192: Use list instead of List for type annotation

Replace with list

(UP006)


200-200: Use list instead of List for type annotation

Replace with list

(UP006)


200-200: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)


202-202: Use type instead of Type for type annotation

Replace with type

(UP006)


208-208: Use type instead of Type for type annotation

Replace with type

(UP006)


209-209: Use type instead of Type for type annotation

Replace with type

(UP006)


283-283: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


298-298: Use list instead of List for type annotation

Replace with list

(UP006)


317-317: Use list instead of List for type annotation

Replace with list

(UP006)


318-318: Use list instead of List for type annotation

Replace with list

(UP006)


319-319: Use list instead of List for type annotation

Replace with list

(UP006)


374-374: Use type instead of Type for type annotation

Replace with type

(UP006)


374-374: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

dpti/aitools/codebase/collect_py_files_context.py

6-6: typing.Dict is deprecated, use dict instead

(UP035)


9-9: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


46-46: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


47-47: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


47-47: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


96-96: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


96-96: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

🔇 Additional comments (2)
dpti/gdi.py (2)

300-300: Good improvement in backward file pattern.

Changing from "out.lmp" to "*.lmp" makes the file collection more flexible and ensures all LAMMPS output files are captured.


415-415: Good addition of explicit type casting.

The explicit float casting ensures type safety when accessing array elements, preventing potential type-related issues.

Also applies to: 432-432

from typing import Dict


def collect_python_files(directory: str) -> Dict[str, str]:

🛠️ Refactor suggestion

Replace deprecated Dict with built-in dict type.

Static analysis correctly identifies that typing.Dict is deprecated. Update all type annotations to use the built-in dict type.

-def collect_python_files(directory: str) -> Dict[str, str]:
+def collect_python_files(directory: str) -> dict[str, str]:

-def group_by_directory(
-    files: Dict[str, str], max_files_per_group: int = 20
-) -> Dict[str, Dict[str, str]]:
+def group_by_directory(
+    files: dict[str, str], max_files_per_group: int = 20
+) -> dict[str, dict[str, str]]:

-def save_grouped_json(groups: Dict[str, Dict[str, str]], output_base: str) -> None:
+def save_grouped_json(groups: dict[str, dict[str, str]], output_base: str) -> None:

Also applies to: 46-47, 96-96

🧰 Tools
🪛 Ruff (0.11.9)

9-9: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

🤖 Prompt for AI Agents
In dpti/aitools/codebase/collect_py_files_context.py at lines 9, 46-47, and 96,
replace all occurrences of the deprecated typing.Dict type annotation with the
built-in dict type. Update the function signatures and any variable annotations
accordingly to use dict instead of Dict for better compatibility and to adhere
to current Python typing standards.

import argparse
import json
from pathlib import Path
from typing import Dict

🛠️ Refactor suggestion

Update deprecated type annotations.

The typing.Dict import is deprecated in favor of the built-in dict type (Python 3.9+).

-from typing import Dict

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.11.9)

6-6: typing.Dict is deprecated, use dict instead

(UP035)

🤖 Prompt for AI Agents
In dpti/aitools/codebase/collect_py_files_context.py at line 6, replace the
import of the deprecated typing.Dict with the built-in dict type. Remove the
line importing Dict from typing and update any type annotations using Dict to
use the built-in dict instead.

Comment on lines +104 to +107
    output_base = Path(output_base)
    base_name = output_base.stem
    parent_dir = output_base.parent


🛠️ Refactor suggestion

Add validation for output directory writability.

The function doesn't verify that the output directory exists or is writable before attempting to save files.

     output_base = Path(output_base)
     base_name = output_base.stem
     parent_dir = output_base.parent
+    
+    # Ensure output directory exists and is writable
+    try:
+        parent_dir.mkdir(parents=True, exist_ok=True)
+    except Exception as e:
+        print(f"Error creating output directory {parent_dir}: {e}")
+        return
🤖 Prompt for AI Agents
In dpti/aitools/codebase/collect_py_files_context.py around lines 104 to 107,
add validation to check if the output directory exists and is writable before
proceeding. Use appropriate methods to verify the directory's existence and
permissions, and handle cases where the directory is missing or not writable by
raising an error or creating the directory as needed.

Comment on lines +246 to +247
        conf_0 = os.path.join("database", "task.%06d" % min_idx, "0", "final.lmp")
        conf_1 = os.path.join("database", "task.%06d" % min_idx, "1", "final.lmp")

🛠️ Refactor suggestion

Use f-strings instead of percent formatting.

Replace the outdated percent formatting with f-strings for better readability and consistency with modern Python practices.

-        conf_0 = os.path.join("database", "task.%06d" % min_idx, "0", "final.lmp")
-        conf_1 = os.path.join("database", "task.%06d" % min_idx, "1", "final.lmp")
+        conf_0 = os.path.join("database", f"task.{min_idx:06d}", "0", "final.lmp")
+        conf_1 = os.path.join("database", f"task.{min_idx:06d}", "1", "final.lmp")
🧰 Tools
🪛 Ruff (0.11.9)

246-246: Use format specifiers instead of percent format

(UP031)


247-247: Use format specifiers instead of percent format

(UP031)

🤖 Prompt for AI Agents
In dpti/gdi.py at lines 246 to 247, replace the percent (%) string formatting
used in the os.path.join calls with f-strings. Change the format from
"task.%06d" % min_idx to an f-string like f"task.{min_idx:06d}" to improve
readability and align with modern Python conventions.

Comment on lines +6 to +11
from typing import (
    Any,
    Dict,
    Optional,
    Union,
)

🛠️ Refactor suggestion

Use built-in dict instead of Dict from typing.

Replace the deprecated Dict import with the built-in dict type for cleaner, more modern type annotations.

 from typing import (
     Any,
-    Dict,
     Optional,
     Union,
 )

And update the type annotations:

-def load_default_yaml_like_asdict() -> Dict[str, Any]:
+def load_default_yaml_like_asdict() -> dict[str, Any]:
-    domain_input_yaml_like: Union[Dict[str, Any], Path, str] = Field(
+    domain_input_yaml_like: Union[dict[str, Any], Path, str] = Field(
-    def base_thermo_updates(self) -> Dict[str, Any]:
+    def base_thermo_updates(self) -> dict[str, Any]:

Also applies to: 77-77, 97-97, 166-166

🧰 Tools
🪛 Ruff (0.11.9)

6-11: typing.Dict is deprecated, use dict instead

(UP035)

🤖 Prompt for AI Agents
In dpti/workflows/flows/free_energy_ti_line.py at lines 6 to 11 and also at
lines 77, 97, and 166, replace the imported `Dict` from typing with the built-in
`dict` type. Remove `Dict` from the import statement and update all type
annotations using `Dict` to use `dict` instead for modern and cleaner type
hints.

Comment on lines +171 to +177
entity_T = TypeVar("entity_T", bound=BaseModel)
SettingsType = TypeVar("SettingsType", bound=BaseModel)

NodeDataType = TypeVar("NodeDataType", bound=BaseModel)
# call_T = TypeVar('call_T', bound=Union[BaseModel, Dict[str, Any]])
ReturnType = TypeVar("ReturnType")


⚠️ Potential issue

Remove duplicate type variable definitions

These type variables are already defined on lines 39-44. The duplicates should be removed to avoid confusion.

-entity_T = TypeVar("entity_T", bound=BaseModel)
-SettingsType = TypeVar("SettingsType", bound=BaseModel)
-
-NodeDataType = TypeVar("NodeDataType", bound=BaseModel)
-# call_T = TypeVar('call_T', bound=Union[BaseModel, Dict[str, Any]])
-ReturnType = TypeVar("ReturnType")
-
-
 # %%
🤖 Prompt for AI Agents
In dpti/workflows/simulations/base.py between lines 171 and 177, remove the
duplicate definitions of the type variables entity_T, SettingsType,
NodeDataType, and ReturnType, as they are already defined earlier in the file
around lines 39 to 44. This will prevent confusion and maintain a single source
of truth for these type variables.

    # startup_entity: entity_T
    # runtime_entity: entity_T
    runtime_nodedata: NodeDataType
    all_prepared_paths: List[str] = []

⚠️ Potential issue

Use ClassVar for mutable class attributes

Mutable class attributes should be annotated with typing.ClassVar to prevent unintended sharing between instances.

+from typing import ClassVar
 # ... other imports ...

 class SimulationBase(
     Generic[NodeDataType, SettingsType, ReturnType],
     metaclass=SimulationBaseMeta,
 ):
     # ... other attributes ...
-    all_prepared_paths: List[str] = []
+    all_prepared_paths: ClassVar[List[str]] = []
🧰 Tools
🪛 Ruff (0.11.9)

200-200: Use list instead of List for type annotation

Replace with list

(UP006)


200-200: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

🤖 Prompt for AI Agents
In dpti/workflows/simulations/base.py at line 200, the mutable class attribute
all_prepared_paths is currently typed as List[str], which can lead to unintended
sharing between instances. To fix this, import ClassVar from typing and annotate
all_prepared_paths as ClassVar[List[str]] to indicate it is a class-level
variable and avoid instance-level mutation issues.

Comment on lines +119 to +166
def workflow_task(method_name: str, **task_kwargs):
    def decorator(func):
        def wrapper(
            *args, **kwargs
        ):  # args[0] is the instance object(self), like class NPTEquiSimulation's instance.
            if args:
                class_name = args[
                    0
                ].__class__.__name__  # get real class name like, `NPTEquiSimulation`
                task_name = f"{class_name}_{method_name}"

                # Check if cache refresh is needed
                instance = args[0]
                force_refresh_tasks = getattr(instance, "force_refresh_cache", [])
                should_refresh = (
                    REFRESH_CACHE  # Global refresh setting
                    or method_name in force_refresh_tasks  # Task-specific force refresh
                )
            else:
                # Fallback if no instance is provided
                class_name = func.__qualname__.split(".")[0]
                task_name = f"{class_name}_{method_name}"
                should_refresh = REFRESH_CACHE

            # Merge default kwargs with custom kwargs
            default_task_kwargs = {
                "name": task_name,
                "task_run_name": lambda: (
                    f"{task_name}-{datetime.now(timezone.utc).strftime('UTC%z_%Y%m%d_%H%M%S_%f')}"
                ),
                "cache_key_fn": task_input_json_hash,
                "persist_result": True,
                "refresh_cache": should_refresh,
            }
            # Override defaults with custom task_kwargs
            default_task_kwargs.update(task_kwargs)
            print(f"@workflow_task init: {default_task_kwargs=}")

            @task(**default_task_kwargs)
            def task_wrapped(*task_args, **task_kwargs):
                return func(*task_args, **task_kwargs)

            return task_wrapped(*args, **kwargs)

        return wrapper

    return decorator


⚠️ Potential issue

Remove duplicate workflow_task function

This function is already imported from dpti.workflows.service.workflow_decorator on line 28. The redefinition causes a naming conflict and should be removed.

-# %%
-def workflow_task(method_name: str, **task_kwargs):
-    def decorator(func):
-        def wrapper(
-            *args, **kwargs
-        ):  # args[0] is the instance object(self), like class NPTEquiSimulation's instance.
-            if args:
-                class_name = args[
-                    0
-                ].__class__.__name__  # get real class name like, `NPTEquiSimulation`
-                task_name = f"{class_name}_{method_name}"
-
-                # Check if cache refresh is needed
-                instance = args[0]
-                force_refresh_tasks = getattr(instance, "force_refresh_cache", [])
-                should_refresh = (
-                    REFRESH_CACHE  # Global refresh setting
-                    or method_name in force_refresh_tasks  # Task-specific force refresh
-                )
-            else:
-                # Fallback if no instance is provided
-                class_name = func.__qualname__.split(".")[0]
-                task_name = f"{class_name}_{method_name}"
-                should_refresh = REFRESH_CACHE
-
-            # Merge default kwargs with custom kwargs
-            default_task_kwargs = {
-                "name": task_name,
-                "task_run_name": lambda: (
-                    f"{task_name}-{datetime.now(timezone.utc).strftime('UTC%z_%Y%m%d_%H%M%S_%f')}"
-                ),
-                "cache_key_fn": task_input_json_hash,
-                "persist_result": True,
-                "refresh_cache": should_refresh,
-            }
-            # Override defaults with custom task_kwargs
-            default_task_kwargs.update(task_kwargs)
-            print(f"@workflow_task init: {default_task_kwargs=}")
-
-            @task(**default_task_kwargs)
-            def task_wrapped(*task_args, **task_kwargs):
-                return func(*task_args, **task_kwargs)
-
-            return task_wrapped(*args, **kwargs)
-
-        return wrapper
-
-    return decorator
-
-
 # %%
🧰 Tools
🪛 Ruff (0.11.9)

119-119: Redefinition of unused workflow_task from line 28

(F811)

🤖 Prompt for AI Agents
In dpti/workflows/simulations/base.py between lines 119 and 166, remove the
entire redefinition of the workflow_task function since it is already imported
from dpti.workflows.service.workflow_decorator on line 28. This will eliminate
the naming conflict caused by the duplicate definition.

Comment on lines +94 to +116
class SimulationBaseMeta(type):
    def __new__(mcs, name, bases, namespace, **kwargs):
        cls = super().__new__(mcs, name, bases, namespace)
        orig_base = cls.__orig_bases__[0]  # pyright: ignore[reportAttributeAccessIssue]
        type_args = get_args(orig_base)
        cls.nodedata_type = type_args[0]  # pyright: ignore[reportAttributeAccessIssue]
        cls.init_type = type_args[1]  # pyright: ignore[reportAttributeAccessIssue]
        # cls.call_type = type_args[2] # pyright: ignore[reportAttributeAccessIssue]
        cls.return_type = type_args[2]  # pyright: ignore[reportAttributeAccessIssue]

        # for method_name, method in namespace.items():
        #     if hasattr(method, '_workflow_task_name'):
        #         task_name = f"{name}_{method._workflow_task_name}"
        #         decorated_method = task(
        #             name=task_name,
        #             cache_key_fn=task_input_json_hash,
        #             persist_result=True,
        #             refresh_cache=REFRESH_CACHE
        #         )(method)
        #         setattr(cls, method_name, decorated_method)
        # cls._class_name = name
        return cls


⚠️ Potential issue

Add error handling for metaclass type extraction

The metaclass accesses __orig_bases__[0] without checking if it exists, which could raise an AttributeError. Also, get_args() might return an empty tuple.

 class SimulationBaseMeta(type):
     def __new__(mcs, name, bases, namespace, **kwargs):
         cls = super().__new__(mcs, name, bases, namespace)
-        orig_base = cls.__orig_bases__[0]  # pyright: ignore[reportAttributeAccessIssue]
-        type_args = get_args(orig_base)
-        cls.nodedata_type = type_args[0]  # pyright: ignore[reportAttributeAccessIssue]
-        cls.init_type = type_args[1]  # pyright: ignore[reportAttributeAccessIssue]
-        # cls.call_type = type_args[2] # pyright: ignore[reportAttributeAccessIssue]
-        cls.return_type = type_args[2]  # pyright: ignore[reportAttributeAccessIssue]
+        if hasattr(cls, '__orig_bases__') and cls.__orig_bases__:
+            orig_base = cls.__orig_bases__[0]
+            type_args = get_args(orig_base)
+            if len(type_args) >= 3:
+                cls.nodedata_type = type_args[0]
+                cls.init_type = type_args[1]
+                cls.return_type = type_args[2]
+            else:
+                raise TypeError(f"Class {name} must be instantiated with exactly 3 type arguments")
+        else:
+            # Skip type extraction for the base class itself
+            pass

-        # for method_name, method in namespace.items():
-        #     if hasattr(method, '_workflow_task_name'):
-        #         task_name = f"{name}_{method._workflow_task_name}"
-        #         decorated_method = task(
-        #             name=task_name,
-        #             cache_key_fn=task_input_json_hash,
-        #             persist_result=True,
-        #             refresh_cache=REFRESH_CACHE
-        #         )(method)
-        #         setattr(cls, method_name, decorated_method)
-        # cls._class_name = name
-
         return cls
🤖 Prompt for AI Agents
In dpti/workflows/simulations/base.py around lines 94 to 116, the metaclass
SimulationBaseMeta accesses __orig_bases__[0] and uses get_args() without
checks, which can cause AttributeError or IndexError if __orig_bases__ is
missing or get_args returns empty. Add error handling to verify __orig_bases__
exists and is non-empty before accessing [0], and check that get_args(orig_base)
returns the expected number of type arguments before assigning them to class
attributes. Raise a clear error or handle the case gracefully if these
conditions are not met.

Comment on lines +260 to +273
    def create_path_with_backup(path: str) -> str:
        path += "/"
        if os.path.isdir(path):
            dirname = os.path.dirname(path)
            counter = 0
            while True:
                bk_dirname = dirname + ".bk%03d" % counter
                if not os.path.isdir(bk_dirname):
                    shutil.move(dirname, bk_dirname)
                    break
                counter += 1
        os.makedirs(path)
        abs_path = os.path.abspath(path)
        return abs_path

⚠️ Potential issue

Improve path handling and string formatting

The function modifies the input path by appending "/" which could cause issues. Also, use f-strings instead of percent formatting.

     @staticmethod
     def create_path_with_backup(path: str) -> str:
-        path += "/"
+        # Normalize path without modifying the input
+        normalized_path = os.path.normpath(path)
-        if os.path.isdir(path):
-            dirname = os.path.dirname(path)
+        if os.path.isdir(normalized_path):
             counter = 0
             while True:
-                bk_dirname = dirname + ".bk%03d" % counter
+                bk_dirname = f"{normalized_path}.bk{counter:03d}"
                 if not os.path.isdir(bk_dirname):
-                    shutil.move(dirname, bk_dirname)
+                    shutil.move(normalized_path, bk_dirname)
                     break
                 counter += 1
-        os.makedirs(path)
-        abs_path = os.path.abspath(path)
+        os.makedirs(normalized_path, exist_ok=True)
+        abs_path = os.path.abspath(normalized_path)
         return abs_path
🧰 Tools
🪛 Ruff (0.11.9)

266-266: Use format specifiers instead of percent format

(UP031)

🤖 Prompt for AI Agents
In dpti/workflows/service/file_handler.py around lines 260 to 273, avoid
appending "/" directly to the input path as it may cause incorrect path
handling; instead, use os.path.join or ensure the path is normalized properly.
Replace the percent formatting used in the backup directory naming with
f-strings for clearer and more modern string formatting.
