@oraluben oraluben commented Jan 5, 2026

No description provided.

Hzfengsy and others added 30 commits June 24, 2025 10:42
… details in the error message for better debugging context.
…ctRef> to Map<String, Any> for improved flexibility.
… Map<String, Any> for enhanced flexibility in handling annotations.
…ith various attributes for enhanced GPU compatibility (apache#7)

Co-authored-by: xinyxiao <[email protected]>
* Add tilelang assume attribute to support custom assumption

* Add constraint guard in IRMutator
* Add tilelang assume attribute to support custom assumption

* Add constraint guard in IRMutator

* Fix typo in IR mutator
LeiWang1999 and others added 27 commits December 14, 2025 16:12
- Added support for processing container types like Array that may contain Vars, Buffers, Exprs, and Stmts within the IRConvertSSA class.
- Implemented logic to rewrite elements in the container, ensuring proper remapping of variables and buffers.
- Improved the mutator's ability to detect changes in the container, updating the value accordingly.
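The change-detection idea in the notes above can be sketched in a few lines. This is a toy model only, not the IRConvertSSA code: `remap_array` and the string "variables" are hypothetical names chosen for illustration. It returns the original container untouched when no element was remapped.

```python
def remap_array(items: tuple, remap: dict) -> tuple:
    """Rewrite each element through `remap`; reuse `items` if nothing changed."""
    changed = False
    rewritten = []
    for item in items:
        new_item = remap.get(item, item)
        if new_item is not item:
            # At least one element was remapped, so a new container is needed.
            changed = True
        rewritten.append(new_item)
    return tuple(rewritten) if changed else items


original = ("x", "y")
untouched = remap_array(original, {})
renamed = remap_array(original, {"x": "x_1"})
```

Returning the same object when nothing changed lets a caller use an identity check to decide whether the enclosing node has to be rebuilt.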
* fix z3 for macos

* upd
- Introduced a mechanism to track visiting variables using an unordered set to prevent infinite loops during evaluation.
- Added comments to clarify the purpose of the new logic for detecting cycles in variable dependencies.
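A minimal sketch of the visiting-set technique described above, assuming a toy binding table where each name maps either to an integer or to a tuple of names whose values are summed. The function name `evaluate` and the data layout are assumptions for illustration, not the code from this commit.

```python
def evaluate(var, bindings, visiting=None):
    """Evaluate `var`, raising ValueError if its dependencies form a cycle."""
    if visiting is None:
        visiting = set()
    if var in visiting:
        # We re-entered a variable that is still being evaluated: a cycle.
        raise ValueError(f"cyclic dependency involving {var!r}")
    visiting.add(var)
    try:
        value = bindings[var]
        if isinstance(value, int):
            return value
        return sum(evaluate(dep, bindings, visiting) for dep in value)
    finally:
        # Leaving this variable: it may legitimately be visited again later.
        visiting.discard(var)


acyclic_result = evaluate("a", {"a": ("b", "c"), "b": 2, "c": ("b",)})
```

Removing the variable on the way out matters: `b` is reached twice in the example, once directly and once through `c`, and that is sharing rather than a cycle.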
…ammatic Dependent Launch and cuLaunchCooperativeKernel (apache#18)

* [CUDA][FFI] Add support for Programmatic Dependent Kernel Launch (PDL) in TVM CUDA FFI

* tir: add launch param tag for programmatic dependent launch

* tir: add param tag for cuLaunchCooperativeKernel

---------

Co-authored-by: senhtry <[email protected]>
- Introduced an `annotations` field in the `CallNode` class to store additional metadata for lowering passes.
- Updated the `Call` constructor and related methods to accept and handle the new `annotations` parameter.
- Modified existing calls to `Call` to include the `annotations` argument where applicable, ensuring backward compatibility.
- Enhanced the Python interface for the `Call` class to support annotations, improving usability for users needing to pass extra information during function calls.
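One way to add such a field without breaking existing call sites is to give it an empty default. The sketch below is a stand-alone toy class, not the real `tir.Call`; the name `ToyCall`, the field layout, and the `"hint"` key are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class ToyCall:
    op: str
    args: Tuple[Any, ...]
    # Optional metadata for later passes; defaults to empty so that
    # existing two-argument constructions keep working unchanged.
    annotations: Dict[str, Any] = field(default_factory=dict)


plain = ToyCall("add", (1, 2))
annotated = ToyCall("copy", ("A", "B"), {"hint": "async"})
```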
@oraluben oraluben closed this Jan 5, 2026
@gemini-code-assist

Summary of Changes

Hello @oraluben, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances TVM's arithmetic analysis capabilities by integrating the Z3 SMT solver, allowing for more sophisticated symbolic proofs and constraint satisfaction. It also improves the Analyzer's state management through deep cloning and expands Python compatibility to version 3.8. Core TIR expressions like Call now support annotations for richer metadata, and scheduling primitives like ReIndex offer more control over simplification. Additionally, the PR includes improvements to C host code generation for better debugging, and introduces advanced features for Metal and CUDA runtime environments.

Highlights

  • Z3 SMT Solver Integration: Introduced a new Z3Prover class and CMake options to integrate the Z3 SMT solver into the arithmetic analyzer, enabling more powerful symbolic proof capabilities for expressions.
  • Analyzer Deep Cloning: The Analyzer class now supports deep cloning, allowing its entire internal state, including all sub-analyzers, to be copied, which is crucial for advanced analysis scenarios.
  • Python 3.8 Compatibility: The minimum Python requirement for TVM has been lowered from 3.9 to 3.8, broadening the supported development environments.
  • TIR Call Annotations: The tir.Call expression now includes an annotations field, allowing arbitrary metadata to be attached to calls, which can be leveraged by various lowering passes.
  • ReIndex Simplification Control: The ReIndex schedule primitive gains a new skip_simplify parameter, providing finer control over whether indices are simplified during reindexing operations.
  • Enhanced C Host Assert Messages: Generated C host code now includes actual left-hand side and right-hand side values in assert messages for equality checks, significantly improving debugging clarity.
  • Metal Stream Integration: New CBStream and SetMetalStream functionalities have been added to improve integration with Metal command buffers, potentially benefiting frameworks like PyTorch MPS.
  • Advanced CUDA Launch Configurations: The CUDA kernel launch mechanism has been updated to support programmatic dependent launch and cooperative launch, alongside more robust handling of dynamic shared memory.
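To illustrate why deep cloning matters for an analyzer that owns sub-analyzers, here is a toy model in Python. It is only a sketch: `ToyAnalyzer`, `SubAnalyzer`, and the `bounds` and `facts` attributes are hypothetical and do not mirror the actual Analyzer class.

```python
import copy


class SubAnalyzer:
    def __init__(self):
        self.facts = {}


class ToyAnalyzer:
    def __init__(self):
        self.bounds = SubAnalyzer()

    def clone(self):
        # A deep copy duplicates the sub-analyzer state as well, so the
        # clone can record new facts without touching the original.
        return copy.deepcopy(self)


base = ToyAnalyzer()
base.bounds.facts["n"] = (0, 8)
trial = base.clone()
trial.bounds.facts["n"] = (0, 4)
```

With a shallow copy, `trial.bounds` would be the same object as `base.bounds`, and the tighter bound would leak back into `base`.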

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces significant new functionality, most notably the integration of the Z3 SMT solver for more powerful expression proving within the arithmetic analyzer. It also includes changes to support Metal integration via tvm-ffi, likely for interoperability with frameworks like PyTorch on Apple Silicon. Beyond these major features, there are numerous improvements and fixes across the codebase, including enhanced boolean simplification, better handling of dynamic shared memory in CUDA, and Python 3.8 compatibility fixes. My review focuses on a few potentially problematic changes that could affect correctness or cross-platform compatibility, such as the removal of validation checks in TIR nodes and platform-specific code that isn't properly guarded.

I am having trouble creating individual review comments, so my feedback is listed below.

python/tvm/base.py (45)

critical

The flag os.RTLD_LAZY is not available on Windows. This line will raise an AttributeError when running on Windows. This change should be guarded by a platform check, for example if sys.platform != 'win32':.

    if sys.platform.startswith("win32"):
        lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_GLOBAL)
    else:
        lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_GLOBAL | os.RTLD_LAZY)
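An alternative to branching on the platform is to look the flag up defensively. The sketch below assumes a helper named `dlopen_mode`, which is hypothetical and not part of python/tvm/base.py; it relies only on `getattr` with a default of 0 for platforms where `os.RTLD_LAZY` is missing.

```python
import ctypes
import os


def dlopen_mode() -> int:
    """Build the CDLL mode flags without assuming os.RTLD_LAZY exists."""
    # os.RTLD_LAZY is only defined where the platform provides it;
    # falling back to 0 leaves the mode as plain RTLD_GLOBAL.
    lazy = getattr(os, "RTLD_LAZY", 0)
    return ctypes.RTLD_GLOBAL | lazy


mode = dlopen_mode()
```

The result could then be passed as the second argument to `ctypes.CDLL`, in place of the two-branch version above.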

include/tvm/topi/transform.h (1294-1296)

high

The out-of-bounds value is hardcoded to std::numeric_limits<float>::quiet_NaN(). This will cause issues if the tensor's data type (a->dtype) is not a 32-bit float: integer types have no NaN representation at all, and other floating-point types such as bfloat16 need a NaN of their own type. It would be safer to use a type-aware fill value, for example by creating a helper similar to tvm::nan(dtype) that can be used here.
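The shape of a type-aware helper can be sketched in Python. Everything here is an assumption for illustration: the name `oob_fill_value`, the dtype strings, and in particular the choice of 0 as the integer sentinel, which the PR would have to decide for itself.

```python
import math


def oob_fill_value(dtype: str):
    """Pick an out-of-bounds fill value based on a dtype name (toy helper)."""
    if dtype.startswith(("float", "bfloat")):
        return math.nan
    if dtype.startswith(("int", "uint")):
        # Assumption: integers cannot hold NaN, so a sentinel is chosen.
        return 0
    raise ValueError(f"unsupported dtype: {dtype}")


float_fill = oob_fill_value("bfloat16")
int_fill = oob_fill_value("int32")
```

The underlying problem shows up in Python as well: `int(math.nan)` raises ValueError, because there is no integer that represents NaN.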

src/tir/ir/expr.cc (796-799)

high

The check that ensures only the last index of a buffer access can be a vector has been commented out. This seems like a significant change in validation logic. If this is intentional to support more general vectorized access patterns, it could have broad implications for downstream passes that might not be prepared to handle this. Could you clarify the reasoning for this change?
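For reference, the kind of validation being discussed can be modeled as follows. This is a sketch under the assumption that each index is summarized by its lane count; `check_vector_indices` is a hypothetical name, not the code in src/tir/ir/expr.cc.

```python
from typing import Sequence


def check_vector_indices(index_lanes: Sequence[int]) -> None:
    """Raise if any index other than the last has more than one lane."""
    for position, lanes in enumerate(index_lanes[:-1]):
        if lanes != 1:
            raise ValueError(
                f"index {position} has {lanes} lanes; "
                "only the last index may be a vector"
            )


# Scalar leading indices with a 4-lane final index pass the check.
check_vector_indices([1, 1, 4])
```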

src/tir/ir/stmt.cc (248-253)

high

The check IsPointerType(buffer_var->type_annotation, dtype) has been removed from the Allocate constructor. This check verified that the data type of the allocation matches the element type of the pointer variable. Removing it could potentially lead to type mismatches that are harder to debug later. What was the motivation for removing this check?

src/target/source/codegen_c.cc (936-948)

high

This change comments out a deep_equal_ check that prevents a Var from being rebound to a different value within a LetNode. Removing this check might hide potential bugs or lead to incorrect code generation if a variable is indeed redefined with a different expression. Could you clarify the reason for removing this check? If it's no longer needed, a comment explaining why would be helpful. Otherwise, it seems safer to keep this assertion.
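The invariant that such a check protects can be modeled in a few lines. This is a toy sketch: `bind_let` is a hypothetical helper, and nested tuples stand in for expressions so that `==` plays the role of a structural-equality comparison.

```python
def bind_let(bindings: dict, var: str, value) -> None:
    """Record a let-binding, rejecting a rebind to a different value."""
    if var in bindings and bindings[var] != value:
        raise ValueError(
            f"{var!r} is already bound to {bindings[var]!r}, not {value!r}"
        )
    bindings[var] = value


env = {}
bind_let(env, "x", ("add", "a", 1))
# Binding the same variable to a structurally equal value is accepted.
bind_let(env, "x", ("add", "a", 1))
```

Without the comparison, a second binding of `x` to a different expression would silently overwrite the first, which is the failure mode the comment asks about.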

CMakeLists.txt (796)

medium

This change comments out the FILE_PREFIX_MAP_FLAG. Prefix-map flags such as -ffile-prefix-map rewrite the source-path prefixes recorded in debug info and __FILE__ expansions, which helps make builds reproducible and keeps paths in diagnostics independent of the build directory. Was removing this intentional? If it was for temporary debugging, it should probably be restored.

src/target/z3/z3_prover_on.cc (111)

medium

The rlimit is set using a floating-point literal, 1e4 (a double in C++). While this will likely be converted correctly to an integer, it's clearer and safer to use the integer literal 10000 for an unsigned integer parameter.

    SetRLimit(10000);

9 participants