[POC] Metal via tvm-ffi #18634
Conversation
…p setup.py by removing unused import
…command for tvm_cython target
…of BoolOp nodes, improving code clarity.
… details in the error message for better debugging context.
…ity and maintainability.
…ctRef> to Map<String, Any> for improved flexibility.
… compatibility." This reverts commit 9574805.
…ing variable naming consistency.
… Map<String, Any> for enhanced flexibility in handling annotations.
…ith various attributes for enhanced GPU compatibility (apache#7) Co-authored-by: xinyxiao <[email protected]>
* Add tilelang assume attribute to support custom assumption * Add constraint guard in IRMutator
* Add tilelang assume attribute to support custom assumption * Add constraint guard in IRMutator * Fix typo in IR mutator
… improved clarity and efficiency.
- Added support for processing container types like Array that may contain Vars, Buffers, Exprs, and Stmts within the IRConvertSSA class.
- Implemented logic to rewrite elements in the container, ensuring proper remapping of variables and buffers.
- Improved the mutator's ability to detect changes in the container, updating the value accordingly.
* fix z3 for macos * upd
- Introduced a mechanism to track visiting variables using an unordered set to prevent infinite loops during evaluation.
- Added comments to clarify the purpose of the new logic for detecting cycles in variable dependencies.
…ammatic Dependent Launch and cuLaunchCooperativeKernel (apache#18)
* [CUDA][FFI] Add support for Programmatic Dependent Kernel Launch (PDL) in TVM CUDA FFI
* tir: add launch param tag for programmatic dependent launch
* tir: add param tag for cuLaunchCooperativeKernel
Co-authored-by: senhtry <[email protected]>
- Introduced an `annotations` field in the `CallNode` class to store additional metadata for lowering passes.
- Updated the `Call` constructor and related methods to accept and handle the new `annotations` parameter.
- Modified existing calls to `Call` to include the `annotations` argument where applicable, ensuring backward compatibility.
- Enhanced the Python interface for the `Call` class to support annotations, improving usability for users needing to pass extra information during function calls.
Summary of Changes: This pull request significantly enhances TVM's arithmetic analysis capabilities by integrating the Z3 SMT solver, allowing for more sophisticated symbolic proofs and constraint satisfaction.
Code Review
This pull request introduces significant new functionality, most notably the integration of the Z3 SMT solver for more powerful expression proving within the arithmetic analyzer. It also includes changes to support Metal integration via tvm-ffi, likely for interoperability with frameworks like PyTorch on Apple Silicon. Beyond these major features, there are numerous improvements and fixes across the codebase, including enhanced boolean simplification, better handling of dynamic shared memory in CUDA, and Python 3.8 compatibility fixes. My review focuses on a few potentially problematic changes that could affect correctness or cross-platform compatibility, such as the removal of validation checks in TIR nodes and platform-specific code that isn't properly guarded.
python/tvm/base.py (45)
The flag os.RTLD_LAZY is only available on POSIX systems, so this line will raise an AttributeError when running on Windows. The change should be guarded by a platform check, for example:
if sys.platform.startswith("win32"):
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_GLOBAL)
else:
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_GLOBAL | os.RTLD_LAZY)
include/tvm/topi/transform.h (1294-1296)
The out-of-bounds value is hardcoded to std::numeric_limits<float>::quiet_NaN(). This will cause issues if the tensor's data type (a->dtype) is not a float (e.g., int, bfloat16). It would be safer to use a type-aware NaN, for example by creating a helper similar to tvm::nan(dtype) that can be used here.
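A minimal sketch of such a type-aware helper, assuming a hypothetical name OutOfBoundsValue and an assumed fallback policy (minimum representable value) for dtypes that cannot represent NaN; this is not an existing TVM API:

#include <tvm/tir/op.h>
#include <limits>

// Hypothetical helper: a dtype-aware "invalid" value for out-of-bounds reads.
inline tvm::PrimExpr OutOfBoundsValue(const tvm::DataType& dtype) {
  if (dtype.is_float()) {
    // Floating-point dtypes can carry a quiet NaN directly.
    return tvm::tir::make_const(dtype, std::numeric_limits<double>::quiet_NaN());
  }
  // Integer (and other non-float) dtypes cannot represent NaN; fall back to the
  // minimum representable value as a sentinel (assumed policy, not from the PR).
  return tvm::min_value(dtype);
}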
src/tir/ir/expr.cc (796-799)
The check that ensures only the last index of a buffer access can be a vector has been commented out. This seems like a significant change in validation logic. If this is intentional to support more general vectorized access patterns, it could have broad implications for downstream passes that might not be prepared to handle this. Could you clarify the reasoning for this change?
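For reference, the validation being described is roughly of the following shape (a sketch of the commented-out check, not the exact code; the message text is illustrative):

// All indices except the last must be scalar, so only the innermost
// dimension of a buffer access may use a vector index.
for (size_t i = 0; i + 1 < indices.size(); ++i) {
  ICHECK(indices[i].dtype().lanes() == 1)
      << "Only the last index of a buffer access may be a vector.";
}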
src/tir/ir/stmt.cc (248-253)
The check IsPointerType(buffer_var->type_annotation, dtype) has been removed from the Allocate constructor. This check verified that the data type of the allocation matches the element type of the pointer variable. Removing it could potentially lead to type mismatches that are harder to debug later. What was the motivation for removing this check?
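The removed check is roughly of this shape (a sketch based on the description above, not the exact constructor code):

// The allocated dtype must agree with the pointer type annotation of buffer_var;
// dropping this allows mismatches such as a float32* variable backing an int8 allocation.
ICHECK(IsPointerType(buffer_var->type_annotation, dtype))
    << "The allocated data type must match the pointee type of buffer_var.";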
src/target/source/codegen_c.cc (936-948)
This change comments out a deep_equal_ check that prevents a Var from being rebound to a different value within a LetNode. Removing this check might hide potential bugs or lead to incorrect code generation if a variable is indeed redefined with a different expression. Could you clarify the reason for removing this check? If it's no longer needed, a comment explaining why would be helpful. Otherwise, it seems safer to keep this assertion.
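The assertion in question is roughly the following (a sketch assuming CodeGenC's let_binding_ map and deep_equal_ comparator as referenced above; the message text is illustrative):

// If the Let variable was already bound, it must be bound to a structurally
// equal expression; otherwise a Var would be silently rebound to a new value.
auto it = let_binding_.find(op->var);
if (it != let_binding_.end()) {
  ICHECK(deep_equal_(it->second->value, op->value))
      << "Let cannot bind the same var to two different values.";
} else {
  let_binding_[op->var] = op;
}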
CMakeLists.txt (796)
This change comments out the FILE_PREFIX_MAP_FLAG. This flag is useful for creating reproducible builds and for debugging by mapping relative source paths to absolute paths in the debug info. Was removing this intentional? If it was for temporary debugging, it should probably be restored.
src/target/z3/z3_prover_on.cc (111)
The rlimit is set using a float literal 1e4. While this will likely be converted correctly to an integer, it's clearer and safer to use an integer literal 10000 for an unsigned integer parameter.
SetRLimit(10000);