
Implementation of RWKV7 #21084

Closed
pass-lin opened this issue Mar 24, 2025 · 3 comments
Assignees
Labels
keras-team-review-pending Pending review by a Keras team member. type:feature The user is asking for a new feature.

Comments

Contributor

pass-lin commented Mar 24, 2025

RWKV7 is currently the strongest RNN-based language model. I want to provide a Keras implementation of it.

We have already implemented some of the basics for it: layers, models, and tokenizers in the keras_hub style. You can see them in this repo.

But now I have a problem. Modern RNN models rely heavily on custom operators; RWKV7, for example, needs the generalized_delta_rule operator. If we implement it with keras.ops alone, efficiency will be very low, so I need to provide a Triton kernel for it to reach reasonable efficiency. The implementation for the Torch backend is complete, and the implementation for the JAX backend can be completed in the near future.
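To illustrate why a pure-ops implementation is slow, here is a schematic NumPy version of a delta-rule-style recurrence. This is a simplified sketch, not the exact RWKV7 generalized_delta_rule (the real operator has additional decay/gating terms): the point is that each timestep updates an O(d×d) state matrix inside a sequential loop, which is exactly the pattern a fused Triton kernel replaces.

```python
import numpy as np

def delta_rule_recurrence(q, k, v, beta):
    """Schematic sequential delta-rule recurrence (simplified; not the
    exact RWKV7 formulation). q, k, v: (T, d); beta: (T,).
    The per-timestep Python loop over a (d, d) state is what makes a
    naive ops implementation slow and motivates a fused kernel."""
    T, d = q.shape
    S = np.zeros((d, d))      # recurrent state matrix
    out = np.zeros((T, d))
    for t in range(T):
        kt, vt, bt = k[t], v[t], beta[t]
        # "delta" update: erase the old association stored under kt,
        # then write the new key->value association, scaled by beta.
        S = S - bt * np.outer(kt, kt @ S) + bt * np.outer(kt, vt)
        out[t] = q[t] @ S     # read the state with the query
    return out
```

A fused kernel keeps `S` in fast on-chip memory across timesteps instead of materializing it at every step, which is where most of the speedup comes from.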

But keras_hub does not seem to be the right place to submit this kind of operator. May I submit the generalized_delta_rule operator to keras.ops, where the Torch and JAX backends use Triton on GPU, and all other cases fall back to a pure keras.ops implementation?

If you think it is not appropriate to add the jax-triton dependency for the JAX backend, we could consider adding the Triton implementation only for the Torch backend. In recent versions of Torch, Triton is already a base dependency, so using Triton operators in the Torch backend adds almost no extra burden.
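The dispatch policy proposed above could be sketched as follows. To keep the example self-contained, the backend query and both implementations are injected as parameters; in actual Keras code the check would be `keras.backend.backend() == "torch"`, and all names here are illustrative, not an existing API.

```python
def generalized_delta_rule(inputs, *, backend, cuda_available,
                           triton_impl, ops_impl):
    """Illustrative dispatch for the proposed operator.

    backend        -- name of the active Keras backend (in real code:
                      keras.backend.backend())
    triton_impl    -- fused Triton kernel (Torch backend, GPU only)
    ops_impl       -- portable pure-ops fallback for all other cases
    """
    if backend == "torch" and cuda_available:
        # Triton ships with recent torch releases, so this path adds
        # no extra dependency on the Torch backend.
        return triton_impl(inputs)
    return ops_impl(inputs)
```

This keeps the public operator backend-agnostic: JAX, TensorFlow, and CPU-only Torch users silently get the fallback, while CUDA Torch users get the fused kernel.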

And I will contribute the other parts, such as the model and layers, to keras_hub.

pass-lin (Contributor, Author) commented:

@fchollet

@dhantule dhantule added type:feature The user is asking for a new feature. keras-team-review-pending Pending review by a Keras team member. labels Mar 24, 2025
VarunS1997 (Collaborator) commented:

There doesn't seem to be a lot of traction behind this feature, and we also suspect the JAX implementation of this custom operation would require some fairly specialized knowledge.

We encourage you to implement these changes in your own library, but we don't believe it would be a good fit for Keras at this time.

@pass-lin pass-lin reopened this Mar 27, 2025

pass-lin commented Mar 27, 2025

> There doesn't seem to be a lot of traction behind this feature, and we also suspect the JAX implementation of this custom operation would require some fairly specialized knowledge.
>
> We encourage you to implement these changes in your own library, but we don't believe it would be a good fit for Keras at this time.

The main question is how the kernel should be handled: can keras_hub accept the submission of the kernel?
If not, then I may only be able to use it in my own library.

@mattdangerw


4 participants