RWKV7 is currently the strongest RNN-based language model, and I would like to provide a Keras implementation of it.
We have already implemented some of the basics for it, such as layers, models, and tokenizers in the keras_hub style. You can see them in this repo.
But now I have a problem. Modern RNN models rely heavily on custom operators; in RWKV7, for example, the generalized_delta_rule operator. If we implement it with pure Keras ops, efficiency will be very low, so I need to provide a Triton kernel for it to reach reasonable performance. The implementation for the Torch backend is already complete, and the JAX backend implementation can be finished in the near future.
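For reference, here is a minimal pure-ops sketch of the recurrence (a diagonal-plus-low-rank form of the delta rule; the tensor names, shapes, and exact recurrence here are my assumptions for illustration and may differ from the real operator). The per-timestep Python loop is exactly why this is slow without a fused kernel:

```python
from keras import ops

def generalized_delta_rule_ref(q, k, v, w, a, b):
    """Sequential pure-ops reference (slow). All inputs: (batch, seq, dim).

    Assumed recurrence (one common DPLR form; RWKV7 details may differ):
        S_t = diag(w_t) @ S_{t-1} + a_t (b_t^T S_{t-1}) + k_t v_t^T
        o_t = q_t^T S_t
    """
    batch = ops.shape(q)[0]
    dim = ops.shape(q)[2]
    seq_len = q.shape[1]  # assumes a static sequence length
    state = ops.zeros((batch, dim, dim))  # (batch, key_dim, value_dim)
    outputs = []
    for t in range(seq_len):  # per-step Python loop: the bottleneck
        qt, kt, vt = q[:, t], k[:, t], v[:, t]
        wt, at, bt = w[:, t], a[:, t], b[:, t]
        # decay on the key dim + low-rank correction + rank-1 write
        state = (
            wt[:, :, None] * state
            + at[:, :, None] * ops.einsum("bk,bkv->bv", bt, state)[:, None, :]
            + kt[:, :, None] * vt[:, None, :]
        )
        outputs.append(ops.einsum("bk,bkv->bv", qt, state))
    return ops.stack(outputs, axis=1)  # (batch, seq, dim)
```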
However, keras_hub does not seem to be the right place to submit this kind of operator. May I submit the generalized_delta_rule operator to keras.ops, with the Torch and JAX backends implemented in Triton on GPU, and the remaining cases falling back to Keras ops?
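Concretely, the dispatch I have in mind would look something like this (a sketch only; `_triton_torch_impl` and `_jax_triton_impl` are hypothetical kernel wrappers, not existing Keras internals):

```python
import keras

def generalized_delta_rule(q, k, v, w, a, b):
    backend = keras.backend.backend()
    if backend == "torch":
        import torch
        if torch.cuda.is_available():
            return _triton_torch_impl(q, k, v, w, a, b)  # hypothetical Triton kernel
    elif backend == "jax":
        import jax
        if any(d.platform == "gpu" for d in jax.devices()):
            return _jax_triton_impl(q, k, v, w, a, b)  # hypothetical jax-triton kernel
    # CPU, TensorFlow, etc.: fall back to the pure-ops reference above.
    return generalized_delta_rule_ref(q, k, v, w, a, b)
```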
If adding a jax-triton dependency for the JAX backend seems inappropriate, we could add the Triton implementation only for the Torch backend. In recent versions of Torch, Triton is already a base dependency, so using Triton operators in the Torch backend would add almost no extra burden.
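As a sanity check, detecting Triton in a Torch install is just a guarded import (recent CUDA builds of PyTorch on Linux ship Triton as a dependency), so there is no hard requirement on it:

```python
# If Triton is absent, we simply keep the pure-ops path.
try:
    import triton  # noqa: F401
    HAS_TRITON = True
except ImportError:
    HAS_TRITON = False
```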
I will then contribute the other parts, such as the models and layers, to keras_hub.
There doesn't seem to be a lot of traction behind this feature, and we also suspect the JAX implementation of this custom operation would require some fairly specialized knowledge.
We encourage you to implement these changes in your own library, but don't believe it would be a good fit into Keras at this time.
The main question is how the kernel will be handled: can keras_hub accept a kernel submission?
If not, then perhaps I can only use it in my own library.