This is the official repository for the paper "Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics". We formulate the online-learning update of the delta rule as a continuous-time dynamical system and prove that its exact solution is not only attainable but also computable in linear time with full parallelism. By leveraging the rank-1 structure of the dynamics matrix, we directly derive the exact closed-form solution, which effectively corresponds to an infinite-order Runge–Kutta method.
Authors: Jingdi Lei, Di Zhang, Soujanya Poria
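As a rough illustration of the idea in the description above, the following NumPy sketch contrasts a single discrete delta-rule update with the closed-form solution of the corresponding continuous-time system over one step. It is not taken from this repository's code: the state layout (S of shape d_k x d_v), the ODE form dS/dt = -k kᵀ S + k vᵀ, and the function names `delta_rule_step` and `exact_step` are assumptions made for illustration. The closed form relies on the rank-1 identity exp(-β k kᵀ) = I - ((1 - e^{-βλ}) / λ) k kᵀ with λ = kᵀ k.

```python
import numpy as np

def delta_rule_step(S, k, v, beta):
    """One discrete delta-rule update, i.e. an explicit Euler step of size beta.

    S: (d_k, d_v) state, k: (d_k,) key, v: (d_v,) value (assumed layout).
    """
    return S - beta * np.outer(k, k @ S - v)

def exact_step(S, k, v, beta):
    """Closed-form solution of dS/dt = -k k^T S + k v^T after time beta.

    Because k k^T is rank-1, the matrix exponential collapses to a scalar
    factor alpha = (1 - exp(-beta * lam)) / lam, where lam = k^T k.
    """
    lam = k @ k
    # expm1 keeps the expression accurate for small beta * lam;
    # alpha tends to beta as lam tends to 0.
    alpha = -np.expm1(-beta * lam) / lam if lam > 1e-12 else beta
    return S - alpha * np.outer(k, k @ S - v)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S0 = rng.normal(size=(4, 3))
    k = rng.normal(size=4)
    v = rng.normal(size=3)
    # The two updates differ only in the scalar step factor (beta vs. alpha).
    print(np.abs(exact_step(S0, k, v, 0.5) - delta_rule_step(S0, k, v, 0.5)).max())
```

Under these assumptions the exact update costs the same as the discrete one: only the scalar step factor changes, so no matrix exponential is ever formed explicitly.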
We release the code for the sMNIST experiments:
- Train and evaluate DeltaNet and EFLA:
python mnist.py
- We sincerely thank flash-linear-attention for its high-quality training framework.
- We also extend our heartfelt thanks to lm-evaluation-harness for their evaluation framework.