Updated the Technical Note for WY of DPLR #562
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

The README's "Efficient Chunkwise Implementation" now includes a self-contained derivation of WY representations for DPLR (with diagonal D_t): definitions (Γ_i^t, w_i, u_i), base cases, induction steps, and final WY expressions; contact line added. No code, API, or runtime changes.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Summary of Changes
Hello @phnazari, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request updates a technical note related to the WY representation of products of DPLR matrices. The update addresses minor theoretical analysis mistakes, such as index-mismatches and mix-ups in matrix multiplications, within the note itself. It's important to note that these corrections are specific to the theoretical document and do not imply any issues with the existing implementation, which is believed to be correct.
Highlights
- Documentation Update: The primary change involves updating the link to the technical note in the `README.md` file, pointing to a revised version that corrects theoretical analysis errors.
Code Review
This pull request updates the link to a technical note. The change itself is correct. My review includes a suggestion to improve the long-term maintainability of the project's documentation by hosting it within the repository itself, rather than relying on an external Google Drive link which may become inaccessible in the future.
Diff context: `README.md`, section "## Efficient Chunkwise Implementation".

`-For detailed information about efficient chunkwise implementation, please refer to our [technical note](https://drive.google.com/file/d/1rJbO3dU4fe7OKG3w7Yg058z_BNIuavNF/view?usp=sharing).`
`+For detailed information about efficient chunkwise implementation, please refer to our [technical note](https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view?usp=sharing).`
Hosting technical documentation on Google Drive can be fragile for a public project. Links can break, access permissions might change, or the file could be deleted, making it inaccessible to future users and contributors.
For better long-term stability and to keep documentation versioned alongside the code, consider committing the technical note directly into the repository, for instance, within a docs/ directory.
@phnazari Thank you for your contribution. Would you mind including the original author’s name and a link to the source in your PDF? Please also describe any additional contributions you have made. If you could provide the original Markdown file or any supplementary derivations, I would be immensely grateful.
Actionable comments posted: 3
🔭 Outside diff range comments (1)
fla/ops/generalized_delta_rule/README.md (1)
1-1: Remove trailing whitespace in documentation to fix CI

There is a trailing space on line 74 of `fla/ops/generalized_delta_rule/README.md` causing the pre-commit hook to fail:
- `fla/ops/generalized_delta_rule/README.md:74`

Suggested change (the removed line ends with a trailing space):
`- where we used $\mathbf \Gamma_{t+1}^t = \mathbf I$ in the last step.`
`+ where we used $\mathbf \Gamma_{t+1}^t = \mathbf I$ in the last step.`

After applying this change, run `pre-commit install` and then `pre-commit run --all-files` to update your commit and restore a passing CI build.
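To locate the offending line before re-running the hooks, a check along the following lines can help. This is a minimal illustrative sketch, not the implementation of the `trailing-whitespace` pre-commit hook; the function name `lines_with_trailing_whitespace` and the sample text are hypothetical.

```python
def lines_with_trailing_whitespace(text):
    """Return 1-based line numbers whose content ends in a space or tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


# Hypothetical sample: the first line ends with a space, the second does not.
sample = "where we used the identity in the last step. \nno trailing space here\n"
print(lines_with_trailing_whitespace(sample))
```

In practice the text would come from reading the README file; the pre-commit hook remains the authoritative check for CI.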
♻️ Duplicate comments (1)
fla/ops/generalized_delta_rule/README.md (1)
35-39: Self-contained derivation is a big improvement; consider fully migrating away from external Drive dependencies.

This change already mitigates the previous fragility concern. To fully close the loop, commit any original note/derivations to `docs/` so everything is versioned with the code.
🧹 Nitpick comments (6)
fla/ops/generalized_delta_rule/README.md (6)
63-64: Use consistent boldface for matrices in the base case.

D_1 should be bold to match notation elsewhere.
`-We proceed by induction. The base case is quickly established for $t=1$, considering that $\mathbf \Gamma_1^1 = D_1$ and $\mathbf \Gamma_2^1 = \mathbf I$.`
`+We proceed by induction. The base case is quickly established for $t=1$, considering that $\mathbf \Gamma_1^1 = \mathbf D_1$ and $\mathbf \Gamma_2^1 = \mathbf I$.`

43-44: Correct matrix dimension notation.

Use d×d rather than d, d.
`-for vectors $\mathbf a_t, \mathbf b_t, \mathbf v_t, \mathbf k_t \in \mathbb R^d$ and matrices $\mathbf D_t \in \mathbb R^{d, d}$.`
`+for vectors $\mathbf a_t, \mathbf b_t, \mathbf v_t, \mathbf k_t \in \mathbb R^d$ and matrices $\mathbf D_t \in \mathbb R^{d \times d}$.`

50-53: Avoid nesting equation environments inside display math.

Using `\begin{equation*}` inside `$$…$$` is redundant and can break rendering. Keep the `$$` block only.
`-\begin{equation*}`
` \mathbf P_t = \mathbf \Gamma_1^t + \left( \sum_{i=1}^t \mathbf w_i \mathbf b_i^\top \mathbf \Gamma_{i+1}^{t} \right)`
`-\end{equation*}`

36-36: Style nit: “re-do” → “redo”.

Minor wording cleanup.
`-The original [technical note](https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view?usp=sharing) on chunking DPLR contains minor mathematical inconsistencies. Below, we re-do the computations.`
`+The original [technical note](https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view?usp=sharing) on chunking DPLR contains minor mathematical inconsistencies. Below, we redo the computations.`

36-46: Add attribution and references per maintainer request.

Maintainer requested: include original author’s name and source, list your additional contributions, and provide original Markdown/supplementary derivations. Suggest adding a short section at the end of this subsection.

Proposed insertion after Line 99:
`+### References and Attribution`
`+`
`+- Original technical note: <ADD AUTHOR NAME(S)>, “<ADD TITLE>,” <ADD YEAR>. Link: https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view`
`+- Additional contributions in this PR (by <ADD YOUR NAME>): corrected index mismatches, fixed matrix multiplication order, and provided a self-contained WY derivation for DPLR with diagonal $\mathbf D_t$.`
`+- Supplementary materials: please include the original Markdown or a derivation appendix under docs/ (e.g., docs/dplr_wy_derivation.md) for versioned, in-repo access.`

If you provide the author details and preferred filenames, I can prepare a follow-up commit.

66-73: Optional: standardize math environments throughout.

You mix fenced `math` blocks (earlier) and `$$…$$` (here), sometimes wrapping LaTeX environments (`align*`) inside `$$…$$`. Standardize to a single approach to improve rendering across viewers (e.g., use fenced `math` consistently and avoid nesting environments inside `$$`).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- `fla/ops/generalized_delta_rule/README.md` (1 hunk)
🧰 Additional context used
🪛 LanguageTool
fla/ops/generalized_delta_rule/README.md
[grammar] ~39-~39: There might be a mistake here.
Context: ...iciently compute the DPLR representation $$ \mathbf S_t = \mathbf S_{t-1} \le...
(QB_NEW_EN)
[grammar] ~40-~40: There might be a mistake here.
Context: ...ently compute the DPLR representation $$ \mathbf S_t = \mathbf S_{t-1} \left( \ma...
(QB_NEW_EN)
[grammar] ~41-~41: There might be a mistake here.
Context: ...p \right) + \mathbf v_t \mathbf k_t^\top $$ for vectors $\mathbf a_t, \mathbf b_t...
(QB_NEW_EN)
[grammar] ~42-~42: There might be a mistake here.
Context: ...right) + \mathbf v_t \mathbf k_t^\top $$ for vectors $\mathbf a_t, \mathbf b_t, ...
(QB_NEW_EN)
[grammar] ~59-~59: There might be a mistake here.
Context: ...\mathbf a_i, & i \geq 2. \end{cases} $$ where we define $\mathbf \Gamma_m^{n}...
(QB_NEW_EN)
[grammar] ~60-~60: There might be a mistake here.
Context: ...thbf a_i, & i \geq 2. \end{cases} $$ where we define $\mathbf \Gamma_m^{n} \c...
(QB_NEW_EN)
[grammar] ~86-~86: There might be a mistake here.
Context: ...\mathbf a_i, & i \geq 2. \end{cases} $$ We again show this claim by induction...
(QB_NEW_EN)
[grammar] ~87-~87: There might be a mistake here.
Context: ...thbf a_i, & i \geq 2. \end{cases} $$ We again show this claim by induction. T...
(QB_NEW_EN)
🪛 GitHub Actions: lint
fla/ops/generalized_delta_rule/README.md
[error] 1-1: Trailing whitespace detected by pre-commit hook 'trailing-whitespace' during 'pre-commit run --files fla/ops/generalized_delta_rule/README.md'. The file was auto-fixed; please re-run pre-commit to confirm all checks pass.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test H100 (PyTorch 2.7) / test-models
- GitHub Check: check-pt-python-compatibility
🔇 Additional comments (1)
fla/ops/generalized_delta_rule/README.md (1)
47-61: Mathematical content LGTM.

Beyond the noted nits and index fixes, the WY forms for P_t and S_t are clear and align with the intended DPLR recurrence, including base cases and induction structure.
Actionable comments posted: 0
♻️ Duplicate comments (3)
fla/ops/generalized_delta_rule/README.md (3)
36-39: Add provenance/attribution + avoid bare email; consider de-relying on Google Drive.
- Add original author attribution, link to source, and a short note of your contributions as requested in review.
- Replace bare email with a mailto link to satisfy markdownlint MD034.
- Long-term, avoid external Drive links for critical docs; mirror or move into repo under docs/. (Echoing earlier feedback.)

Proposed edits:
`-If you have questions about or comments on the below derivations, feel free to reach out: [email protected].`
`+If you have questions about or comments on the below derivations, feel free to reach out: [[email protected]](mailto:[email protected]).`

Optionally insert an attribution block (fill in names/links):
`+#### Provenance and acknowledgments`
`+`
`+This section builds on an earlier technical note by <ORIGINAL AUTHOR(S)>, available at`
`+[link](https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view?usp=sharing).`
`+In this PR, we:`
`+- Correct minor inconsistencies (indexing, transpose, and grouping).`
`+- Integrate a self-contained derivation in-repo.`
`+- Clarify base cases and induction steps.`
`+`
`+If permissible, consider mirroring the original note (PDF/Markdown) under docs/ for archival and versioning.`

Would you like me to open a follow-up PR to add a docs/ note and populate the attribution from your sources?
70-76: Indices and identity boundary case are now correct.

The induction step for $\mathbf P_{t+1}$ uses $\mathbf \Gamma_{i+1}^{t+1}$ and the boundary $\mathbf \Gamma_{t+2}^{t+1}=\mathbf I$, addressing the prior mismatch.

78-101: Derivation for S_t looks correct; fixes to base case, transpose, and grouping are in place.
- Base case uses $\mathbf \Gamma_2^1=\mathbf I$.
- Transpose on $\mathbf k_{t+1}$ present.
- Parentheses around $(\mathbf v_i \mathbf k_i^\top + \mathbf u_i \mathbf b_i^\top)$ are balanced.
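For readers following the thread without the README open, the induction step these checks refer to can be written out as below. This is a reconstruction from the DPLR recurrence and the fragments quoted in the review, not a copy of the README. It assumes $\mathbf S_0 = \mathbf 0$ (consistent with the base case $\mathbf u_1 = \mathbf 0$) and the convention $\mathbf \Gamma_m^n = \mathbf I$ for $m > n$. The README's exact wording may differ.

```math
\begin{aligned}
\mathbf S_{t+1}
&= \mathbf S_t \left( \mathbf D_{t+1} + \mathbf a_{t+1} \mathbf b_{t+1}^\top \right) + \mathbf v_{t+1} \mathbf k_{t+1}^\top \\
&= \sum_{i=1}^{t} \left( \mathbf v_i \mathbf k_i^\top + \mathbf u_i \mathbf b_i^\top \right) \mathbf \Gamma_{i+1}^{t} \mathbf D_{t+1}
 + \mathbf S_t \mathbf a_{t+1} \mathbf b_{t+1}^\top + \mathbf v_{t+1} \mathbf k_{t+1}^\top \\
&= \sum_{i=1}^{t+1} \left( \mathbf v_i \mathbf k_i^\top + \mathbf u_i \mathbf b_i^\top \right) \mathbf \Gamma_{i+1}^{t+1},
\qquad
\mathbf u_{t+1} \coloneqq \mathbf S_t \mathbf a_{t+1}
= \sum_{i=1}^{t} \left( \mathbf v_i \mathbf k_i^\top + \mathbf u_i \mathbf b_i^\top \right) \mathbf \Gamma_{i+1}^{t} \mathbf a_{t+1}.
\end{aligned}
```

The last line uses $\mathbf \Gamma_{i+1}^{t} \mathbf D_{t+1} = \mathbf \Gamma_{i+1}^{t+1}$ and $\mathbf \Gamma_{t+2}^{t+1} = \mathbf I$, which is where the three bullet points above (base case, transpose, grouping) enter.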
🧹 Nitpick comments (5)
fla/ops/generalized_delta_rule/README.md (5)
5-7: Unify transpose notation to use \top consistently.

Current sections use both T and \top. Prefer \top throughout for consistency with later derivations.
`-\mathbf{S}_t = \mathbf{S}_{t-1}(\mathbf{I}-\beta_t \mathbf{k}_t\mathbf{k}_t^T) + \beta_t \mathbf{v}_t\mathbf{k}_t^T`
`+\mathbf{S}_t = \mathbf{S}_{t-1}(\mathbf{I}-\beta_t \mathbf{k}_t\mathbf{k}_t^\top) + \beta_t \mathbf{v}_t\mathbf{k}_t^\top`
`-\mathbf{S}_t = \mathbf{S}_{t-1}(\mathbf{I}+\mathbf{a}_t\mathbf{b}_t^T) + \mathbf{v}_t\mathbf{k}_t^T`
`+\mathbf{S}_t = \mathbf{S}_{t-1}(\mathbf{I}+\mathbf{a}_t\mathbf{b}_t^\top) + \mathbf{v}_t\mathbf{k}_t^\top`
`-\mathbf{S}_t = \mathbf{S}_{t-1}(\mathbf{D}_t+\mathbf{a}_t\mathbf{b}_t^T) + \mathbf{v}_t\mathbf{k}_t^T`
`+\mathbf{S}_t = \mathbf{S}_{t-1}(\mathbf{D}_t+\mathbf{a}_t\mathbf{b}_t^\top) + \mathbf{v}_t\mathbf{k}_t^\top`

Also applies to: 15-17, 29-31

41-46: Typo/notation: use d × d instead of d, d for matrix shape.

Minor LaTeX/notation nit.
`-for vectors $\mathbf a_t, \mathbf b_t, \mathbf v_t, \mathbf k_t \in \mathbb R^d$ and matrices $\mathbf D_t \in \mathbb R^{d, d}$.`
`+for vectors $\mathbf a_t, \mathbf b_t, \mathbf v_t, \mathbf k_t \in \mathbb R^d$ and matrices $\mathbf D_t \in \mathbb R^{d \times d}$.`

49-56: Define P_t explicitly before giving its WY form.

Add the product definition to make the section self-contained.
`-### $WY$ Representation for $P_t$`
`+### $WY$ Representation for $P_t$`
`+Let $\displaystyle \mathbf P_t \coloneqq \prod_{i=1}^t \left(\mathbf D_i + \mathbf a_i \mathbf b_i^\top\right)$. Let $\mathbf \Gamma_i^t \coloneqq \prod_{j=i}^t \mathbf D_j$. Then`

65-66: Notation consistency: boldface D_1.

Keep symbols bold throughout.
`-We proceed by induction. The base case is quickly established for $t=1$, considering that $\mathbf \Gamma_1^1 = D_1$ and $\mathbf \Gamma_2^1 = \mathbf I$.`
`+We proceed by induction. The base case is quickly established for $t=1$, considering that $\mathbf \Gamma_1^1 = \mathbf D_1$ and $\mathbf \Gamma_2^1 = \mathbf I$.`

9-10: Wording nit: clarify “I is not necessarily an identity matrix.”

“I” by definition denotes the identity. Suggest rephrase to “the transition matrix is not necessarily the identity; …”
`-This repository implements a delta rule variant where $\mathbf{I}$ is not necessarily an identity matrix; $\mathbf{k}_t$ in $\mathbf{I} - \beta_t \mathbf{k}_t\mathbf{k}_t^T$ might be different from input $\mathbf{k}_t$ in $\mathbf{v}_t\mathbf{k}_t^T$.`
`+This repository implements a delta rule variant where the transition matrix is not necessarily the identity; $\mathbf{k}_t$ in $\mathbf{I} - \beta_t \mathbf{k}_t\mathbf{k}_t^\top$ might be different from the input $\mathbf{k}_t$ in $\mathbf{v}_t\mathbf{k}_t^\top$.`
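With $\mathbf P_t$ and $\mathbf \Gamma_i^t$ defined as in the suggestion above, the WY form and its induction step can be stated compactly. The block below is a reconstruction from the quoted fragments, assuming $\mathbf \Gamma_m^n = \mathbf I$ for $m > n$; in particular, the recursion for $\mathbf w_i$ is inferred from the induction step and should be checked against the README.

```math
\begin{aligned}
\mathbf P_t &= \mathbf \Gamma_1^t + \sum_{i=1}^t \mathbf w_i \mathbf b_i^\top \mathbf \Gamma_{i+1}^{t},
\qquad
\mathbf w_i =
\begin{cases}
\mathbf a_1, & i = 1, \\
\mathbf \Gamma_1^{i-1} \mathbf a_i + \sum_{j=1}^{i-1} \mathbf w_j \left( \mathbf b_j^\top \mathbf \Gamma_{j+1}^{i-1} \mathbf a_i \right), & i \geq 2,
\end{cases} \\
\mathbf P_{t+1} &= \mathbf P_t \left( \mathbf D_{t+1} + \mathbf a_{t+1} \mathbf b_{t+1}^\top \right)
= \mathbf \Gamma_1^{t+1} + \sum_{i=1}^{t} \mathbf w_i \mathbf b_i^\top \mathbf \Gamma_{i+1}^{t+1}
+ \underbrace{\mathbf P_t \mathbf a_{t+1}}_{\mathbf w_{t+1}} \mathbf b_{t+1}^\top \mathbf \Gamma_{t+2}^{t+1}.
\end{aligned}
```

Reading off $\mathbf w_{t+1} = \mathbf P_t \mathbf a_{t+1}$ and substituting the WY form of $\mathbf P_t$ gives the case $i \geq 2$.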
🪛 markdownlint-cli2 (0.17.2)
fla/ops/generalized_delta_rule/README.md
38-38: Bare URL used
(MD034, no-bare-urls)
🪛 GitHub Actions: lint
fla/ops/generalized_delta_rule/README.md
[error] 1-1: Trailing whitespace detected by pre-commit hook 'trailing-whitespace'; the file was updated to fix trailing spaces. Please review and commit the changes.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: check-pt-python-compatibility
🔇 Additional comments (1)
fla/ops/generalized_delta_rule/README.md (1)
1-1: Note on CI: trailing whitespace was auto-fixed.

Pre-commit updated the file to remove trailing spaces. Make sure to pull/merge those changes so CI stays green.
I have now updated my pull request to include the computations in the README.md instead of the link to the Drive file (which is not really future-proof). It also contains a link to the original derivations by Songlin. I believe the mistakes in the previous derivations are:
Please let me know if you find any mistakes in my derivations though, I would be happy to fix them! :)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
fla/ops/generalized_delta_rule/README.md (1)
1-101: Remove trailing whitespace on line 76 of README.md.

Line 76 contains trailing whitespace after "the last step. " that must be removed to pass the pre-commit hook.
🧹 Nitpick comments (1)
fla/ops/generalized_delta_rule/README.md (1)
36-36: Attribution and linter flag.

Line 36 provides a hyperlink to the original technical note, which addresses the need for sourcing. However, the PR review requested explicit mention of the original author's name; consider adding inline text (e.g., "original technical note by Songlin") for clarity.

Line 38's email address triggers a false-positive linter warning (markdownlint treats it as a bare URL). This is a known limitation of the linter for email addresses; no code change is strictly necessary, but you may optionally wrap it in brackets or restructure the sentence to suppress the warning.

Apply this optional diff to add explicit author attribution and suppress the linter warning:
`-The original [technical note](https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view?usp=sharing) on chunking DPLR contains minor mathematical inconsistencies. Below, we re-do the computations.`
`+The original [technical note](https://drive.google.com/file/d/1qqc6THTRc2bw-LtwsbGNxNDw00sNzi5M/view?usp=sharing) by Songlin on chunking DPLR contains minor mathematical inconsistencies. Below, we re-do the computations.`
`-If you have questions about or comments on the below derivations, feel free to reach out: [email protected].`
`+If you have questions about or comments on the below derivations, please reach out to [[email protected]](mailto:[email protected]).`

Also applies to: 38-38
🔇 Additional comments (2)
fla/ops/generalized_delta_rule/README.md (2)
36-76: Mathematical correctness of P_t derivation is solid.

The induction proof correctly carries the fixed indices throughout:
- Base case at line 65: Γ₂¹ = I (not Γ₂⁰) ✓
- Induction result at line 73: Γᵢ₊₁^(t+1) (not Γᵢᵗ) ✓
- Identity at line 76: Γ_(t+2)^(t+1) = I (not Γ_(t+1)ᵗ) ✓
This addresses all index and commutativity corrections from prior reviews.
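The index conventions confirmed above can also be checked numerically. The sketch below compares the direct product $\prod_i (\mathbf D_i + \mathbf a_i \mathbf b_i^\top)$ with the WY form $\mathbf \Gamma_1^t + \sum_i \mathbf w_i \mathbf b_i^\top \mathbf \Gamma_{i+1}^t$ on small random integer inputs, so the comparison is exact. It is not code from the repository: all function names are hypothetical, and the recursion used for $\mathbf w_i$ (namely $\mathbf w_i = \mathbf P_{i-1} \mathbf a_i$, expanded through the WY form) is an assumption inferred from the induction step.

```python
import random


def identity(d):
    return [[int(i == j) for j in range(d)] for i in range(d)]


def diag(values):
    return [[x if i == j else 0 for j in range(len(values))] for i, x in enumerate(values)]


def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def outer(x, y):
    return [[xi * yj for yj in y] for xi in x]


def matvec(A, x):
    return [sum(p * q for p, q in zip(row, x)) for row in A]


def vecmat(x, A):
    return [sum(p * q for p, q in zip(x, col)) for col in zip(*A)]


def gamma(D, m, n):
    """Gamma_m^n = D_m D_{m+1} ... D_n (1-based indices); identity when m > n."""
    out = identity(len(D[0]))
    for j in range(m, n + 1):
        out = matmul(out, D[j - 1])
    return out


def direct_product(D, a, b, t):
    """P_t computed factor by factor from its definition."""
    P = identity(len(D[0]))
    for i in range(t):
        P = matmul(P, add(D[i], outer(a[i], b[i])))
    return P


def wy_product(D, a, b, t):
    """P_t computed from the WY form (assumed recursion for w_i)."""
    w = []
    for i in range(1, t + 1):
        # w_i = Gamma_1^{i-1} a_i + sum_{j<i} w_j (b_j^T Gamma_{j+1}^{i-1} a_i)
        w_i = matvec(gamma(D, 1, i - 1), a[i - 1])
        for j in range(1, i):
            g_a = matvec(gamma(D, j + 1, i - 1), a[i - 1])
            coeff = sum(p * q for p, q in zip(b[j - 1], g_a))
            w_i = [x + coeff * y for x, y in zip(w_i, w[j - 1])]
        w.append(w_i)
    P = gamma(D, 1, t)
    for i in range(1, t + 1):
        # Add the rank-one term w_i (b_i^T Gamma_{i+1}^t).
        P = add(P, outer(w[i - 1], vecmat(b[i - 1], gamma(D, i + 1, t))))
    return P


rng = random.Random(0)
d, T = 3, 5
D = [diag([rng.randint(-2, 2) for _ in range(d)]) for _ in range(T)]
a = [[rng.randint(-2, 2) for _ in range(d)] for _ in range(T)]
b = [[rng.randint(-2, 2) for _ in range(d)] for _ in range(T)]
print(all(direct_product(D, a, b, t) == wy_product(D, a, b, t) for t in range(1, T + 1)))
```

The helper `gamma` multiplies left to right, so the check does not rely on the diagonal matrices commuting; the diagonal choice only mirrors the setting of the README.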
78-101: WY representation for S_t is correct.

The derivation correctly includes:
- Proper transposes on k_i and k_(t+1) throughout (lines 95, 97) ✓
- Balanced parentheses around (v_i k_i^⊤ + u_i b_i^⊤) before multiplying by Γ and a (line 97) ✓
- Correct base case at line 90: u₁ = 0 and Γ₂¹ = I ✓
- Consistent use of Γ_(i+1)^(t+1) indices ✓
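The same kind of exact check applies to the state recurrence. The sketch below runs $\mathbf S_t = \mathbf S_{t-1}(\mathbf D_t + \mathbf a_t \mathbf b_t^\top) + \mathbf v_t \mathbf k_t^\top$ step by step and compares it with $\sum_i (\mathbf v_i \mathbf k_i^\top + \mathbf u_i \mathbf b_i^\top)\mathbf \Gamma_{i+1}^t$. It assumes $\mathbf S_0 = \mathbf 0$ (matching the base case $\mathbf u_1 = \mathbf 0$) and $\mathbf u_i = \mathbf S_{i-1} \mathbf a_i$; the function names are hypothetical and nothing here is taken from the repository's kernels.

```python
import random


def zeros(d):
    return [[0] * d for _ in range(d)]


def identity(d):
    return [[int(i == j) for j in range(d)] for i in range(d)]


def diag(values):
    return [[x if i == j else 0 for j in range(len(values))] for i, x in enumerate(values)]


def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def outer(x, y):
    return [[xi * yj for yj in y] for xi in x]


def matvec(A, x):
    return [sum(p * q for p, q in zip(row, x)) for row in A]


def gamma(D, m, n):
    """Gamma_m^n = D_m ... D_n (1-based indices); identity when m > n."""
    out = identity(len(D[0]))
    for j in range(m, n + 1):
        out = matmul(out, D[j - 1])
    return out


def recurrent_state(D, a, b, v, k, t):
    """S_t from the step-by-step recurrence, starting at S_0 = 0 (assumed)."""
    S = zeros(len(a[0]))
    for i in range(t):
        S = add(matmul(S, add(D[i], outer(a[i], b[i]))), outer(v[i], k[i]))
    return S


def wy_state(D, a, b, v, k, t):
    """S_t from the WY form, with u_i = S_{i-1} a_i expanded through the same form."""
    d = len(a[0])
    u = []
    for i in range(1, t + 1):
        u_i = [0] * d  # u_1 = 0 because the sum over j < 1 is empty
        for j in range(1, i):
            M = add(outer(v[j - 1], k[j - 1]), outer(u[j - 1], b[j - 1]))
            term = matvec(matmul(M, gamma(D, j + 1, i - 1)), a[i - 1])
            u_i = [x + y for x, y in zip(u_i, term)]
        u.append(u_i)
    S = zeros(d)
    for i in range(1, t + 1):
        M = add(outer(v[i - 1], k[i - 1]), outer(u[i - 1], b[i - 1]))
        S = add(S, matmul(M, gamma(D, i + 1, t)))
    return S


rng = random.Random(1)
d, T = 3, 5
D = [diag([rng.randint(-2, 2) for _ in range(d)]) for _ in range(T)]
a, b, v, k = ([[rng.randint(-2, 2) for _ in range(d)] for _ in range(T)] for _ in range(4))
print(all(recurrent_state(D, a, b, v, k, t) == wy_state(D, a, b, v, k, t) for t in range(T + 1)))
```

Integer inputs keep the comparison free of floating-point tolerance; a check against the actual chunkwise implementation would need its real tensor layouts.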
Actionable comments posted: 2
🧰 Additional context used
🪛 GitHub Actions: lint
fla/ops/generalized_delta_rule/README.md
[error] 1-1: Pre-commit hook 'trailing-whitespace' failed. The hook modified the file to remove trailing whitespace.
[error] 1-1: Pre-commit hook 'end-of-file-fixer' failed. The hook modified the file to ensure proper end-of-file newline.
🔇 Additional comments (1)
fla/ops/generalized_delta_rule/README.md (1)
49-76: Mathematical derivations for P_t appear consistent with prior review corrections.

The WY representation for P_t, including the base case (line 65) and induction step (lines 68–76), correctly incorporates the fixes flagged in previous reviews:
- Line 73 correctly uses Γ_{i+1}^{t+1} (not Γ_i^t)
- Line 76 correctly references Γ_{t+2}^{t+1} = I (not Γ_{t+1}^t)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
fla/ops/generalized_delta_rule/README.md (1)
1-101: Run pre-commit locally to fix trailing whitespace and end-of-file issues.

The pipeline reported trailing whitespace and end-of-file-fixer failures. These are formatting issues flagged by the pre-commit hooks. Run the pre-commit checks locally with `pre-commit run --all-files` to auto-fix whitespace and ensure proper file termination, then commit the corrections.
🔇 Additional comments (2)
fla/ops/generalized_delta_rule/README.md (2)
49-76: WY representation for P_t is mathematically sound.

The derivation correctly establishes the base case (t=1) and induction step, with consistent indexing of Γ_i^t factors. The final result and use of the boundary condition Γ_{t+2}^{t+1} = I are properly justified.

78-101: WY representation for S_t is mathematically sound and well-formatted.

The derivation correctly establishes the base case, defines u_i coefficients, and derives the induction step with proper transpose notation (k_{t+1}^\top) and balanced parentheses. The final result and boundary condition are consistent with the P_t section.
Fixed parsing of LaTeX, linting issues and contact info.
Actionable comments posted: 0
🧹 Nitpick comments (1)
fla/ops/generalized_delta_rule/README.md (1)
38-38: Minor wording redundancy: remove repeated preposition.

Line 38 repeats "about": "questions about or comments about" is awkward. Simplify to "questions or comments about."
`-If you have questions about or comments about the below derivations, feel free to [reach out](https://phnazari.github.io).`
`+If you have questions or comments about the below derivations, feel free to [reach out](https://phnazari.github.io).`
🧰 Additional context used
🪛 LanguageTool
fla/ops/generalized_delta_rule/README.md
[style] ~38-~38: Try using a synonym here to strengthen your wording.
Context: ...ations. If you have questions about or comments about the below derivations, feel free ...
(COMMENT_REMARK)
🔇 Additional comments (2)
fla/ops/generalized_delta_rule/README.md (2)
48-100: Mathematical derivations look solid; prior issues resolved.

The WY representation derivations for P_t (lines 48–75) and S_t (lines 77–100) are well-structured with clear base cases, induction steps, and consistent index notation. Spot-checks confirm that prior issues flagged in earlier reviews (index corrections (Γ_{i+1}^{t+1}), transpose on k_{t+1}^\top, and base case references (Γ_2^1)) have been addressed correctly.

The self-contained derivations effectively replace the prior external-link approach and should help maintainability going forward.

36-36: Verify Songlin attribution is clear per prior review request.

Line 36 references "The original technical note" via a Google Drive link. Per the PR objectives, an earlier reviewer (zhiyuan1i) requested that the contributor include the original author's name and link. While the linked document presumably contains Songlin's work, the text does not explicitly name the original author.

Consider clarifying the attribution here or confirming it appears elsewhere in the README (e.g., in a References or Acknowledgments section).
- a4e1a1c to 90a0fea
- 1700f8d to f4082b3
I have updated the technical note for the WY representation of products of DPLR matrices. I believe there were some minor mistakes in the existing theoretical analysis, such as index mismatches and mix-ups in the order of matrix multiplications.
These mistakes do not carry over to the implementation, which may very well be correct!