
[FEATURE]: Recursive Language Model as alternative to Compaction #8455

@tschmidt-code

Description

Feature hasn't been suggested before.

- [x] I have verified this feature I'm about to request hasn't been suggested before.

I believe this is the same as PR #6795, but I saw that was closed without comment. I wonder if it was closed because people didn't have context on what the RLM concept is for, so I thought I would provide more context on what it's for in an intuitive sense. The blog post and paper are pretty accessible and I encourage you to check them out if interested, but here's my summary of how I imagine this applying to CC/OC:

Describe the enhancement you want to request

The RLM approach seems like an improvement on the compaction approach that Claude Code takes for managing context windows.

The paper borrows some of the ideas that CC uses on code and applies them to non-coding, large-context tasks like research. In particular, it takes the concept that CC doesn't read every file in your repo into the context at the beginning, but instead just treats them as files (variables, in the paper's terms) that the agent or sub-agents can explore if they think it will be useful.
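As a rough sketch of that "files as variables" idea (hypothetical names only, not CC's actual internals): the agent starts with handles to the files, and a file's contents only enter the context window when the agent decides that file is worth reading.

```python
from pathlib import Path

# Hypothetical sketch: the agent starts with file *handles*, not file contents.
repo_files = {p.name: p for p in Path(".").glob("**/*.py")}

def peek(name: str, max_chars: int = 2000) -> str:
    """Pull a file's contents into the context only when the agent asks for it."""
    return repo_files[name].read_text()[:max_chars]

# The prompt lists only the handles; contents enter the window on demand.
prompt_header = "Available files: " + ", ".join(sorted(repo_files))
```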

The paper then improves on the compaction approach that CC takes (CC does lossy compression on your context window whenever it runs out of space): instead of compressing, it stores the entire context window in a variable (equivalent to dumping the context to a file in CC instead of compressing it). Sub-LLMs can then explore that context variable (file) however they want, in chunked fashion, rather than working from a highly lossy, compressed version of it.
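Here's a minimal sketch of how I picture that step, with `call_llm` standing in for whatever model call CC/OC actually makes (a hypothetical function, not a real API): the old context lives in a plain string, and a sub-LLM scans it chunk by chunk for material relevant to the current question.

```python
def call_llm(prompt: str) -> str:
    """Stand-in for the underlying model call; not a real CC/OC interface."""
    raise NotImplementedError

def mine_context(old_context: str, question: str, chunk_size: int = 8000) -> list[str]:
    """Have a sub-LLM scan the archived context in chunks and quote relevant bits."""
    notes = []
    for start in range(0, len(old_context), chunk_size):
        chunk = old_context[start:start + chunk_size]
        reply = call_llm(
            f"Question: {question}\n\n"
            f"Transcript chunk:\n{chunk}\n\n"
            "Quote anything relevant to the question, or reply NONE."
        )
        if reply.strip() != "NONE":
            notes.append(reply)
    return notes
```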

Whereas CC currently takes your conversation history and all the files it has seen and summarizes them, the paper suggests just dumping everything to a file for later exploration. If CC wanted to implement this, it would just turn off compaction, dump the context window into a file instead, and have sub-agents first explore that old context for whatever is relevant to the prompt the user gives. It might add some latency to the response, but I suspect it wouldn't be that much.
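Concretely, the swap I'm imagining looks something like this (reusing the hypothetical `call_llm` and `mine_context` from the sketch above; `context_archive.txt` and the function names are made up, not CC's real internals): on overflow, append the window to a file rather than summarizing it, and on the next prompt let a sub-agent mine that file before the main agent answers.

```python
from pathlib import Path

ARCHIVE = Path("context_archive.txt")  # hypothetical dump location

def on_overflow(context_window: str) -> str:
    """Instead of lossy compaction, append the full window to an archive file."""
    with ARCHIVE.open("a") as f:
        f.write(context_window + "\n")
    return ""  # the live window starts fresh

def answer(user_prompt: str, live_context: str) -> str:
    """Before answering, have a sub-agent mine the archive for relevant history."""
    recovered = []
    if ARCHIVE.exists():
        recovered = mine_context(ARCHIVE.read_text(), user_prompt)
    prompt = "\n".join(recovered) + "\n" + live_context + "\n" + user_prompt
    return call_llm(prompt)
```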

The paper reports that this works with off-the-shelf models, but that they're not particularly efficient and do goofy things in their exploration; it suggests that Anthropic could spend time RL-ing the model to do this context mining more efficiently.

It seems to me that this could enable "never reset your context window" type user experiences with minimal cost increase, which is to say, basically unbounded memory. CC would be able to look at work you did 12 months ago and remind you why you did it at the time, for example. Or it would be able to perform long-running tasks without a custom harness (which was the approach that Anthropic took and described in a blog post here).

Labels: discussion (used for feature requests, proposals, ideas, etc.)
