diff --git a/.spellcheck-en-custom.txt b/.spellcheck-en-custom.txt index 385f076..c28ac75 100644 --- a/.spellcheck-en-custom.txt +++ b/.spellcheck-en-custom.txt @@ -6,6 +6,7 @@ agentic Akash AMDGPU Anil +architected arge args arXiv @@ -138,6 +139,7 @@ Miniforge MiniLM Mixtral mixtral +modifiability MLX MMLU modularize @@ -242,6 +244,7 @@ tatsu TBD templating Tesla +testability th tl TODO @@ -253,6 +256,7 @@ Triagers triagers UI ui +understandability unquantized unstaged URI diff --git a/docs/adr-chat-module-architecture.md b/docs/adr-chat-module-architecture.md new file mode 100644 index 0000000..ef71d51 --- /dev/null +++ b/docs/adr-chat-module-architecture.md @@ -0,0 +1,31 @@ +# InstructLab Chat Module Architecture + +## Context + +InstructLab recently gained the ability of [retrieval augmented generation (RAG) for chat](https://github.com/instructlab/instructlab/pull/2886). The initial implementation is the simplest possible, inserting retrieved document chunks in the chat session history before submitting the user query to the target model. This is done by adding a retriever instance into the chat class which is called by submitting the user query as-is to the vector store as a search query to retrieve those chunks. + +Tuning retrieval systems requires significant expertise and testing, and there are many techniques that may be needed specifically and only for that use case. For example, query reformulation or agentic approaches may be necessary to achieve reliable retrieval quality while being redundant and expensive in a non-retrieval scenario. + +Having a single code path that conditionally executes various strategies based on a mixture of configuration, application state, and user input can quickly become difficult to maintain, debug, and even understand. That chat module in InstructLab would benefit from a principled approach; that is, to be purposefully architected before becoming a [big ball of mud](http://www.laputan.org/mud/?ref=blog.codinghorror.com). + +Prompt template management, model interaction pattern, chat session history, and any interactions with third party systems can vary by use case and so should be logically grouped for purposes of at least maintainability, understandability, modifiability, and testability. + +## Decision + +The InstructLab chat module will adopt a [strategy pattern](https://en.wikipedia.org/wiki/Strategy_pattern). + +## Status + +Accepted + +## Consequences + +* A refactor to the chat module will be necessary before additional feature work. +* Encapsulation of different sets of logic as distinct pattern should decrease the risk of regression in one area when implementing another. +* Division of responsibility will be clear from the code and project structure. +* Testing different chat strategies will become simpler, with a smaller total testing surface. +* Development velocity of RAG-specific improvements should be higher. +* Code review will become simpler. +* Risk of unsustainable maintenance overhead should be decreased. +* As blocks of logic common to multiple strategies emerge over time, there will be a natural path to a pipeline approach in the future. +* Configuration might become more complex; care should be taken in designing it to be flexible and to avoid one-way doors.