This repository has been archived by the owner on Feb 3, 2025. It is now read-only.

Feature request: Linkify code symbols in the model response #54

Open
iuliaturc opened this issue Oct 4, 2024 · 0 comments
Labels: 🔥 difficulty: 3 (Moderate difficulty), hacktoberfest (Open to Hacktoberfest contributions)

@iuliaturc (Contributor):

Given an LLM response that contains a code symbol (e.g. a class or method name), we should link it to its exact location on GitHub. For instance, if you're chatting with Hugging Face's Transformers library:

  • Original response: "To define a BERT model using the Hugging Face Transformers library, you can use the BertModel class."

  • Link-ified response: "To define a BERT model using the Hugging Face Transformers library, you can use the BertModel class.", where BertModel is a hyperlink to the class definition on GitHub.

This should be done as a post-processing step in chat.py, once the model is done streaming.
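One reason to wait until streaming has finished is that a symbol such as BertModel can be split across two streamed chunks, so it only becomes matchable in the full text. A minimal sketch of that flow follows. The names `collect_and_postprocess`, `on_chunk`, and `postprocess` are hypothetical and not taken from chat.py.

```python
from typing import Callable, Iterable


def collect_and_postprocess(
    chunks: Iterable[str],
    postprocess: Callable[[str], str],
    on_chunk: Callable[[str], None] = print,
) -> str:
    """Forward each streamed chunk as it arrives, then post-process the full text.

    `chunks` stands in for the model's token stream, `on_chunk` for whatever
    displays partial output, and `postprocess` for the link-ification step.
    All three are assumptions made for illustration.
    """
    parts = []
    for chunk in chunks:
        on_chunk(chunk)       # show partial output immediately
        parts.append(chunk)   # keep it so the full response can be rebuilt
    # Only the complete response is passed to the post-processing step.
    return postprocess("".join(parts))
```

The caller would then replace the displayed response with the returned text.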

Work items:

  1. Produce an AST (Abstract Syntax Tree) for every file in the repository. This can be done via the tree_sitter library, which we're already using in chunk.py.
  2. Identify class / method / file names in the model response, potentially using regular expressions.
  3. Look up the strings matched by the regular expressions in the per-file ASTs.
  4. Map the code symbol to a GitHub URL.
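
The work items above could be sketched roughly as follows. To stay self-contained, this sketch swaps in Python's built-in `ast` module for tree_sitter, so it only handles Python sources. The function names, the `base_url` value, and the Markdown link output are assumptions for illustration, not the project's actual design.

```python
import ast
import re
from typing import Dict, Tuple


def index_symbols(files: Dict[str, str]) -> Dict[str, Tuple[str, int]]:
    """Map each class/function name to the (path, line) where it is defined.

    `files` maps a repository-relative path to that file's source code.
    If a name is defined more than once, the first one encountered wins.
    """
    index: Dict[str, Tuple[str, int]] = {}
    for path, source in files.items():
        tree = ast.parse(source)
        for node in ast.walk(tree):
            if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                index.setdefault(node.name, (path, node.lineno))
    return index


def linkify(text: str, index: Dict[str, Tuple[str, int]], base_url: str) -> str:
    """Replace known symbols in `text` with Markdown links to their definitions.

    `base_url` is assumed to look like https://github.com/<owner>/<repo>/blob/<ref>.
    """
    def replace(match: "re.Match[str]") -> str:
        name = match.group(0)
        if name not in index:
            return name  # not a known symbol: leave the word untouched
        path, line = index[name]
        return f"[{name}]({base_url}/{path}#L{line})"

    # Treat every identifier-like token as a candidate symbol.
    return re.sub(r"\b[A-Za-z_][A-Za-z0-9_]*\b", replace, text)


# Example with a made-up repository layout and URL.
files = {"src/models/bert.py": "import os\n\nclass BertModel:\n    def forward(self):\n        pass\n"}
index = index_symbols(files)
linked = linkify(
    "you can use the BertModel class.",
    index,
    "https://github.com/example/repo/blob/main",
)
```

Matching every identifier-like token is deliberately naive: a function named after a common English word would be linked wherever that word appears in prose, and text already inside a link or code span is not skipped. Pinning `<ref>` to a commit hash rather than a branch would keep the line anchors stable.
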

@iuliaturc added the hacktoberfest and 🔥 difficulty: 3 labels on Oct 4, 2024.