
Add configurable prompts #35

Open · wants to merge 2 commits into base: main
Conversation

galshubeli
Contributor

@galshubeli galshubeli commented Nov 14, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced ChatSession class to support custom instructions and prompts for Cypher and QA sessions.
    • Updated KnowledgeGraph class to allow additional parameters for initiating chat sessions.
    • Introduced custom prompts in the GraphQueryGenerationStep and QAStep classes for improved flexibility.
  • Bug Fixes

    • Improved logic for handling prompts in GraphQueryGenerationStep and QAStep to streamline prompt generation.
  • Documentation

    • Updated method signatures to reflect new optional parameters across relevant classes.
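The "custom prompt if provided, library default otherwise" behavior these changes introduce can be sketched in a few lines. The template strings and class name below are illustrative stand-ins, not the actual graphrag_sdk definitions:

```python
# Stand-in for the "custom prompt or default" pattern added in this PR.
# DEFAULT_QA_PROMPT is an illustrative template, not the real GRAPH_QA_PROMPT.
DEFAULT_QA_PROMPT = "Context: {context}\nCypher: {cypher}\nQuestion: {question}"

class QAStepSketch:
    def __init__(self, qa_prompt=None):
        # None means "fall back to the library default template"
        self.qa_prompt = qa_prompt

    def run(self, question, cypher, context):
        template = self.qa_prompt or DEFAULT_QA_PROMPT
        return template.format(context=context, cypher=cypher, question=question)

default_answer = QAStepSketch().run("Who acted?", "MATCH (a:Actor) RETURN a", "Brad Pitt")
custom_answer = QAStepSketch("Q: {question} [{cypher}] -> {context}").run(
    "Who acted?", "MATCH (a:Actor) RETURN a", "Brad Pitt"
)
print(default_answer)
print(custom_answer)
```

The `or` fallback keeps the change backward compatible: callers that pass nothing get exactly the pre-PR behavior.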


coderabbitai bot commented Nov 14, 2024

Walkthrough

The changes involve modifications to several classes within the graphrag_sdk module to enhance the configurability of chat sessions and query steps. The ChatSession class constructor now accepts four new optional parameters for custom instructions and prompts. Similarly, the KnowledgeGraph class's chat_session method has been updated to utilize these parameters. The GraphQueryGenerationStep and QAStep classes also received updates to incorporate optional prompt parameters, simplifying their internal logic for handling prompts.

Changes

File Change Summary
graphrag_sdk/chat_session.py Updated ChatSession constructor to include cypher_system_instruction, qa_system_instruction, cypher_gen_prompt, and qa_prompt. Modified initialization logic for cypher_system_instruction and adjusted send_message.
graphrag_sdk/kg.py Modified KnowledgeGraph class's chat_session method to accept cypher_system_instruction, qa_system_instruction, cypher_gen_prompt, and qa_prompt as parameters when creating a ChatSession object.
graphrag_sdk/steps/graph_query_step.py Added cypher_prompt parameter to GraphQueryGenerationStep constructor. Simplified logic in run method for determining cypher_prompt based on provided or default values.
graphrag_sdk/steps/qa_step.py Introduced qa_prompt parameter in QAStep constructor. Updated run method to utilize qa_prompt for formatting the QA prompt, falling back to GRAPH_QA_PROMPT if not provided.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant KnowledgeGraph
    participant ChatSession
    participant GraphQueryGenerationStep
    participant QAStep

    User->>KnowledgeGraph: chat_session(cypher_system_instruction, qa_system_instruction, cypher_gen_prompt, qa_prompt)
    KnowledgeGraph->>ChatSession: Create with provided parameters
    ChatSession->>GraphQueryGenerationStep: Run with cypher_prompt
    ChatSession->>QAStep: Run with qa_prompt
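The flow in the diagram can be mirrored with a minimal runnable sketch. The classes below are simplified stand-ins for the graphrag_sdk ones, showing only how the four optional parameters travel from chat_session into the session object:

```python
# Simplified stand-ins illustrating parameter forwarding; the real
# KnowledgeGraph and ChatSession live in graphrag_sdk and do much more.
class ChatSessionSketch:
    def __init__(self, cypher_system_instruction=None, qa_system_instruction=None,
                 cypher_gen_prompt=None, qa_prompt=None):
        self.cypher_system_instruction = cypher_system_instruction
        self.qa_system_instruction = qa_system_instruction
        self.cypher_gen_prompt = cypher_gen_prompt
        self.qa_prompt = qa_prompt

class KnowledgeGraphSketch:
    def chat_session(self, cypher_system_instruction=None, qa_system_instruction=None,
                     cypher_gen_prompt=None, qa_prompt=None):
        # Forward all four optional parameters unchanged, as kg.py now does.
        return ChatSessionSketch(cypher_system_instruction, qa_system_instruction,
                                 cypher_gen_prompt, qa_prompt)

session = KnowledgeGraphSketch().chat_session(
    qa_prompt="Answer {question} from {context} ({cypher})"
)
print(session.qa_prompt)           # custom prompt preserved
print(session.cypher_gen_prompt)   # None, so downstream steps use defaults
```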

🐰 "In the realm of code where rabbits hop,
New prompts and systems, we never stop!
With cyphers and QA, our chat's now a breeze,
Custom instructions bring us to our knees!
So let’s celebrate this coding delight,
For every change makes our future bright!" 🐇


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f24594c and d07f97e.

📒 Files selected for processing (1)
  • graphrag_sdk/steps/qa_step.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • graphrag_sdk/steps/qa_step.py


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (6)
graphrag_sdk/steps/graph_query_step.py (1)

54-62: Fix indentation consistency.

The indentation in the ternary expressions is inconsistent, which affects readability. Ensure consistent indentation throughout the nested expressions.

-                    cypher_prompt = (
-                        (CYPHER_GEN_PROMPT.format(question=question) 
-                        if self.last_answer is None
-                        else CYPHER_GEN_PROMPT_WITH_HISTORY.format(question=question, last_answer=self.last_answer))
-                        if error is False
-                        else CYPHER_GEN_PROMPT_WITH_ERROR.format(
-                            question=question, error=error
-                        )
-                    )   
+                    cypher_prompt = (
+                        (CYPHER_GEN_PROMPT.format(question=question)
+                         if self.last_answer is None
+                         else CYPHER_GEN_PROMPT_WITH_HISTORY.format(
+                             question=question,
+                             last_answer=self.last_answer
+                         ))
+                        if error is False
+                        else CYPHER_GEN_PROMPT_WITH_ERROR.format(
+                            question=question,
+                            error=error
+                        )
+                    )
graphrag_sdk/chat_session.py (3)

54-55: Add type hints and documentation for new attributes.

The new prompt attributes would benefit from type hints and docstring documentation to improve code maintainability.

Add type hints and update the class docstring. Note that class-level attribute annotations (e.g. `cypher_prompt: str | None`) must be placed after the docstring, since the docstring has to remain the first statement in the class body:

 class ChatSession:
     """
     Represents a chat session with a Knowledge Graph.

     Args:
         model_config (KnowledgeGraphModelConfig): The model configuration to use.
         ontology (Ontology): The ontology to use.
         graph (Graph): The graph to query.
+        cypher_system_instruction (str, optional): Custom system instructions for Cypher generation.
+        qa_system_instruction (str, optional): Custom system instructions for QA.
+        cypher_gen_prompt (str, optional): Custom prompt template for Cypher generation.
+        qa_prompt (str, optional): Custom prompt template for QA.

63-63: Consider consistent instruction handling approaches.

The QA system instruction handling uses a different approach compared to the Cypher system instruction handling. Consider using the same pattern for consistency.

-            qa_system_instruction or GRAPH_QA_SYSTEM
+        if qa_system_instruction is None:
+            qa_system_instruction = GRAPH_QA_SYSTEM
+        self.qa_chat_session = model_config.qa.with_system_instruction(qa_system_instruction).start_chat()

82-82: Consider caching step instances.

The steps are recreated for each message, which could be inefficient for long chat sessions since most parameters remain constant.

Consider creating the steps once in the constructor:

     def __init__(self, ...):
         # ... existing code ...
+        self.cypher_step = GraphQueryGenerationStep(
+            graph=self.graph,
+            chat_session=self.cypher_chat_session,
+            ontology=self.ontology,
+            cypher_prompt=self.cypher_prompt,
+        )
+        self.qa_step = QAStep(
+            chat_session=self.qa_chat_session,
+            qa_prompt=self.qa_prompt,
+        )

     def send_message(self, message: str):
-        cypher_step = GraphQueryGenerationStep(...)
+        self.cypher_step.last_answer = self.last_answer
+        (context, cypher) = self.cypher_step.run(message)
         # ... rest of the method ...
-        qa_step = QAStep(...)
+        answer = self.qa_step.run(message, cypher, context)

Also applies to: 92-92
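A tiny stand-in makes the cost difference concrete: constructing the step once in the constructor means every later message reuses it, whereas the current code builds a fresh step per send_message. Class names here are illustrative, not the graphrag_sdk ones:

```python
# Illustrative stand-in: count constructions to show the step is built once.
class StepSketch:
    constructions = 0

    def __init__(self, prompt=None):
        StepSketch.constructions += 1
        self.prompt = prompt
        self.last_answer = None

    def run(self, message):
        return (self.prompt or "default: {q}").format(q=message)

class SessionSketch:
    def __init__(self, prompt=None):
        self.step = StepSketch(prompt)  # created once per session

    def send_message(self, message):
        return self.step.run(message)   # reused for every message

session = SessionSketch()
for msg in ("hello", "again", "still here"):
    session.send_message(msg)
print(StepSketch.constructions)  # 1
```

The trade-off is that mutable per-message state (like last_answer) must then be updated on the cached instance rather than passed at construction time.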

graphrag_sdk/kg.py (2)

137-141: Update method docstring to document new parameters

The method signature has been updated with new parameters, but the docstring hasn't been updated to reflect these changes. Please add parameter descriptions to help users understand the purpose of each new parameter.

Apply this diff:

     def chat_session(self, cypher_system_instruction: str = None, qa_system_instruction: str = None,
                 cypher_gen_prompt: str = None, qa_prompt: str = None) -> ChatSession:
+        """
+        Create a new chat session with optional custom instructions and prompts.
+
+        Parameters:
+            cypher_system_instruction (str, optional): Custom system instruction for Cypher query generation
+            qa_system_instruction (str, optional): Custom system instruction for question answering
+            cypher_gen_prompt (str, optional): Custom prompt template for Cypher query generation
+            qa_prompt (str, optional): Custom prompt template for question answering
+
+        Returns:
+            ChatSession: A new chat session instance
+        """
         chat_session = ChatSession(self._model_config, self.ontology, self.graph, cypher_system_instruction,
                                    qa_system_instruction, cypher_gen_prompt, qa_prompt)
         return chat_session

137-141: Consider adding parameter validation and default values

The method accepts optional string parameters but doesn't validate them or provide default values. Consider:

  1. Validating that provided strings are not empty
  2. Using the imported GRAPH_QA_SYSTEM and CYPHER_GEN_SYSTEM as default values

Here's a suggested implementation:

     def chat_session(self, cypher_system_instruction: str = None, qa_system_instruction: str = None,
                 cypher_gen_prompt: str = None, qa_prompt: str = None) -> ChatSession:
+        # Use default system instructions if not provided
+        cypher_system_instruction = cypher_system_instruction or CYPHER_GEN_SYSTEM
+        qa_system_instruction = qa_system_instruction or GRAPH_QA_SYSTEM
+
+        # Validate non-empty strings if provided
+        if cypher_gen_prompt is not None and not cypher_gen_prompt.strip():
+            raise ValueError("cypher_gen_prompt cannot be empty")
+        if qa_prompt is not None and not qa_prompt.strip():
+            raise ValueError("qa_prompt cannot be empty")
+
         chat_session = ChatSession(self._model_config, self.ontology, self.graph, cypher_system_instruction,
                                    qa_system_instruction, cypher_gen_prompt, qa_prompt)
         return chat_session
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ca3aa2f and f24594c.

📒 Files selected for processing (4)
  • graphrag_sdk/chat_session.py (4 hunks)
  • graphrag_sdk/kg.py (1 hunks)
  • graphrag_sdk/steps/graph_query_step.py (1 hunks)
  • graphrag_sdk/steps/qa_step.py (1 hunks)
🔇 Additional comments (5)
graphrag_sdk/steps/qa_step.py (2)

20-20: LGTM! Clean implementation of configurable prompts

The addition of the optional qa_prompt parameter with proper type hinting and None default maintains backward compatibility while enabling prompt customization.

Also applies to: 24-24


27-32: Verify format string compatibility with custom prompts

The code assumes that any custom qa_prompt uses only the placeholders {context}, {cypher}, and {question}. A template that references any other placeholder will raise a KeyError at format time, while one that merely omits these placeholders will format without error and silently drop the corresponding information.

Document the required placeholders, and consider adding validation:

     def run(self, question: str, cypher: str, context: str):
         graph_qa_prompt = self.qa_prompt or GRAPH_QA_SYSTEM
+        required_placeholders = ["{context}", "{cypher}", "{question}"]
+        if self.qa_prompt and not all(p in self.qa_prompt for p in required_placeholders):
+            raise ValueError(
+                f"Custom qa_prompt must contain all placeholders: {required_placeholders}"
+            )
         qa_prompt = graph_qa_prompt.format(
             context=context, cypher=cypher, question=question
         )
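Both failure modes follow directly from str.format semantics, which a quick self-contained check confirms (the templates here are illustrative):

```python
# Illustrative templates; the real ones live in graphrag_sdk's prompt module.
missing = "Question: {question}"            # omits {context} and {cypher}
unknown = "Question: {question} {sources}"  # references an undefined placeholder

# Extra keyword arguments are ignored by str.format, so a template that
# omits placeholders formats without error and silently drops the context:
print(missing.format(context="ctx", cypher="MATCH (n) RETURN n", question="Who?"))

# A placeholder the caller never supplies, however, raises KeyError:
try:
    unknown.format(context="ctx", cypher="MATCH (n) RETURN n", question="Who?")
except KeyError as exc:
    print("missing format key:", exc)
```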
graphrag_sdk/steps/graph_query_step.py (1)

36-36: LGTM! Clean implementation of the configurable prompt.

The addition of the optional cypher_prompt parameter and its corresponding instance variable is well-implemented, following Python's best practices.

Also applies to: 43-43

graphrag_sdk/chat_session.py (2)

28-30: LGTM! Well-structured parameter additions.

The new optional parameters are well-named and properly typed, maintaining backward compatibility while adding configurability.


28-30: Verify consistent parameter usage across the codebase.

The new parameters should be consistently handled in related classes and methods.

✅ Verification successful

Let me gather more specific information about the parameter usage.


Parameters are correctly propagated and used throughout the codebase

The verification shows that:

  • The new parameters are properly handled in KnowledgeGraph.chat_session() method which correctly forwards them to ChatSession
  • ChatSession has proper fallback logic for cypher_system_instruction and qa_system_instruction
  • The parameters are used in the expected way with GraphQueryGenerationStep and QAStep classes
  • No inconsistencies found in parameter usage across different model implementations (Gemini, OpenAI, Azure, Ollama)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for related class instantiations and verify parameter handling
echo "Searching for ChatSession instantiations..."
rg -l "ChatSession\(" 

echo "\nSearching for related step classes..."
rg -l "class.*Step"

echo "\nChecking GraphQueryGenerationStep and QAStep parameter handling..."
ast-grep --pattern 'class GraphQueryGenerationStep {
  $$$
  def __init__($$$) {
    $$$
  }
  $$$
}'

ast-grep --pattern 'class QAStep {
  $$$
  def __init__($$$) {
    $$$
  }
  $$$
}'

Length of output: 1564


Script:

#!/bin/bash
# Check ChatSession instantiations and their parameters
echo "Checking ChatSession instantiations..."
rg "ChatSession\(" -A 5

echo "\nChecking GraphQueryGenerationStep implementation..."
rg "class GraphQueryGenerationStep" -A 10 graphrag_sdk/steps/graph_query_step.py

echo "\nChecking QAStep implementation..."
rg "class QAStep" -A 10 graphrag_sdk/steps/qa_step.py

echo "\nChecking where cypher_system_instruction and qa_system_instruction are used..."
rg "cypher_system_instruction|qa_system_instruction"

Length of output: 6099

Comment on lines +51 to +62
if self.cypher_prompt is not None:
    cypher_prompt = self.cypher_prompt
else:
    cypher_prompt = (
        (CYPHER_GEN_PROMPT.format(question=question)
        if self.last_answer is None
        else CYPHER_GEN_PROMPT_WITH_HISTORY.format(question=question, last_answer=self.last_answer))
        if error is False
        else CYPHER_GEN_PROMPT_WITH_ERROR.format(
            question=question, error=error
        )
    )

🛠️ Refactor suggestion

Refactor nested ternary expressions for better readability.

The current implementation uses nested ternary expressions which can be hard to read and maintain. Consider extracting the logic into a separate method for better clarity.

Here's a suggested refactor:

-                if self.cypher_prompt is not None:
-                    cypher_prompt = self.cypher_prompt
-                else:
-                    cypher_prompt = (
-                        (CYPHER_GEN_PROMPT.format(question=question) 
-                        if self.last_answer is None
-                        else CYPHER_GEN_PROMPT_WITH_HISTORY.format(question=question, last_answer=self.last_answer))
-                        if error is False
-                        else CYPHER_GEN_PROMPT_WITH_ERROR.format(
-                            question=question, error=error
-                        )
-                    )   
+                cypher_prompt = self._get_cypher_prompt(question, error)
+
+    def _get_cypher_prompt(self, question: str, error: Exception | bool) -> str:
+        if self.cypher_prompt is not None:
+            return self.cypher_prompt
+        
+        if error:
+            return CYPHER_GEN_PROMPT_WITH_ERROR.format(
+                question=question, 
+                error=error
+            )
+        
+        if self.last_answer is None:
+            return CYPHER_GEN_PROMPT.format(question=question)
+        
+        return CYPHER_GEN_PROMPT_WITH_HISTORY.format(
+            question=question,
+            last_answer=self.last_answer
+        )

Committable suggestion skipped: line range outside the PR's diff.
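Extracted into a standalone runnable form, the suggested helper makes the precedence explicit: an explicit custom prompt wins over everything, then the error template, then the history template, then the base template. Note this also means a session configured with a custom prompt never sees the error or history variants, which callers may want to know. The template strings below are illustrative stand-ins:

```python
# Illustrative templates standing in for CYPHER_GEN_PROMPT and its variants.
BASE = "Generate Cypher for: {question}"
WITH_HISTORY = "Generate Cypher for: {question} (previous answer: {last_answer})"
WITH_ERROR = "Generate Cypher for: {question} (fix this error: {error})"

def get_cypher_prompt(custom_prompt, question, error=False, last_answer=None):
    if custom_prompt is not None:
        return custom_prompt  # custom prompt takes precedence over everything
    if error:
        return WITH_ERROR.format(question=question, error=error)
    if last_answer is None:
        return BASE.format(question=question)
    return WITH_HISTORY.format(question=question, last_answer=last_answer)

print(get_cypher_prompt(None, "who?"))
print(get_cypher_prompt(None, "who?", error="syntax error"))
print(get_cypher_prompt(None, "who?", last_answer="42"))
print(get_cypher_prompt("fixed prompt", "who?", error="syntax error"))
```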

Comment on lines +49 to +52
if cypher_system_instruction is None:
    cypher_system_instruction = CYPHER_GEN_SYSTEM.replace("#ONTOLOGY", str(ontology.to_json()))
else:
    cypher_system_instruction = cypher_system_instruction + "\nOntology:\n" + str(ontology.to_json())

🛠️ Refactor suggestion

Standardize ontology injection method.

The code uses two different approaches to inject the ontology:

  1. Using replace("#ONTOLOGY", str(ontology.to_json())) for the default case
  2. Using string concatenation with "\nOntology:\n" for custom instructions

This inconsistency could lead to formatting differences and maintenance issues.

Consider standardizing the approach:

-        if cypher_system_instruction is None:
-            cypher_system_instruction = CYPHER_GEN_SYSTEM.replace("#ONTOLOGY", str(ontology.to_json()))
-        else:
-            cypher_system_instruction = cypher_system_instruction + "\nOntology:\n" + str(ontology.to_json())
+        base_instruction = cypher_system_instruction or CYPHER_GEN_SYSTEM
+        ontology_json = str(ontology.to_json())
+        cypher_system_instruction = base_instruction.replace("#ONTOLOGY", ontology_json)

Committable suggestion skipped: line range outside the PR's diff.
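The divergence, and a caveat the marker-based fix carries, can be checked directly. The instruction strings and ontology JSON below are illustrative values, not the library's real CYPHER_GEN_SYSTEM:

```python
# Illustrative values; the real CYPHER_GEN_SYSTEM in graphrag_sdk contains
# the "#ONTOLOGY" marker that the default path substitutes.
CYPHER_GEN_SYSTEM = "Generate Cypher.\nOntology:\n#ONTOLOGY"
ontology_json = '{"entities": ["Actor", "Movie"]}'

def inject(instruction):
    # Standardized, marker-based path suggested above.
    return (instruction or CYPHER_GEN_SYSTEM).replace("#ONTOLOGY", ontology_json)

print(inject(None))                    # default: marker replaced
print(inject("My rules.\n#ONTOLOGY"))  # custom with marker: replaced
print(inject("My rules, no marker."))  # custom without marker: ontology absent!
```

One caveat the suggested diff shares: a custom instruction that lacks the `#ONTOLOGY` marker silently gets no ontology at all, so the standardized path should either document the marker requirement or append the ontology when the marker is absent.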
