Implemented OpenRouter API key usage with any model, set Gemini 2.5 Flash default, created natural-language agent to parse chat. #33
base: main
Changes from 18 commits
8b1234c
7e172f4
87cc262
19252db
ae7eab3
be0035a
258f5c8
9cc899f
127614a
bb11ad5
e8828c9
e230fb7
4c4016e
6c03c1a
393adce
d1f10cd
677d9c6
a256890
6033dea
e0781d1
a8d6f22
11d1441
6d52d0f
19bcd22
50cd78a
2707f8e
7720040
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| 3.13.3 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,129 @@ | ||
| import json | ||
|
|
||
| from utils.chat_state import ChatState | ||
| from utils.helpers import command_ids | ||
| from langchain.prompts import PromptTemplate | ||
| from components.llm import get_llm, get_prompt_llm_chain | ||
| from utils.query_parsing import parse_query | ||
|
|
||
| # Create prompt to generate commands from unstructured user input | ||
| prompt = """ | ||
| # MISSION | ||
| You are an advanced AI assistant that determines the correct DocDocGo command to use given a user's query. DocDocGo is an AI app that assists with research and uses RAG by storing research in "collections", allowing it to combine insight from all information in a collection and use an LLM to generate answers based on the entire collection. It can also answer questions about its own functioning. | ||
|
|
||
| # INPUT | ||
| You will be provided with a query from the user and the current collection the user has selected. | ||
|
|
||
| # HIGH LEVEL TASK | ||
| You don't need to answer the query. Instead, your goal is to determine which of the following commands to prepend to the query: | ||
|
|
||
| ## KB (COLLECTION) COMMANDS | ||
| - /kb <query>: chat using the current collection as a knowledge base. If the query is relevant to the currently selected collection, use this one. | ||
| - /ingest: upload your documents and ingest them into a collection | ||
| - /ingest <url>: retrieve a URL and ingest into a collection | ||
| - /summarize <url>: retrieve a URL, summarize and ingest into a collection | ||
| - /db list: list all your collections | ||
| - /db list <str>: list your collections whose names contain <str> | ||
| - /db use <str>: switch to the collection named <str> | ||
| - /db rename <str>: rename the current collection to <str> | ||
| - /db delete <str>: delete the collection named <str> | ||
| - /db status: show your access level for the current collection and related info | ||
| - /db: show database management options | ||
| - /share: share your collection with others | ||
| - /details <query>: get details about the retrieved documents | ||
| - /quotes <query>: get quotes from the retrieved documents | ||
|
|
||
| ## MAIN RESEARCH COMMANDS | ||
| - /research <query>: do "classic" research - ingest websites into a new collection, write a report. If the query seems to be novel and the user specifically asks for research with a fairly in-depth response, use this one. This will ingest the results into a new collection. Use /research ONLY when the query requires an in-depth report. Otherwise for more typical questions, use /research heatseek. | ||
| - /research iterate <int>: fetch more websites and iterate on the previous report <int> times. The number of times is optional. If the user wants you to continue researching the topic, or if the user uses the keyword "iterate", use this command. If they specify how many times to iterate, append that integer to the command. | ||
| - /research heatseek <query>: do "heatseek" research - find websites that contain the answer and select one specific site that has exactly what is requested. This command does not use the selected collection. If the user knows about heatseek, they might specify it by name and specify the number of "rounds" of heatseek research, in which case you should output "/research heatseek <int> <query>" with "int" being the number. | ||
|
|
||
| ## ADDITIONAL RESEARCH COMMANDS | ||
| - /research set-query <query>: change the research query. If the user asks a new question that is similar to the previous question, suggest this command. | ||
| - /research set-report-type <new report type>: instructions for the desired report format. Some examples are: | ||
| Detailed Report: A comprehensive overview that includes in-depth information and analysis. | ||
| Summary Report: A concise summary of findings, highlighting key points and conclusions. | ||
| Numbered List: A structured list format that presents information in a numbered sequence. | ||
| Bullet Points: A format that uses bullet points for easy readability and quick reference. | ||
| Table Format: A structured format that organizes data into rows and columns for clarity. | ||
| - /research set-search-queries: perform web searches with new queries and queue up resulting links | ||
| - /research clear: remove all reports but keep ingested content | ||
| - /research startover: perform /research clear, then rewrite the initial report | ||
|
|
||
| IMPORTANT: There are two kinds of research, classic and heatseek. If the user is looking for in-depth research on their query use /research. If they are looking for a targeted, specific answer to a relatively narrow question, use /research heatseek. | ||
|
|
||
| ## OTHER COMMANDS | ||
| - /web <your query>: perform web searches and generate a report without ingesting into a collection | ||
| - /chat <your query>: regular chat, without retrieving docs or websites (Use this only when you can answer fully based on your internal knowledge or conversation history.) | ||
| - /export: export your data | ||
| - /help <your query>: get help with using DocDocGo | ||
|
|
||
| ## GUIDELINE REGARDING COMMANDS | ||
| - Only use /chat if you do not need to fetch external information to fully answer. Otherwise use /research for in-depth, new research, /kb for queries about the current collection, and /research heatseek for typical queries. | ||
|
|
||
| # THE CURRENT COLLECTION | ||
| Here is a report on the contents of the current collection so you can decide which command to use: | ||
| {details} | ||
| IMPORTANT: If the user's question cannot be answered using the current knowledge base, select a command like "/research" that creates a new collection. | ||
|
|
||
| # OUTPUT | ||
| You will output 2 strings in a JSON format: The first is an answer to the user's query, informing them what effects the command you choose will have without making reference to the command itself. Your second string will output the raw string of the suggested query, ready to be run. | ||
|
|
||
| ## EXAMPLES OF OUTPUT | ||
|
|
||
| query: 'What are some common birds I might see around Berkeley, California, and how can I identify them?' | ||
| output: {{'answer': 'It looks like this is a different topic than your current collection. I will do some research and create a new collection to store the information.', 'command': '/research What are some common birds I might see around Berkeley, California, and how can I identify them?'}} | ||
|
|
||
| query: 'What are some common birds I might see around Berkeley, California, and how can I identify them?' | ||
| output: {{'answer': 'This is relevant to your current collection, so I will look through what we have already for the answer.', 'command': '/kb What are some common birds I might see around Berkeley, California, and how can I identify them?'}} | ||
|
|
||
| query: 'There's a small, grayish-brown bird outside my window that is round with a little crest on its head. It is very lively and cute. It is about 4 inches tall. What kind of bird could it be?' | ||
| output: {{'answer': 'This is a very specific question so I will do targeted research to find the answer on the web. I won't ingest the results in any of your collections.', 'command': '/research heatseek 3 There's a small, grayish-brown bird outside my window that is round with a little crest on its head. It is very lively and cute. It is about 4 inches tall. What kind of bird could it be?'}} | ||
|
|
||
| query: 'What can I do to help with conservation efforts for Bay Area birds? I asked before but I want more in-depth results.' | ||
| output: {{'answer': 'I will do deeper research on this topic', 'command': '/research iterate 3'}} | ||
| (Note to LLM: Please don't use /research iterate if the current research query does not exactly match this one in meaning) | ||
|
|
||
| query: 'I want to summarize and add this website to my collection: https://www.inaturalist.org/guides/732' | ||
| output: {{'answer': 'I'll create a report for this URL and add it into your collection.', 'command': '/summarize https://www.inaturalist.org/guides/732'}} | ||
|
|
||
| query: 'What is the happiness index for Norway?' | ||
| output: {{'answer': 'I will do targeted research and find the exact answer for this question.', 'command': '/research heatseek What is the happiness index for Norway?'}} | ||
|
|
||
| query: 'What's it like being an AI?' | ||
| output: {{'answer': 'Hmm, let me think about that.', 'command': '/chat What's it like being an AI?'}} | ||
|
|
||
| ## YOUR ACTUAL OUTPUT | ||
|
|
||
| query: {query} | ||
| output: Use the information provided above to construct the output requested, in double curly braces with an "answer" and "command" element separated by a comma, in proper JSON. | ||
| """ | ||
|
|
||
| def get_raw_command(query: str, chat_state: ChatState): | ||
| prompt_template = PromptTemplate.from_template(prompt) | ||
| coll_summary_query = {} | ||
|
|
||
| # Get details on the current collection | ||
| print("Getting details on", chat_state.collection_name) | ||
| if chat_state.collection_name not in coll_summary_query: | ||
| coll_summary_query[chat_state.collection_name] = "" | ||
| summary_prompt = "/kb Can you summarize in one sentence the contents of the current collection?" | ||
| summary_llm = get_llm(chat_state.bot_settings, chat_state, chat_state.openrouter_api_key, embeddings_needed=False) | ||
|
||
| response = summary_llm.invoke(summary_prompt) | ||
| coll_summary_query[chat_state.collection_name] = str(response) | ||
|
|
||
| # Check if query already starts with a command string, if so return as is | ||
| if any(chat_state.message.startswith(command + "") for command in command_ids): | ||
| return chat_state.message | ||
| # If not formatted as a command, prompt LLM to generate and return a JSON-formatted command | ||
| else: | ||
| chain = get_prompt_llm_chain( | ||
| prompt=prompt_template, | ||
| chat_state=chat_state, | ||
| llm_settings=chat_state.bot_settings, | ||
| embeddings_needed=False) | ||
| json_response = chain.invoke({"details": coll_summary_query[chat_state.collection_name], "query": query}).strip("`json") | ||
| dict_response = json.loads(json_response) | ||
| return dict_response | ||
|
|
||
|
|
||
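A note on the `.strip("`json")` call in `get_raw_command` above: `str.strip` treats its argument as a set of characters to remove from both ends, not as a literal prefix, so it handles a fenced reply only incidentally and fails if the model adds any text around the JSON. A minimal sketch of a more tolerant parser follows; `parse_llm_json` and `FENCE` are hypothetical names, not part of this PR, and the sketch assumes the reply contains exactly one JSON object.

```python
import json

# Three backticks, built programmatically so this snippet can itself sit inside a fence.
FENCE = "`" * 3


def parse_llm_json(raw: str) -> dict:
    """Parse a JSON object from an LLM reply that may be wrapped in a Markdown fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag) ...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ... and everything from the closing fence onward.
        text = text.rsplit(FENCE, 1)[0]
    # Fall back to the outermost braces in case the model added surrounding prose.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in LLM response")
    return json.loads(text[start : end + 1])
```

With a helper like this, the chain output could be passed through `parse_llm_json(...)` instead of `.strip(...)` followed by `json.loads(...)`. Note that it still requires valid JSON with double-quoted keys and strings.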
coll_summary_query is initialized as empty above. More importantly, why waste time and money summarizing when we haven't yet checked if the query already starts with a command string?
It should only check for the summary once for each collection and store it in the coll_summary_query dictionary. Whether it starts with a command string or not, each collection should be summarized at least once.
I have modified the code so it only initializes coll_summary_query if it doesn't exist here: 11d1441
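One possible way to reconcile the two points in this thread is sketched below: check for an explicit command before making any LLM call, and keep the summaries in a cache that lives outside the function so each collection is summarized at most once per process. This is an illustration only, not the code in 11d1441. `COMMAND_PREFIXES`, `route`, `get_collection_summary`, and the `summarize` / `choose_command` callables are hypothetical stand-ins for `command_ids`, `get_raw_command`, and the two LLM calls in the diff.

```python
from typing import Callable, Union

# Hypothetical stand-in for utils.helpers.command_ids (only a few commands listed).
COMMAND_PREFIXES = ("/kb", "/research", "/chat", "/web", "/db", "/help")

# Module-level cache: it persists across calls, unlike a dict created inside the function.
_collection_summaries: dict[str, str] = {}


def starts_with_command(message: str) -> bool:
    """True if the first whitespace-delimited token of the message is a known command."""
    parts = message.split(maxsplit=1)
    return bool(parts) and parts[0] in COMMAND_PREFIXES


def get_collection_summary(name: str, summarize: Callable[[str], str]) -> str:
    """Return the cached summary, calling summarize() only on the first request."""
    if name not in _collection_summaries:
        _collection_summaries[name] = summarize(name)
    return _collection_summaries[name]


def route(
    message: str,
    collection_name: str,
    summarize: Callable[[str], str],
    choose_command: Callable[[str, str], dict],
) -> Union[str, dict]:
    # Cheap check first: explicit commands need no summary and no LLM call.
    if starts_with_command(message):
        return message
    details = get_collection_summary(collection_name, summarize)
    return choose_command(details, message)
```

In this arrangement a collection is still summarized once, but only when the first free-form query arrives for it. A cache keyed by collection name alone assumes names are unique across users; in a multi-user deployment the key would likely need to include a user or collection ID.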