Implemented OpenRouter API key usage with any model, set Gemini 2.5 Flash default, created natural-language agent to parse chat. #33
Open
rio-codes wants to merge 27 commits into reasonmethis:main from rio-codes:dev-command-chooser
Commits
8b1234c rio-codes: fix typo in env variable in dev docs
7e172f4 rio-codes: Merge branch 'main' of github.com:rio-codes/docdocgo-core
87cc262 rio-codes: Merge branch 'main' of github.com:rio-codes/docdocgo-core
19252db rio-codes: updating gitignore
ae7eab3 rio-codes: changing all references to OpenAI API key to OpenRouter
be0035a rio-codes: openai is required for embeddings, re-implementing
258f5c8 rio-codes: more reversions to openai where embedding is needed
9cc899f rio-codes: reverting announcement since it is for old version
127614a rio-codes: changed streamlit UI to include OpenRouter
bb11ad5 rio-codes: fixing openrouter model setting
e8828c9 rio-codes: model name is now obtained from settings
e230fb7 rio-codes: various changes and beginning of new function code
4c4016e rio-codes: uploading progress but still troubleshooting
6c03c1a rio-codes: more fixes to command chooser, OpenRouter migration
393adce rio-codes: small formatting modifications, change to env example
d1f10cd rio-codes: added default mode, fixed callbacks bug, adjusted prompt
677d9c6 rio-codes: implementing new default chat mode, cached summaries, and other fixes
a256890 rio-codes: quick commit of file that should have been saved
6033dea rio-codes: quick commit of file that should have been saved
e0781d1 rio-codes: removed all references to embeddings_needed
a8d6f22 rio-codes: Merge branch 'dev-command-chooser' of github.com:rio-codes/docdocgo-c…
11d1441 rio-codes: Only initialize coll_summary_query as blank if it doesn't exist
6d52d0f rio-codes: Update components/chroma_ddg_retriever.py
19bcd22 rio-codes: removed unnecessary code that would not be reached if no chunks were …
50cd78a rio-codes: removed unneccesary pwd check
2707f8e rio-codes: simplifying logic to collapse API key fields if LLM response
7720040 rio-codes: changed all instances of DEFAULT_CHAT_COMMAND_ID to AUTO_COMMAND_ID
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| 3.13.3 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,130 @@ | ||
| import json | ||
|
|
||
| from utils.chat_state import ChatState | ||
| from utils.helpers import command_ids | ||
| from langchain.prompts import PromptTemplate | ||
| from components.llm import get_llm, get_prompt_llm_chain | ||
| from utils.query_parsing import parse_query | ||
|
|
||
| # Create prompt to generate commands from unstructured user input | ||
| prompt =""" | ||
| # MISSION | ||
| You are an advanced AI assistant that determines the correct DocDocGo command to use given a user's query. DocDocGo is an AI app that assists with research and uses RAG by storing research in "collections", allowing it to combine insight from all information in a collection and use an LLM to generate answers based on the entire collection. It can also answer questions about its own functioning. | ||
|
|
||
| # INPUT | ||
| You will be provided with a query from the user and the current collection the user has selected. | ||
|
|
||
| # HIGH LEVEL TASK | ||
| You don't need to answer the query. Instead, your goal is to determine which of the following commands to prepend to the query: | ||
|
|
||
| ## KB (COLLECTION) COMMANDS | ||
| - /kb <query>: chat using the current collection as a knowledge base. If the query is relevant to the currently selected collection, use this one. | ||
| - /ingest: upload your documents and ingest them into a collection | ||
| - /ingest <url>: retrieve a URL and ingest into a collection | ||
| - /summarize <url>: retrieve a URL, summarize and ingest into a collection | ||
| - /db list: list all your collections | ||
| - /db list <str>: list your collections whose names contain <str> | ||
| - /db use <str>: switch to the collection named <str> | ||
| - /db rename <str>: rename the current collection to <str> | ||
| - /db delete <str>: delete the collection named <str> | ||
| - /db status: show your access level for the current collection and related info | ||
| - /db: show database management options | ||
| - /share: share your collection with others | ||
| - /details <query>: get details about the retrieved documents | ||
| - /quotes <query>: get quotes from the retrieved documents | ||
|
|
||
| ## MAIN RESEARCH COMMANDS | ||
| - /research <query>: do "classic" research - ingest websites into a new collection, write a report. If the query seems to be novel and the user specifically asks for research with a fairly in-depth response, use this one. This will ingest the results into a new collection. Use /research ONLY when the query requires an in-depth report. Otherwise for more typical questions, use /research heatseek. | ||
| - /research iterate <int>: fetch more websites and iterate on the previous report <int> times. The number of times is optional. If the user wants you to continue researching the topic, or if the user uses the keyword "iterate", use this command. If they specify a number of times to run a deeper or combine search, append the integer to the query. | ||
| - /research heatseek <query>: do "heatseek" research - find websites that contain the answer and select one specific site that has exactly what is requested. This command does not use the selected collection. If the user knows about heatseek, they might specify it by name and specify the number of "rounds" of heatseek research, in which case you should output "/research <query> <int>" with "int" being the number. | ||
|
|
||
| ## ADDITIONAL RESEARCH COMMANDS | ||
| - /research set-query <query>: change the research query. If the user asks a new question that is similar to the previous question, suggest this command. | ||
| - /research set-report-type <new report type>: instructions for the desired report format. Some examples are: | ||
| Detailed Report: A comprehensive overview that includes in-depth information and analysis. | ||
| Summary Report: A concise summary of findings, highlighting key points and conclusions. | ||
| Numbered List: A structured list format that presents information in a numbered sequence. | ||
| Bullet Points: A format that uses bullet points for easy readability and quick reference. | ||
| Table Format: A structured format that organizes data into rows and columns for clarity. | ||
| - /research set-search-queries: perform web searches with new queries and queue up resulting links | ||
| - /research clear: remove all reports but keep ingested content | ||
| - /research startover: perform /research clear, then rewrite the initial report | ||
|
|
||
| IMPORTANT: There are two kinds of research, classic and heatseek. If the user is looking for in-depth research on their query use /research. If they are looking for a targeted, specific answer to a relatively narrow question, use /research heatseek. | ||
|
|
||
| ## OTHER COMMANDS | ||
| - /web <your query>: perform web searches and generate a report without ingesting into a collection | ||
| - /chat <your query>: regular chat, without retrieving docs or websites (Use this only when you can answer fully based on your internal knowledge or conversation history.) | ||
| - /export: export your data | ||
| - /help <your query>: get help with using DocDocGo | ||
|
|
||
| ## GUIDELINE REGARDING COMMANDS | ||
| - Only use /chat if you do not need to fetch external information to fully answer. Otherwise use /research for in-depth, new research, /kb for queries about the current collection, and /research heatseek for typical queries. | ||
|
|
||
| # THE CURRENT COLLECTION | ||
| Here is a report on the contents of the current collection so you can decide which command to use: | ||
| {details} | ||
| IMPORTANT: If the user's question cannot be answered using the current knowledge base, select a command like "/research" that creates a new collection. | ||
|
|
||
| # OUTPUT | ||
| You will output 2 strings in a JSON format: The first is an answer to the user's query, informing them what effects the command you choose will have without making reference to the command itself. Your second string will output the raw string of the suggested query, ready to be run. | ||
|
|
||
| ## EXAMPLES OF OUTPUT | ||
|
|
||
| query: 'What are some common birds I might see around Berkeley, California, and how can I identify them?' | ||
| output: {{'answer': 'It looks like this is a different topic than your current collection. I will do some research and create a new collection to store the information.', 'command': '/research What are some common birds I might see around Berkeley, California, and how can I identify them?'}} | ||
|
|
||
| query: 'What are some common birds I might see around Berkeley, California, and how can I identify them?' | ||
| output: {{'answer': 'This is relevant to your current collection, so I will look through what we have already for the answer.', 'command': '/kb What are some common birds I might see around Berkeley, California, and how can I identify them?'}} | ||
|
|
||
| query: 'There's a small, grayish-brown bird outside my window that is round with a little crest on its head. It is very lively and cute. It is about 4 inches tall. What kind of bird could it be?' | ||
| output: {{'answer': 'This is a very specific question so I will do targeted research to find the answer on the web. I won't ingest the results in any of your collections.', 'command': '/research heatseek 3 There's a small, grayish-brown bird outside my window that is round with a little crest on its head. It is very lively and cute. It is about 4 inches tall. What kind of bird could it be?'}} | ||
|
|
||
| query: 'What can I do to help with conservation efforts for Bay Area birds? I asked before but I want more in-depth results.' | ||
| output: {{'answer': 'I will do deeper research on this topic', 'command': '/research iterate 3'}} | ||
| (Note to LLM: Please don't use /research iterate if the current research query does not exactly match this one in meaning) | ||
|
|
||
| query: 'I want to summarize and add this website to my collection: https://www.inaturalist.org/guides/732' | ||
| output: {{'answer': 'I'll create a report for this URL and add it into your collection.', 'command': '/summarize https://www.inaturalist.org/guides/732'}} | ||
|
|
||
| query: 'What is the happiness index for Norway?' | ||
| output: {{'answer': 'I will do targeted research and find the exact answer for this question.', 'command': '/research heatseek What is the happiness index for Norway?'}} | ||
|
|
||
| query: 'What's it like being an AI?' | ||
| output: {{'answer': 'Hmm, let me think about that.', 'command': '/chat What's it like being an AI?'}} | ||
|
|
||
| ## YOUR ACTUAL OUTPUT | ||
|
|
||
| query: {query} | ||
| output: Use the information provided above to construct the output requested, in double curly braces with an "answer" and "command" element separated by a comma, in proper JSON. | ||
| """ | ||
|
|
||
| def get_raw_command(query: str, chat_state: ChatState): | ||
| prompt_template = PromptTemplate.from_template(prompt) | ||
| if not coll_summary_query: | ||
| coll_summary_query = {} | ||
|
|
||
| # Get details on the current collection | ||
| print("Getting details on", chat_state.collection_name) | ||
| if chat_state.collection_name not in coll_summary_query: | ||
| coll_summary_query[chat_state.collection_name] = "" | ||
| summary_prompt = "/kb Can you summarize in one sentence the contents of the current collection?" | ||
| summary_llm = get_llm(chat_state.bot_settings, chat_state,chat_state.openrouter_api_key) | ||
| response = summary_llm.invoke(summary_prompt) | ||
| coll_summary_query[chat_state.collection_name] = str(response) | ||
|
|
||
| # Check if query already starts with a command string, if so return as is | ||
| if any(chat_state.message.startswith(command + "") for command in command_ids): | ||
| return chat_state.message | ||
| # If not formatted as a command, prompt LLM to generate and return a JSON-formatted command | ||
| else: | ||
| chain = get_prompt_llm_chain( | ||
| prompt=prompt_template, | ||
| chat_state=chat_state, | ||
| llm_settings=chat_state.bot_settings | ||
| ) | ||
| json_response = chain.invoke({"details": coll_summary_query[chat_state.collection_name], "query": query}).strip("`json") | ||
| dict_response = json.loads(json_response) | ||
| return dict_response | ||
|
|
||
|
|
||
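A note on the reply parsing at the end of get_raw_command: str.strip("`json") removes any of the characters `, j, s, o, n from both ends of the string; it does not remove a literal fence prefix. The example outputs in the prompt also use single-quoted keys, which json.loads rejects. The sketch below is one possible way to make the parsing more tolerant. It is not part of this PR, and extract_command_json is a hypothetical helper name introduced only for illustration.

```python
import ast
import json


def extract_command_json(raw: str) -> dict:
    """Pull the answer/command object out of a raw LLM reply (hypothetical helper)."""
    # Take the outermost {...} span, ignoring code fences or prose around it.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    candidate = raw[start : end + 1]
    try:
        parsed = json.loads(candidate)
    except json.JSONDecodeError:
        # Fallback for Python-style, single-quoted dicts such as the prompt's examples.
        try:
            parsed = ast.literal_eval(candidate)
        except (ValueError, TypeError, SyntaxError) as exc:
            raise ValueError("model reply is not a parseable object") from exc
    # Reject replies that parse but do not carry both expected keys.
    if not isinstance(parsed, dict) or not {"answer", "command"} <= parsed.keys():
        raise ValueError("model reply lacks 'answer' or 'command'")
    return parsed
```

A caller could wrap this in a try/except and fall back to a default command when ValueError is raised, rather than letting a malformed reply crash the chat turn.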
coll_summary_query is initialized as empty above. More importantly, why waste time and money summarizing when we haven't yet checked if the query already starts with a command string?
It should only check for the summary once for each collection and store it in the coll_summary_query dictionary. Whether it starts with a command string or not, each collection should be summarized at least once.
I have modified the code so it only initializes coll_summary_query if it doesn't exist here: 11d1441
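The two positions in this thread can be reconciled in more than one way. The sketch below shows one option, under stated assumptions: the command-prefix check runs first, and the summary is still computed only once per collection, on first need, in a module-level cache. None of this is the code in the PR. The names _coll_summaries, COMMAND_IDS, summarize, and choose_command are hypothetical stand-ins (COMMAND_IDS for utils.helpers.command_ids, and the two callables for the summary LLM call and the prompt chain).

```python
from typing import Callable

# Module-level cache: a name assigned inside the function would be local and
# would not survive between calls. (Assumption: one process per session.)
_coll_summaries: dict[str, str] = {}

# Stand-in for utils.helpers.command_ids; assumed to be command strings.
COMMAND_IDS = ("/kb", "/research", "/chat")


def starts_with_command(message: str, command_ids=COMMAND_IDS) -> bool:
    """True if the first whitespace-delimited token is a known command."""
    tokens = message.split(maxsplit=1)
    return bool(tokens) and tokens[0] in command_ids


def get_collection_summary(name: str, summarize: Callable[[str], str]) -> str:
    """Return the cached summary, calling summarize only on a cache miss."""
    if name not in _coll_summaries:
        _coll_summaries[name] = summarize(name)
    return _coll_summaries[name]


def route(message: str, collection_name: str, summarize, choose_command):
    # Explicit commands pass through untouched, so no LLM call is spent on them.
    if starts_with_command(message):
        return message
    details = get_collection_summary(collection_name, summarize)
    return choose_command(details, message)
```

With this ordering, a message such as "/kb birds" never triggers a summary call, while the first free-form message in a collection pays for the summary once and later messages reuse it.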