Implemented a way for Google Model to analyze JSON file links using DocumentUrl #3269
base: main
Changes from 2 commits
e0adb97
c319843
e252f8c
76acdc4
8ec0aaa
24828d7
0d8549e
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -778,6 +778,16 @@ async def test_google_model_text_document_url_input(allow_model_requests: None, | |
| 'The main content of the TXT file is an explanation of the placeholder name "John Doe" (and related variations) and its usage in legal contexts, popular culture, and other situations where the identity of a person is unknown or needs to be withheld. The document also includes the purpose of the file and other file type information.\n' | ||
| ) | ||
|
|
||
| async def test_google_model_json_document_url_input(allow_model_requests: None, google_provider: GoogleProvider): | ||
|
Collaborator
The VCR cassette will be generated automatically when you call
||
| m = GoogleModel('gemini-2.5-pro', provider=google_provider) | ||
| agent = Agent(m, system_prompt='You are a helpful chatbot.') | ||
|
|
||
| json_document_url = DocumentUrl(url='https://kamalscraping-collab.github.io/sample-data/sample_transcript.json') | ||
|
Collaborator
Can we please use a different public JSON file that's not dependent on your repo?
||
|
|
||
| result = await agent.run(['What is the main content of this document?', json_document_url]) | ||
| assert result.output == snapshot( | ||
| "Based on the JSON data provided, the document contains the log of a conversation between a user and an AI assistant.\n" | ||
| ) | ||
|
|
||
| async def test_google_model_text_as_binary_content_input(allow_model_requests: None, google_provider: GoogleProvider): | ||
| m = GoogleModel('gemini-2.0-flash', provider=google_provider) | ||
|
|
||
We need tests for this behavior like we have in `test_openai.py`.

Check the function `test_google_model_json_document_url_input` in `test_google.py`. That should work.

We need to use the same `_inline_text_file_part` we use in OpenAI, so that the text is properly formatted as representing a file.

I suggest moving it to a method on `BinaryContent` that returns the text with the fencing. `_is_text_like_media_type` can become a method on `BinaryContent` and `DocumentUrl` as well.

When we check `isinstance(item, DocumentUrl)` and then do `downloaded_text = await download_item(item, data_format='text')`, we can create a `BinaryContent` from the result of `download_item`, and then call the new `inline_text_file` method on it.
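
To make the suggestion above concrete, here is a minimal sketch of what such a method could look like. It does not use the real library classes: `BinaryContentSketch` is a hypothetical stand-in for `BinaryContent`, and the field names, the set of text-like media types, and the `-----BEGIN FILE` / `-----END FILE` delimiter format are all assumptions made for illustration, not the confirmed behavior of `_inline_text_file_part`.

```python
from dataclasses import dataclass


@dataclass
class BinaryContentSketch:
    """Hypothetical stand-in for BinaryContent; not the library's actual class."""

    data: bytes
    media_type: str
    identifier: str = 'file'  # assumed field, used only to label the fenced block

    def is_text_like_media_type(self) -> bool:
        # Assumed rule: text/*, a few structured-text types, and +json/+xml suffixes.
        return (
            self.media_type.startswith('text/')
            or self.media_type in {'application/json', 'application/xml', 'application/x-yaml'}
            or self.media_type.endswith(('+json', '+xml'))
        )

    def inline_text_file(self) -> str:
        # Return the decoded text wrapped in delimiters so the model sees it as a file.
        if not self.is_text_like_media_type():
            raise ValueError(f'{self.media_type!r} is not a text-like media type')
        text = self.data.decode('utf-8')
        return '\n'.join(
            [
                f'-----BEGIN FILE id="{self.identifier}" type="{self.media_type}"-----',
                text,
                f'-----END FILE id="{self.identifier}"-----',
            ]
        )


def inline_downloaded_text(downloaded_text: str, media_type: str, identifier: str) -> str:
    # Hypothetical helper: wrap already-downloaded text in the sketch class, then fence it.
    content = BinaryContentSketch(
        data=downloaded_text.encode('utf-8'),
        media_type=media_type,
        identifier=identifier,
    )
    return content.inline_text_file()


if __name__ == '__main__':
    print(inline_downloaded_text('{"speaker": "user"}', 'application/json', 'transcript'))
```

With this shape, the `DocumentUrl` branch in the Google model would only need to download the text, build the content object, and call one method, so the OpenAI and Google code paths could share the same fencing logic.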