feat: Support Gemini client for Gemini API and Vertex AI #5524
base: main
Conversation
Signed-off-by: Yu Ishikawa <[email protected]>
@microsoft-github-policy-service agree
@yu-iskw does this client support multimodal output as well?
@ekzhu Good point. I am looking for a better approach to support both text generation and image generation with Gemini, since the methods and their configurations differ. I would appreciate any good ideas on how to handle this. As far as I know, the API for text and image generation with OpenAI and Azure OpenAI is the same, so it is unnecessary to use different APIs regardless of whether we want to generate text or images. I suppose it would be good to add a field indicating whether a model supports image generation to autogen/python/packages/autogen-core/src/autogen_core/models/_model_client.py Lines 95 to 103 in e7a3c78
If there is no effective way at the moment and need to align the core component as

[UPDATE] Sample Code

Text Generation

```python
import os

from google import genai
from google.genai import types

client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Explain how AI works",
    config=types.GenerateContentConfig(
        temperature=0.5,
    ),
)
```

Image Generation

```python
from io import BytesIO

from google import genai  # type: ignore[import]
from google.genai import types  # type: ignore[import]
from PIL import Image

client = genai.Client(vertexai=True, location="us-central1")
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt="Fuzzy bunnies in my kitchen",
    config=types.GenerateImagesConfig(
        number_of_images=4,
    ),
)
for generated_image in response.generated_images:
    image = Image.open(BytesIO(generated_image.image.image_bytes))
    image.show()
```
NOTE: We can get model information. With both the Gemini API and Vertex AI, we can retrieve it as follows:

```python
import json
import os
from pprint import pprint

from google import genai

models = [
    "gemini-1.5-flash",
    "gemini-1.5-pro",
    "gemini-2.0-flash",
    "imagen-3.0-generate-002",
    "text-embedding-004",
]

# Gemini API
print("==================== Gemini API ====================")
client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
for model_name in models:
    model = client.models.get(model=model_name)
    pprint(f"{model_name}: {json.dumps(model.to_json_dict(), indent=2)}")

# Vertex AI
print("==================== Vertex AI ====================")
client = genai.Client(vertexai=True, location="us-central1")
for model_name in models:
    model = client.models.get(model=model_name)
    pprint(f"{model_name}: {json.dumps(model.to_json_dict(), indent=2)}")
```

Model Information (Gemini API)

Model Information (Vertex AI)
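If the retrieved metadata lists which generation methods a model supports, the client could derive capability flags from it instead of hard-coding them. The sketch below is hypothetical: the `supported_actions` field name and the mapping rules are assumptions for discussion, not the actual google-genai schema.

```python
# Hypothetical sketch: infer capability flags from model metadata.
# The "supported_actions" key and the action names are assumptions;
# the real metadata fields may be named differently.
def infer_capabilities(model_dict: dict) -> dict:
    actions = set(model_dict.get("supported_actions", []))
    return {
        "chat": "generateContent" in actions,
        "embeddings": "embedContent" in actions,
        "image_generation": "predict" in actions,  # assumed for Imagen models
    }


sample = {
    "name": "models/gemini-2.0-flash",
    "supported_actions": ["generateContent"],
}
print(infer_capabilities(sample))
# {'chat': True, 'embeddings': False, 'image_generation': False}
```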
@siscanu please provide your feedback to this PR here.
Another thing to take into account is how to use tools provided by Google in AutoGen together with other FunctionTools. Just out of curiosity, do other clients (like OpenAIChatCompletion) support the use of tools provided by Google or others? Or do users need to create a wrapper function?
First of all, we should enable users to use AutoGen tools with the client. On top of that, it would be good to discuss how to support tools of google-genai with this AutoGen Gemini client.
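To illustrate the tool-integration question, here is a hypothetical sketch of adapting a plain Python function (the kind AutoGen's FunctionTool wraps) into the OpenAPI-style function-declaration dict that Gemini's tool-calling documentation describes. The helper name and the type mapping are assumptions, not part of either library.

```python
# Hypothetical sketch: convert a Python function's signature into a
# Gemini-style function declaration dict. The helper and type mapping
# are assumptions for discussion, not autogen or google-genai API.
import inspect

_PY_TO_JSON = {int: "integer", float: "number", str: "string", bool: "boolean"}


def to_gemini_declaration(func) -> dict:
    sig = inspect.signature(func)
    properties = {}
    required = []
    for name, param in sig.parameters.items():
        json_type = _PY_TO_JSON.get(param.annotation, "string")
        properties[name] = {"type": json_type}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # parameters without defaults are required
    return {
        "name": func.__name__,
        "description": (func.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_weather(city: str, unit: str = "celsius") -> str:
    """Look up the current weather for a city."""
    return f"Weather in {city}"


decl = to_gemini_declaration(get_weather)
print(decl["name"])                    # get_weather
print(decl["parameters"]["required"])  # ['city']
```

A wrapper like this would let existing AutoGen FunctionTools be passed through to the Gemini client without users hand-writing declarations.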
Signed-off-by: Yu Ishikawa <[email protected]>
I am still working on it, but the direction of how to implement the client is getting clearer.
Why are these changes needed?
This pull request introduces support for Google's Gemini API into the autogen-ext package. The changes include two new client implementations, GeminiChatCompletionClient and VertexAIChatCompletionClient, which enable users to interact with Gemini models for advanced chat completions. The new clients support:

Related issue number
#3741
Closes #5528
Checks