
Refactor + Add support for /v1/chat/completions #2

Merged
dvcrn merged 17 commits into main from refactor on Oct 8, 2025
Conversation

@dvcrn (Owner) commented Oct 8, 2025

No description provided.

- Introduce StreamGenerateContent to forward raw SSE lines to a provided
  channel. Handle 401 by refreshing the token and retrying. Set Accept:
  text/event-stream and increase the scanner buffer for large events. The
  function does not close the output channel.
- Change StreamGenerateContent to take a context and use
  NewRequestWithContext; set Accept to text/event-stream and log scanner
  errors. Add SSE keepalives and client-cancel handling in the stream
  handler.
- Parse the request as GeminiInternalRequest; introduce typed content,
  tool, and generation-config structs. Add the /v1/chat/completions
  route. Update deps: uuid, testify, and indirect test utilities.
- Implement an SSE handler bridging to Gemini and normalizing models. Add
  OpenAI types for chat, messages, and tools. Convert OpenAI messages to
  Gemini contents and config. Stream CloudCode SSE through an
  OpenAI-compatible transformer. Add unit tests for the OpenAI→Gemini
  transform.
- Add FunctionCall and FunctionResponse to ContentPart. Convert OpenAI
  tools to Gemini tools in the request transform. Log detailed request,
  model-normalization, and streaming events. Emit tool-call chunks for
  function calls in the SSE stream.
- Resolve tool response names via tool_call_id or an explicit name. Map
  assistant tool_calls to functionCall and tool outputs to
  functionResponse. Extend openai.Message with ToolCalls, ToolCallID, and
  Name. Fix INVALID_ARGUMENT by ensuring functionResponse.name is always
  set. Add a test covering tool response name resolution.
- Implement non-streaming /v1/chat/completions with OpenAI-style JSON.
  Add a /v1/models endpoint returning Gemini model metadata. Refactor
  chatCompletionsHandler to support both streaming and non-streaming
  requests. Add a Gemini-to-OpenAI response transformer. Improve logging
  for tool calls and responses.
- Add a test to ensure tool_calls and tool responses are matched and
  aggregated into a single user turn, satisfying CloudCode's function
  call/response parity requirement.
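The name-resolution rule described in the commits above (explicit name first, then a tool_call_id lookup against the assistant's earlier tool_calls) can be sketched roughly as follows. The types and field names here are simplified stand-ins for illustration, not the project's real internal/openai definitions:

```go
package main

import "fmt"

// Hypothetical, simplified shapes standing in for the real OpenAI
// message types; field names are assumptions for illustration.
type ToolFunction struct {
	Name string
}

type ToolCall struct {
	ID       string
	Function ToolFunction
}

type Message struct {
	Role       string
	Name       string
	ToolCallID string
	ToolCalls  []ToolCall
}

// resolveToolName picks the function name for a tool-result message:
// prefer an explicit Name, otherwise look up the assistant's earlier
// tool_calls by ToolCallID. CloudCode rejects a functionResponse
// without a name (INVALID_ARGUMENT), so the transform must always
// resolve one.
func resolveToolName(msg Message, history []Message) string {
	if msg.Name != "" {
		return msg.Name
	}
	for _, m := range history {
		if m.Role != "assistant" {
			continue
		}
		for _, tc := range m.ToolCalls {
			if tc.ID == msg.ToolCallID {
				return tc.Function.Name
			}
		}
	}
	return ""
}

func main() {
	history := []Message{
		{Role: "user"},
		{Role: "assistant", ToolCalls: []ToolCall{
			{ID: "call_1", Function: ToolFunction{Name: "get_weather"}},
		}},
	}
	toolMsg := Message{Role: "tool", ToolCallID: "call_1"}
	fmt.Println(resolveToolName(toolMsg, history)) // get_weather
}
```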
@gemini-code-assist

Summary of Changes

Hello @dvcrn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for the OpenAI /v1/chat/completions API endpoint, enabling OpenAI-compatible clients to seamlessly integrate with the Gemini Code Assist API. Key changes include the addition of new packages for OpenAI types and transformations, refactoring of the core Gemini client with token refresh logic, and a restructured server to handle new routes and delegate requests. The build system was also updated from just to mise, and documentation was revised accordingly.

Highlights

  • Intent: This pull request primarily aims to add support for the OpenAI /v1/chat/completions endpoint, enabling OpenAI-compatible clients to interact with the Gemini Code Assist API through this proxy. It also includes significant refactoring to improve code structure and maintainability.
  • Key Changes: The core changes involve introducing a new internal/openai package to define OpenAI API types and handle streaming transformations from Gemini's SSE format to OpenAI's SSE format. A new internal/transform package was added to manage the conversion logic between OpenAI and Gemini request/response structures, including complex mapping for messages, system instructions, and tool calls/responses. The internal/gemini client was enhanced with GenerateContent and StreamGenerateContent methods, incorporating automatic token refresh. The server now includes dedicated handlers for /v1/chat/completions (streaming and non-streaming) and /v1/models (providing a static list of supported models). The build system was migrated from just to mise, and documentation (README.md, CLAUDE.md) was updated to reflect these changes and new API support.
  • Refactoring: The HTTP client implementation was moved from internal/server to a new internal/http package for better modularity. The main server routes were reorganized into dedicated handler files (chatCompletionsHandler.go, modelsHandler.go, streamGenerateContentHandler.go) to separate concerns and improve readability.
  • New Features: Full support for the OpenAI /v1/chat/completions endpoint, including streaming and non-streaming requests, message role mapping (user, assistant, tool), system instructions, and tool call/response handling. A /v1/models endpoint was added to list available Gemini models in an OpenAI-compatible format.
  • Documentation and Build: Updated README.md and CLAUDE.md to reflect the new OpenAI API support, installation instructions, and migration from just to mise for task management. New mise.toml file replaces the old justfile.
  • Impact: These changes significantly expand the proxy's compatibility, allowing a wider range of AI clients (those built for OpenAI's API) to leverage the Gemini Code Assist backend. The refactoring improves the maintainability and extensibility of the codebase.

@dvcrn dvcrn merged commit ad55c5d into main Oct 8, 2025
1 check failed

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces significant and valuable changes, including a major refactor and the addition of an OpenAI-compatible endpoint. The new features are well-implemented with thorough testing.

I've identified a few key areas for improvement:

  • Critical Bug: There's a bug in the token refresh retry logic that will cause requests to fail after a token expires. I've provided a detailed comment and suggestion to fix this.
  • Code Duplication: The token refresh logic is duplicated in several places. Centralizing this would improve maintainability and prevent bugs like the one mentioned above.
  • Documentation: The README.md has been simplified, but in the process, some very useful documentation about the proxy's transformation logic was removed. I'd recommend restoring it or moving it to a separate file.

Overall, this is a great contribution. Addressing the critical issues will make the proxy much more robust.

Comment on lines +71 to +93
// Check for 401 Unauthorized and attempt a token refresh
if resp.StatusCode == http.StatusUnauthorized {
    resp.Body.Close() // Close the first response body

    if err := c.provider.RefreshToken(); err != nil {
        return nil, fmt.Errorf("failed to refresh token: %w", err)
    }

    // Reload credentials after refresh
    refreshedCreds, err := c.provider.GetCredentials()
    if err != nil {
        return nil, fmt.Errorf("failed to reload credentials after refresh: %w", err)
    }

    // Re-create the request with the new token
    req.Header.Set("Authorization", "Bearer "+refreshedCreds.AccessToken)

    // Retry the request
    resp, err = c.httpClient.Do(req)
    if err != nil {
        return nil, fmt.Errorf("request execution error after refresh: %w", err)
    }
}


critical

This token refresh and retry logic is duplicated in LoadCodeAssist, GenerateContent, and StreamGenerateContent. This code duplication is problematic for a few reasons:

  1. Maintainability: Any changes to this logic must be manually applied in multiple places, which is error-prone.
  2. Bugs: It has led to a critical bug in this implementation.

Critical Bug
The request body is not being reset before the retry. After the first c.httpClient.Do(req) call, req.Body has been read and is at EOF. The retry will therefore send an empty request body, causing the upstream API call to fail silently or with a validation error. The implementation in internal/project/discover.go correctly handles this by recreating the body reader.

Refactoring Suggestion
To fix both the bug and the duplication, this logic should be centralized. The most idiomatic Go approach would be to create a custom http.RoundTripper that wraps the default transport. This RoundTripper would handle the 401 response, token refresh, and retry transparently for all HTTP calls made through the client.

Immediate Fix
If a full refactor is out of scope, you should at least extract this logic into a private helper method on *Client and fix the body reset issue. The suggestion below fixes the bug for this function. Please apply a similar fix to GenerateContent and StreamGenerateContent (noting the request variable is httpReq in those functions).

    // Check for 401 Unauthorized and attempt a token refresh
    if resp.StatusCode == http.StatusUnauthorized {
        resp.Body.Close() // Close the first response body

        if err := c.provider.RefreshToken(); err != nil {
            return nil, fmt.Errorf("failed to refresh token: %w", err)
        }

        // Reload credentials after refresh
        refreshedCreds, err := c.provider.GetCredentials()
        if err != nil {
            return nil, fmt.Errorf("failed to reload credentials after refresh: %w", err)
        }

        // Re-create the request with the new token and reset the body
        req.Header.Set("Authorization", "Bearer "+refreshedCreds.AccessToken)
        // CRITICAL: The request body must be reset for the retry.
        req.Body = io.NopCloser(bytes.NewReader(bodyBytes))

        // Retry the request
        resp, err = c.httpClient.Do(req)
        if err != nil {
            return nil, fmt.Errorf("request execution error after refresh: %w", err)
        }
    }


// StreamGenerateContent performs a streaming request and sends each raw SSE line to the provided channel.
// It does not transform or interpret SSE content; lines are forwarded as-is.
// The caller owns the lifecycle of the 'out' channel; this function will not close it.


critical

This comment is dangerously incorrect and contradicts the implementation. The comment states that the caller owns the channel lifecycle, but the function's implementation at line 258 (defer close(out)) correctly closes the channel.

If a caller follows this comment and also tries to close the channel, it will cause a panic (double close). The function should be responsible for closing the channel to signal to the receiver that the stream is complete. The implementation is correct, but the comment must be fixed to prevent misuse.

Suggested change
// The caller owns the lifecycle of the 'out' channel; this function will not close it.
// This function is responsible for closing the 'out' channel upon completion.

- `gemini-1.5-flash` → `gemini-2.5-flash`
- `gemini-pro-latest` → `gemini-2.5-pro`

### 3. Request Body Transformation


medium

The refactoring of the README has removed several sections that were very helpful for understanding how the proxy works, specifically the Request Body Transformation, Response Transformation, and Usage Examples sections.

While the new README is more concise, this information is crucial for users and contributors to understand the proxy's core functionality. Without it, it's much harder to know how to structure requests or what to expect in responses.

Suggestion:
Please consider re-adding these sections. If the goal is to keep the main README.md lean, this detailed information could be moved to a separate file like ARCHITECTURE.md or DETAILS.md and linked from here.

"testing"
)

func TestCreateOpenAIStreamTransformer_BasicText(t *testing.T) {


medium

For consistency with other new tests in this PR (e.g., openai_to_gemini_test.go), it would be better to use the testify library for assertions (assert and require). This makes tests more readable and provides better failure messages.

Comment on lines +402 to +417
openAIResp := map[string]interface{}{
    "id":      "chatcmpl",
    "object":  "chat.completion",
    "created": created,
    "model":   req.Model,
    "choices": []map[string]interface{}{
        {
            "index": 0,
            "message": map[string]interface{}{
                "role":    "assistant",
                "content": contentText,
            },
            "finish_reason": "stop",
        },
    },
}


medium

Instead of manually constructing the response as a map[string]interface{}, it's better to use the openai.ChatCompletionResponse struct you've already defined. This provides type safety, makes the code cleaner, and reduces the chance of typos in JSON keys.

Additionally, the current implementation uses a hardcoded, non-unique id ("chatcmpl"). OpenAI API responses have unique IDs. You should generate a unique ID for each response, for example using the uuid package which is already a dependency in this project.

Note: You will need to add fmt and github.com/google/uuid to the imports for this file.

Suggested change
openAIResp := map[string]interface{}{
    "id":      "chatcmpl",
    "object":  "chat.completion",
    "created": created,
    "model":   req.Model,
    "choices": []map[string]interface{}{
        {
            "index": 0,
            "message": map[string]interface{}{
                "role":    "assistant",
                "content": contentText,
            },
            "finish_reason": "stop",
        },
    },
}
openAIResp := openai.ChatCompletionResponse{
    ID:      fmt.Sprintf("chatcmpl-%s", uuid.New().String()),
    Object:  "chat.completion",
    Created: created,
    Model:   req.Model,
    Choices: []openai.Choice{
        {
            Index: 0,
            Message: openai.Message{
                Role:    "assistant",
                Content: contentText,
            },
            FinishReason: "stop",
        },
    },
}


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +159 to +163
// Re-create the request with the new token
httpReq.Header.Set("Authorization", "Bearer "+refreshedCreds.AccessToken)

// Retry the request
resp, err = c.httpClient.Do(httpReq)


P1: Retry after 401 reuses consumed request body

When the first generateContent call returns 401, the code updates the Authorization header and retries the same httpReq. However the request was constructed with a bytes.NewReader and after the first Do the body has been consumed and closed, so the second Do will either send an empty payload or fail with http: Request Body errors. Any expired token will therefore cause non‑streaming completions to fail even after refresh. Rebuild the http.Request (or reset the body via GetBody) before retrying.


Comment on lines +235 to +239
// Re-create the request with the new token
httpReq.Header.Set("Authorization", "Bearer "+refreshedCreds.AccessToken)

// Retry the request
resp, err = c.httpClient.Do(httpReq)


P1: Streaming retry drops request payload after token refresh

The streaming path handles a 401 by refreshing the token and reusing the same httpReq. Because the request body was already read during the first Do, the retry sends an empty body and the upstream SSE request fails, meaning token refresh does not work for streaming calls. Similar to the non‑streaming method, the retry must rebuild the request or reset its body before issuing the second Do.

