feat: Add Ollama integration support (fixes #64) #79
Conversation
- Add Ollama provider support to LLM client with proper error handling
- Update CLI to handle Ollama provider in create and curate commands
- Fix Ollama server connection and generation handling
- Add comprehensive Ollama integration tests
- Update main config to use Ollama as default provider
- Improve error messages and server availability checks

This enables local LLM usage with Ollama for synthetic data generation, providing an alternative to API-based providers for privacy and cost savings.
- Add get_ollama_config() function to config utilities
- Add comprehensive unit tests for Ollama provider functionality
- Test Ollama initialization, chat completion, message formatting, and batch processing
- Improve provider validation in config utilities

These changes complete the Ollama integration test coverage and utilities.
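For context, here is a minimal sketch of what a `get_ollama_config()` helper along these lines could look like. The key names and defaults below (`api_base`, `model`, retry settings) are assumptions for illustration, not the values from the actual PR diff.

```python
# Hypothetical sketch only: key names and defaults are assumptions,
# not taken from the actual synthetic_data_kit/utils/config.py changes.
from typing import Any, Dict

DEFAULT_OLLAMA_CONFIG: Dict[str, Any] = {
    "api_base": "http://localhost:11434",  # Ollama's default local port
    "model": "llama3.2:latest",
    "max_retries": 3,
    "retry_delay": 1.0,
}


def get_ollama_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the 'ollama' section of a loaded config dict, merged over defaults."""
    merged = dict(DEFAULT_OLLAMA_CONFIG)
    merged.update(config.get("ollama", {}))
    return merged
```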
Hi @aarunbhardwaj! Thank you for your pull request and welcome to our community.

Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly. If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Pull Request Overview
This PR adds comprehensive Ollama integration support to synthetic-data-kit, enabling users to run LLM operations entirely locally without API keys or external service dependencies. The implementation provides full feature parity with existing vLLM and api-endpoint providers.
Key Changes:
- Added Ollama as a third provider option alongside vLLM and api-endpoint
- Implemented Ollama-specific message formatting that converts OpenAI-style messages to Ollama's prompt format (see the sketch after this list)
- Added server availability checks and improved error handling for all provider types
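As an illustration of the message-formatting point above, here is a minimal sketch of flattening OpenAI-style chat messages into a single prompt string for Ollama. The role labels and separator are assumptions, not the PR's exact format.

```python
# Illustrative only: one way to flatten OpenAI-style messages into a single
# prompt string. The role labels and layout are assumptions, not the PR's code.
from typing import Dict, List


def format_messages_for_ollama(messages: List[Dict[str, str]]) -> str:
    parts = []
    for msg in messages:
        role = msg.get("role", "user")
        content = msg.get("content", "")
        if role == "system":
            parts.append(f"System: {content}")
        elif role == "assistant":
            parts.append(f"Assistant: {content}")
        else:
            parts.append(f"User: {content}")
    # End with an assistant cue so the model continues the conversation.
    parts.append("Assistant:")
    return "\n\n".join(parts)
```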
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Summary per file:
| File | Description |
|---|---|
| synthetic_data_kit/models/llm_client.py | Core Ollama integration with chat/batch completion methods and message formatting |
| synthetic_data_kit/utils/config.py | Added Ollama configuration function with default settings |
| synthetic_data_kit/cli.py | Extended CLI commands to support Ollama provider with server checks (see the sketch after this table) |
| synthetic_data_kit/config.yaml | Added Ollama configuration section with default values |
| tests/unit/test_llm_client.py | Unit tests for Ollama client initialization and operations |
| tests/integration/test_ollama_integration.py | Integration tests covering Ollama functionality and edge cases |
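For the server check mentioned for `cli.py`, here is a rough sketch of a pre-flight availability probe. The use of Ollama's `/api/tags` endpoint and the timeout value are assumptions about how such a check could be written, not the PR's actual code.

```python
# Illustrative sketch (not the PR's actual code): probe the local Ollama
# server before running a command. Endpoint choice and timeout are assumptions.
import requests


def check_ollama_server(api_base: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama server responds at api_base."""
    try:
        resp = requests.get(f"{api_base}/api/tags", timeout=2)
        return resp.status_code == 200
    except requests.exceptions.RequestException:
        return False
```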
```python
except (requests.exceptions.RequestException, KeyError, ValueError) as e:
    if verbose:
        logger.error(f"Ollama API error (attempt {attempt+1}/{self.max_retries}): {e}")
    if attempt == self.max_retries - 1:
        raise Exception(f"Failed to get Ollama completion after {self.max_retries} attempts: {str(e)}")
    time.sleep(self.retry_delay * (attempt + 1))
except Exception as e:
    if verbose:
        logger.error(f"Ollama API error (attempt {attempt+1}/{self.max_retries}): {e}")
    if attempt == self.max_retries - 1:
        raise Exception(f"Failed to get Ollama completion after {self.max_retries} attempts: {str(e)}")
    time.sleep(self.retry_delay * (attempt + 1))
```
Copilot AI · Oct 25, 2025
The exception handling has duplicated error logging and retry logic. The second 'except Exception' block (lines 443-448) catches exceptions already handled by the first block (lines 437-442), making the second block unreachable for RequestException, KeyError, and ValueError. Either remove the redundant second block or consolidate both into a single 'except Exception' handler.
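As a sketch of the consolidation suggested above, a single `except Exception` handler also covers `RequestException`, `KeyError`, and `ValueError`, since they are all `Exception` subclasses. The names `call`, `max_retries`, and `retry_delay` below are stand-ins for illustration, not the PR's actual code.

```python
# Sketch of the consolidated retry loop; `call` is any zero-arg callable that
# performs the Ollama request. Parameter names are stand-ins for illustration.
import logging
import time

logger = logging.getLogger(__name__)


def completion_with_retry(call, max_retries=3, retry_delay=1.0, verbose=False):
    for attempt in range(max_retries):
        try:
            return call()
        except Exception as e:  # one handler replaces both duplicated blocks
            if verbose:
                logger.error(f"Ollama API error (attempt {attempt+1}/{max_retries}): {e}")
            if attempt == max_retries - 1:
                raise Exception(
                    f"Failed to get Ollama completion after {max_retries} attempts: {e}"
                ) from e
            time.sleep(retry_delay * (attempt + 1))
```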
```python
except requests.exceptions.RequestException:
    console.print(f"❌ Error: VLLM server not available at {api_base}", style="red")
    console.print("Please start the VLLM server with:", style="yellow")
    console.print(f"vllm serve {model}", style="bold blue")
    return 1
```
Copilot AI · Oct 25, 2025
This new exception handler for RequestException in the vLLM section will never execute because it appears after the return statement at line 645. The existing try-except block already handles this case earlier in the code, making this block unreachable dead code. Remove lines 646-650.
Suggested change (remove these lines):

```python
except requests.exceptions.RequestException:
    console.print(f"❌ Error: VLLM server not available at {api_base}", style="red")
    console.print("Please start the VLLM server with:", style="yellow")
    console.print(f"vllm serve {model}", style="bold blue")
    return 1
```
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
Summary
This PR adds comprehensive Ollama integration support to synthetic-data-kit, addressing issue #64.
Fixes
Closes #64 - Add Ollama integration support
Changes Made
- Added Ollama provider support to `LLMClient` with proper error handling
- Updated CLI commands (`create`, `curate`) to handle the Ollama provider

Key Features
- Works with `ollama serve` and any supported model

Technical Implementation
- Added an `ollama` provider option alongside the existing `vllm` and `api-endpoint` providers

Testing
- Integration tests added in `tests/integration/test_ollama_integration.py`
- Tested with `llama3.2:latest`

Usage Example
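A heavily hedged sketch of how the Ollama provider might be exercised from Python, assuming an Ollama server is running locally. The constructor arguments and method name below are assumptions based on the PR description (`LLMClient` with an `ollama` provider and chat-completion support), not the library's confirmed API.

```python
# Hypothetical usage sketch; the LLMClient constructor arguments and the
# chat_completion() call shown here are assumptions, not the confirmed API.
from synthetic_data_kit.models.llm_client import LLMClient

# Assumes `ollama serve` is running locally with llama3.2:latest pulled.
client = LLMClient(provider="ollama")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Generate three question-answer pairs about photosynthesis."},
]
print(client.chat_completion(messages))
```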