feat: Add GitHub integration with agent_prompts and github_components #1637

Merged
merged 53 commits into main from move-github-components
May 28, 2025

Conversation

julian-risch
Member

@julian-risch julian-risch commented Apr 10, 2025

Related Issues

Proposed Changes:

  • Move github_components from experimental to a new integration
  • Move agent_prompts from experimental to a new integration
  • Add tools that wrap the new components

The idea is to enable users to run the example notebook (or a version with updated imports) after having installed this new integration.

How did you test it?

I added new unit tests and ran all usage examples successfully against a test repo.

I have tested it with the updated notebook. I'll update the cookbook PR once the integration is released: deepset-ai/haystack-cookbook#183

Notes for the reviewer

  • I suggest we rename github_token parameter to api_key for consistency with many other integrations.
  • While we could find a way to set up integration tests, I would rather leave them out of this PR.
  • GithubRepositoryViewer has a branch parameter in the run method, which could also be named ref to make it clearer that it can also be a tag or commit hash. I prefer keeping the parameter name branch.
  • Some components have github_token: Optional[Secret] = None, because they can work without any token while others use Secret.from_env_var("GITHUB_TOKEN"). I suggest we use Secret.from_env_var("GITHUB_TOKEN", strict=False) where we currently have None as the default.
  • The internal implementation of the components differs in how they use _get_headers or _get_request_headers or define headers inline. We could refactor that.
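The last two notes can be illustrated without any Haystack dependency. The sketch below is a stdlib-only stand-in, not the integration's actual code: `resolve_token` and `get_request_headers` are hypothetical helper names, and the header values are assumptions based on GitHub's public REST API conventions. It imitates a non-strict environment lookup (a missing `GITHUB_TOKEN` yields `None` rather than an error) and builds the headers in a single place instead of defining them inline per component.

```python
import os
from typing import Dict, Optional


def resolve_token(env_var: str = "GITHUB_TOKEN", strict: bool = False) -> Optional[str]:
    """Read a token from the environment.

    With strict=False a missing variable returns None; with strict=True it raises.
    Hypothetical helper that imitates the non-strict lookup proposed above.
    """
    token = os.environ.get(env_var)
    if token is None and strict:
        raise ValueError(f"Environment variable {env_var} is not set")
    return token


def get_request_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers once; add Authorization only when a token exists."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "example-client",  # placeholder value (assumption)
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

With this shape, a component that works without a token and one that requires it can share the same header logic and differ only in the `strict` flag.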

Checklist

@github-actions github-actions bot added the type:documentation label Apr 10, 2025
@julian-risch julian-risch marked this pull request as ready for review April 25, 2025 10:28
@julian-risch
Member Author

julian-risch commented May 27, 2025

@sjrl Finally ready for another review! We're using "data" instead of "init_parameters" in serialization now and all newly implemented tools expose outputs_to_string, inputs_from_state, and outputs_to_state as init parameters.
I tested this by running the updated notebook. The Agent forked the repo and committed to a branch.
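To illustrate the serialization change mentioned above, here is a minimal stand-alone sketch. `ExampleTool` is a hypothetical class, not one of the tools in this PR; only the top-level "type"/"data" layout and the three parameter names (`outputs_to_string`, `inputs_from_state`, `outputs_to_state`) come from the discussion, and the shape of their values is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ExampleTool:
    """Hypothetical tool used only to show the "type"/"data" layout."""

    name: str
    description: str
    outputs_to_string: Optional[Dict[str, Any]] = None
    inputs_from_state: Optional[Dict[str, str]] = None
    outputs_to_state: Optional[Dict[str, Dict[str, Any]]] = None

    def to_dict(self) -> Dict[str, Any]:
        # Init arguments are stored under "data" rather than "init_parameters".
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "data": {
                "name": self.name,
                "description": self.description,
                "outputs_to_string": self.outputs_to_string,
                "inputs_from_state": self.inputs_from_state,
                "outputs_to_state": self.outputs_to_state,
            },
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ExampleTool":
        # Rebuild the tool from the arguments stored under "data".
        return cls(**data["data"])
```

A round trip such as `ExampleTool.from_dict(tool.to_dict())` should yield an object equal to `tool`, which is the property a pipeline serialization test would typically check.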

What do you think about the parameter name github_token? In almost every other place, we use api_key, so for consistency with the many other integrations, we could rename github_token to api_key or leave it as is.

@julian-risch julian-risch requested a review from sjrl May 27, 2025 11:46
Comment on lines 18 to 19
:param name: Optional name for the tool.
:param description: Optional description.
Contributor

I do realize this is a bit confusing in our tools, but it seems that if we define a __init__ then these docstrings are put under the __init__ def. If there is no __init__ defined like in Tool then we put it in the class description.

Member Author

Ah, of course! Do you think we should add a usage example here then (in addition to moving the param docstrings to the init)? I realized that's missing too.

Contributor

Yeah if it's not too much to ask, a usage example would be great!
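The convention agreed on in this thread could look roughly like the sketch below. `ExampleViewerTool` is a hypothetical class rather than code from the PR: the class docstring carries the description and a usage example, while the `:param` entries sit under `__init__` because the class defines one.

```python
from typing import Optional


class ExampleViewerTool:
    """
    A hypothetical tool that views repository content.

    Usage example:

        tool = ExampleViewerTool(name="viewer")
        print(tool.name)
    """

    def __init__(self, *, name: Optional[str] = "example_viewer", description: Optional[str] = None):
        """
        Initialize the tool.

        :param name: Optional name for the tool.
        :param description: Optional description.
        """
        # Keyword-only arguments mirror the "enforce kwargs" choice made in the PR.
        self.name = name
        self.description = description or "Views repository content (placeholder text)."
```

Keeping the parameter documentation next to the signature it describes means a docs generator that renders `__init__` separately shows the parameters under the constructor rather than in the class summary.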

Contributor

@sjrl sjrl left a comment

Looks good! Only some minor comments left

@julian-risch julian-risch merged commit 3095079 into main May 28, 2025
11 checks passed
@julian-risch julian-risch deleted the move-github-components branch May 28, 2025 10:12
Amnah199 pushed a commit that referenced this pull request Jun 4, 2025
* add agent_prompts and github_components

* rename to github_haystack

* remove github-haystack

* renamed integration, added components dir

* add tests, pydoc, update pyproject.toml

* add workflow

* fmt

* fmt

* lint

* ruff

* fmt

* lint:all

* replace StrEnum for py 3.9+ compatibility

* move files

* fix tests

* lint

* fix pydoc and extend init files

* Add integration:github to labeler.yml

* unify how we set GITHUB_TOKEN in tests

* fix 3 usage examples. 3 remaining

* remove empty lines from prompts

* GitHub capitalization

* add license header

* all caps for prompts

* add GitHubFileEditorTool

* enforce kwargs instead of positional args

* use _get_request_headers and base_headers consistently

* lint

* rename GitHubRepositoryViewer to GitHubRepoViewer

* lint

* add pipeline serialization test

* extend pipeline to_dict test

* set default branch of repo viewer

* lint

* add four more tools

* lint

* rename prompts

* add tests for four more tools

* rename context prompt

* add outputs_to_state as param to GitHubFileEditorTool

* add outputs_to_state as param to GitHubRepoViewerTool

* set default outputs_to_state for GitHubRepoViewerTool

* extract serialize_handlers to utils; don't use mutable defaults

* replace init_parameters with data for serde in FileEditor, RepoViewer

* add outputs_to_state to GitHubIssueCommenterTool; replace init_parameters with data

* add outputs_to_state to GitHubIssueViewerTool; replace init_parameters with data

* add outputs_to_state to GitHubPRCreatorTool; replace init_parameters with data

* move param docstrings to init methods

* use generate_qualified_class_name instead of hardcoded name

* test with lowest supported version

* don't test http_client_kwargs for compatibility with Haystack 2.12
Amnah199 added a commit that referenced this pull request Jun 4, 2025
* feat(azure-ai-search): Allow full metadata field customization

So far, the `metadata_fields` init parameter only allowed a few custom
simple value types to be mapped (e.g., no nested metadata) and also
hardcoded the fields to be only `filterable` (but not `searchable`
or `facetable`, for instance).

For full flexibility, allow an Azure AI Search `SearchField` instance
to be passed as mapping instead of a Python type.

* PR comments

* feat: Add OpenRouter integration (#1723)

* Add openrouter integration

* Add tests for chat generator and support extra headers

* Add async tests

* Fix config files

* Add example

* Fixes

* Fix read me

* PR comments

* Small fixes

* Updated labeler and README

* Update docstrings

* Add user agent to Azure AI Search(#1743)

* docs: update changelog for integrations/azure_ai_search (#1745)

* Update changelog for integrations/azure_ai_search

* Update CHANGELOG.md

---------


Co-authored-by: Amna Mubashar <[email protected]>

* docs: ChatMessage examples (#1752)

* feat: Support Llama API as a Chat Generator (#1742)

* init: llama-api chat generator

* docs: update comments for LlamaChatGenerator

* feat: add keyword only *

* fix: replace streaming_callback type

* fix: add Toolset for tools

* fix: rm unused typing

* docs: add meta header

* docs: fix comments to llama api

* docs: add meta header

* docs: add meta header

* fix: rename LlamaChat to MetaLlamaChat

* docs: add meta header

* docs: align doc format

* add workflow for nightly tests

* add meta_llama to labeler

* add new integration to repo readme overview table

* replace .llama.chat. with .meta_llama.chat.

* fmt

* replace llama with meta_llama in pydocs

---------

Co-authored-by: Julian Risch <[email protected]>

* Update changelog for integrations/meta_llama (#1754)

Co-authored-by: julian-risch <[email protected]>

* chore(deps): bump fossas/fossa-action from 1.6.0 to 1.7.0 (#1750)

Bumps [fossas/fossa-action](https://github.com/fossas/fossa-action) from 1.6.0 to 1.7.0.
- [Release notes](https://github.com/fossas/fossa-action/releases)
- [Commits](fossas/fossa-action@v1.6.0...v1.7.0)

---
updated-dependencies:
- dependency-name: fossas/fossa-action
  dependency-version: 1.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update how skipping works (#1756)

* test: Ollama - make test_run_with_response_format more robust (#1757)

* feat: adapt `OllamaGenerator` metadata to OpenAI format (#1753)

* feat: adapt Ollama metadata to OpenAI format in `OllamaGenerator`

* Add: `OllamaGenerator` support in Langfuse

* Ran Linters

* Revert "Add: `OllamaGenerator` support in Langfuse"

This reverts commit 1f399e0.

---------

Co-authored-by: Sebastian Husch Lee <[email protected]>

* Add: `OllamaGenerator` support in Langfuse (#1759)

* Update changelog for integrations/ollama (#1761)

Co-authored-by: sjrl <[email protected]>

* Update changelog for integrations/langfuse (#1762)

Co-authored-by: sjrl <[email protected]>

* docs: update changelog for integrations/openrouter (#1763)

* Update changelog for integrations/openrouter


---------

Co-authored-by: Amnah199 <[email protected]>
Co-authored-by: Amna Mubashar <[email protected]>

* chore: fix README for meta-llama (#1766)

* chore(deps): bump aws-actions/configure-aws-credentials (#1751)

Bumps [aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials) from 4.2.0 to 4.2.1.
- [Release notes](https://github.com/aws-actions/configure-aws-credentials/releases)
- [Changelog](https://github.com/aws-actions/configure-aws-credentials/blob/main/CHANGELOG.md)
- [Commits](aws-actions/configure-aws-credentials@f24d719...b475783)

---
updated-dependencies:
- dependency-name: aws-actions/configure-aws-credentials
  dependency-version: 4.2.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* ci: Bedrock - improve worfklow; skip tests from CI (#1773)

* feat: OllamaChatGenerator - add Toolset support (#1765)

* Add Toolset support to OllamaChatGenerator

* Lint

* Lambdas are not serializable

* Lint

* Generate tool call id if not available

* Lint

* Revert back to not using ToolCall id

* Lint

* Update changelog for integrations/ollama (#1775)

Co-authored-by: vblagoje <[email protected]>

* feat: MCPTool and MCPToolset async resource management improvements (#1758)

* Add MCPClientSessionManager to connect/close mcp clients

* Update and refactor mcp tests

* More descriptive connection error raising

* Proper test cleanup

* Testing CI windows

* linting

* Improve connection error raise

* PR feedback

* Proper naming, and more precise cleanup sequence

---------

Co-authored-by: Michele Pangrazzi <[email protected]>

* test: add service_tier to test_convert_anthropic_chunk_to_streaming_chunk (#1778)

* fix: Bring Mistral integration up to date with changes made to OpenAIChatGenerator and OpenAI Embedders (#1774)

* Bringing Mistral up to date

* Fix Mistral Embedders to be deserializable

* Fix lint

* Fix lint

* Bump minimum haystack version

* Update changelog for integrations/mistral (#1781)

Co-authored-by: sjrl <[email protected]>

* feat: Add `to_dict` to `STACKITDocumentEmbedder` and `STACKITTextEmbedder` and more init parameters from underlying OpenAI classes (#1779)

* Add to_dicts and more tests

* Bumpy haystack version

* Add changes to chat generator as well

* Update changelog for integrations/stackit (#1782)

Co-authored-by: sjrl <[email protected]>

* feat: add run_async for CohereChatGenerator (#1689)

* CohereChatGenerator async support

* Tests and linter fixes

* fix

* refinements

* refactor + tests reorgani

* rename test

* remove markers

* reformat

* fix

* minor fixes

* Trigger CI

---------

Co-authored-by: anakin87 <[email protected]>

* Update changelog for integrations/cohere (#1784)

Co-authored-by: anakin87 <[email protected]>

* docs: update changelog for integrations/google_ai (#1812)

* Update changelog for integrations/google_ai

* Update CHANGELOG.md

---------

Co-authored-by: wochinge <[email protected]>
Co-authored-by: Stefano Fiorucci <[email protected]>

* fix: Fix exposing Qdrant api-key in `metadata` field when running `to_dict` (#1813)

* Add to_dict test

* Add more type hints

* More type hints

* Add fix for exposing api key in metadata when running to_dict

* Add unit test

* PR comments

* Update changelog for integrations/qdrant (#1814)

Co-authored-by: sjrl <[email protected]>

* ci: check lowest direct dependencies (#1788)

* ci: check lowest direct dependencies

* try single quotes

* debug

* debugging

* try chroma

* no bedrock

* retry

* explicit option

* don't run tests

* debug 1

* try output file

* more

* no deepeval

---------

Co-authored-by: David S. Batista <[email protected]>

* build: add pins for Anthropic (#1811)

* build: add pins for Anthropic

* rm file incorrectly added

* Update changelog for integrations/anthropic (#1815)

Co-authored-by: anakin87 <[email protected]>

* build: add pins for Vertex (#1810)

* Update changelog for integrations/google_vertex (#1816)

Co-authored-by: anakin87 <[email protected]>

* build: add pins for Cohere (#1817)

* Update changelog for integrations/cohere (#1829)

Co-authored-by: anakin87 <[email protected]>

* build: remove pin for Deepeval (#1826)

* Update changelog for integrations/deepeval (#1830)

Co-authored-by: anakin87 <[email protected]>

* feat: Add streamable-http transport MCP support (#1777)

* Add streamable-http transport

* Improve error message for tool invocation

* Add streamable MCPTool example, update examples

* Improve examples

* Add unit tests

* Update integrations/mcp/examples/mcp_client.py

Co-authored-by: Amna Mubashar <[email protected]>

* initialize vars outside try block

* Small fix

* Fix linting

---------

Co-authored-by: Amna Mubashar <[email protected]>

* Update changelog for integrations/mcp (#1831)

Co-authored-by: vblagoje <[email protected]>

* build: pining lower versions of haystack and `aiohttp` for `ElasticSearch` (#1827)

* pining lower versions

* adding missing comma

* pinning to >=2.4.0

* pinning to >=2.3.0

* pinning aiohttp to >=3.0.0

* pinning aiohttp to >=2.0.0

* pinning aiohttp to >=2.5.0

* pinning aiohttp to >=2.6.0

* pinning aiohttp to >=3.0.0

* pinning aiohttp to >=3.1.0

* pinning aiohttp to >=3.2.0

* pinning aiohttp to >=3.3.0

* pinning aiohttp to >=3.10.0

* pinning aiohttp to >=3.9.0

* pinning aiohttp to >=3.8.0

* reverting back aiohttp to 3.9.0

* Update changelog for integrations/elasticsearch (#1834)

Co-authored-by: davidsbatista <[email protected]>

* build: add Jina pins (#1836)

* Update changelog for integrations/jina (#1838)

Co-authored-by: anakin87 <[email protected]>

* build: add Langfuse pins (#1837)

* Update changelog for integrations/langfuse (#1839)

Co-authored-by: anakin87 <[email protected]>

* build: pin version for `pymongo` and `haystack` in MongoDB integration (#1832)

* pinning to older version o haystack and mongodb

* pinining haystack and pymongo

* wip

* fixing format

* adding missing CI job

* making sure lowest version of pymongo has the async client

* making sure lowest version of pymongo has the async client

* versioning

* haysack 2.9

* haysack 2.10

* haysack 2.11

* Remove failing test. No need to have it here since it's already tested in haystack main. (#1842)

* ci: Missing labels for stackit and anthropic (#1844)

* Missing labels for stackit and anthropic

* PR comments

* build: app pins for MCP (#1845)

* Update changelog for integrations/mongodb_atlas (#1840)

Co-authored-by: davidsbatista <[email protected]>
Co-authored-by: David S. Batista <[email protected]>

* docs: update changelog for integrations/mcp (#1848)

* Update changelog for integrations/mcp

* Update CHANGELOG.md

---------

Co-authored-by: anakin87 <[email protected]>
Co-authored-by: Stefano Fiorucci <[email protected]>

* build: add pins for Pgvector (#1849)

* Update changelog for integrations/pgvector (#1850)

Co-authored-by: anakin87 <[email protected]>

* build: add pins for Optimum (#1847)

* build: add pins for Optimum

* try with python 3.13

* don't call HF on unit tests

* Update changelog for integrations/optimum (#1852)

Co-authored-by: anakin87 <[email protected]>

* build: add pins for Qdrant (#1853)

* build: add pins for Pinecone (#1851)

Co-authored-by: David S. Batista <[email protected]>

* Update changelog for integrations/pinecone (#1855)

Co-authored-by: anakin87 <[email protected]>

* docs: update changelog for integrations/qdrant (#1856)

* Update changelog for integrations/qdrant

* Update CHANGELOG.md

---------

Co-authored-by: anakin87 <[email protected]>
Co-authored-by: Stefano Fiorucci <[email protected]>

* chore: review license compliance workflow (#1843)

* chore: review license compliance workflow

* refactor

* deepeval

* build: add pins for Ragas (#1854)

* feat: Add GitHub integration with components, tools, and prompts (#1637)

* build: pinning `unstructured` to lowest working versions (#1841)

* finding lowest working versions

* adding missing CI job

* adding missing limitation

* feat: AnthropicChatGenerator - add Toolset support (#1787)

* AnthropicChatGenerator - add Toolset support

* Use new serialization method for tools

* Update haystack dep to 2.13.1 which includes Toolset

* Small update

* buid: add pins for Snowflake + small refactoring (#1860)

* Update changelog for integrations/snowflake (#1862)

Co-authored-by: anakin87 <[email protected]>

* Update changelog for integrations/ragas (#1857)

Co-authored-by: anakin87 <[email protected]>

* Update changelog for integrations/unstructured (#1861)

Co-authored-by: davidsbatista <[email protected]>

* build: add pins for Nvidia (#1846)

* Update changelog for integrations/nvidia (#1863)

Co-authored-by: anakin87 <[email protected]>

* build: add pins for Google AI (#1828)

* docs: update changelog for integrations/google_ai (#1864)

* Update changelog for integrations/google_ai

* Update CHANGELOG.md

---------

Co-authored-by: anakin87 <[email protected]>
Co-authored-by: Stefano Fiorucci <[email protected]>

* Update changelog for integrations/anthropic (#1865)

Co-authored-by: vblagoje <[email protected]>

* docs: Update changelog for integrations/github (#1858)

Co-authored-by: julian-risch <[email protected]>

* feat: adding an `HybridRetriever` as a `Supercomponent` having `OpenSearch` as the document store (#1701)

* adding tests

* linting and typing

* adding env variable

* env variable

* extending docstring

* removing generation part

* updating tests

* adding a run test with mocked sentence_transformers

* fixing format

* refactor: use `component_to_dict` in OpenSearchHybridRetriever (#1866)

* Update changelog for integrations/opensearch (#1867)

Co-authored-by: davidsbatista <[email protected]>

* oshr-docs (#1868)

* refactor: OpenSearchHybridRetriever use `deserialize_chatgenerator_inplace` (#1870)

* test to use deserialize_chatgenerator_inplace

* removing unused imports

* using deserialize_chatgenerator_inplace

* Update integrations/opensearch/src/haystack_integrations/components/retrievers/opensearch/open_search_hybrid_retriever.py

* Update changelog for integrations/opensearch (#1874)

Co-authored-by: davidsbatista <[email protected]>

* feat: add run_async support for CohereTextEmbedder (#1873)

* feat: add run_async support for CohereTextEmbedder

* fix: review comments

* feat: Add Google GenAI GoogleGenAIChatGenerator (#1875)

* Initial work

* Remove utils

* Add async support

* Async test issue

* Simplify async tests

* Linting

* Improve comment

* Linting

* Improve pyproject.toml

* Add new google genai integration to workflow

* Add labeler

* Add pydoc

* Pin deps

* Pin google-genai dep

* Update integrations/google_genai/src/haystack_integrations/components/generators/google_genai/chat/chat_generator.py

Co-authored-by: Sebastian Husch Lee <[email protected]>

* Update integrations/google_genai/src/haystack_integrations/components/generators/google_genai/chat/chat_generator.py

Co-authored-by: Sebastian Husch Lee <[email protected]>

* PR feedback

* Add system message comment

* Leave only minimal working examples in README

* Update integrations/google_genai/src/haystack_integrations/components/generators/google_genai/chat/chat_generator.py

Co-authored-by: Julian Risch <[email protected]>

* Update integrations/google_genai/src/haystack_integrations/components/generators/google_genai/chat/chat_generator.py

Co-authored-by: Julian Risch <[email protected]>

* Linting

---------

Co-authored-by: Sebastian Husch Lee <[email protected]>
Co-authored-by: Julian Risch <[email protected]>

* Update changelog for integrations/google_genai (#1886)

Co-authored-by: vblagoje <[email protected]>

* feat: Use Langfuse local to_openai_dict_format function to serialize messages (#1885)

* Use Langfuse local to_openai_dict_format function to serialize messages

* Linting

* PR feedback

* Add detailed tracing for GoogleGenAIChatGenerator (#1887)

* docs: update changelog for integrations/langfuse (#1888)

* Update changelog for integrations/langfuse

* Update CHANGELOG.md

---------

Co-authored-by: vblagoje <[email protected]>
Co-authored-by: Vladimir Blagojevic <[email protected]>

* try reenabling pinecone tests (#1871)

* PR comments

* Small updates

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Denis Washington <[email protected]>
Co-authored-by: Denis Washington <[email protected]>
Labels
integration:github, topic:CI, type:documentation
2 participants