feat: Add GitHub integration with agent_prompts and github_components #1637
base: main
@@ -0,0 +1,21 @@
# github-haystack
I think it would be good to add usage examples here in the README, or to link to the tutorial/Google Colab that shows how to use the integration.
Another relevant detail is that these prompts were optimized using Anthropic models. That could be useful for users to know.
What I imagine is a Google Colab in the cookbook and some more examples on an integration page. We currently don't fill out the READMEs; for example, see: https://pypi.org/project/opensearch-haystack/
It might be a good idea to change that and use a copy of the integrations page. I don't see a good reason to keep the README empty, but I would prefer a consistent solution. I'll talk to Bilge.
integrations/github/src/haystack_integrations/components/connectors/github/pr_creator.py
Outdated
integrations/github/src/haystack_integrations/components/connectors/github/repo_viewer.py
@julian-risch Maybe a general comment on the structure here: I see that the prompts aren't used within the library, and I understand they will be used in a future tutorial/Colab. I wonder whether it would be more helpful to pre-assemble the tools within the repo, so users could import them and pass them straight to an Agent. What do you think?
integrations/github/src/haystack_integrations/components/connectors/github/file_editor.py
integrations/github/src/haystack_integrations/components/connectors/github/file_editor.py
Outdated
integrations/github/src/haystack_integrations/components/connectors/github/issue_viewer.py
@julian-risch Overall this looks really good! I mostly have minor comments and only one larger conceptual one, about providing users with Tools directly instead of requiring them to compose the Tools themselves. I didn't comb through every line since there is a lot, but it's well tested, so it's good to go from my perspective! We can always make quick updates if issues arise, depending on usage.
class GitHubFileEditorTool(ComponentTool):
    """
    A Haystack tool for editing files in GitHub repositories.
    """
Oh interesting, I wasn't thinking of inheriting from ComponentTool but of doing something like
@component
class GitHubFileEditor:
    ...
GitHubFileEditorTool = ComponentTool(GitHubFileEditor(), ...)
so that people could import the pre-made GitHubFileEditorTool, but I can see how this version would be more customizable.
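The trade-off between the two approaches can be sketched roughly as follows. This is only an illustration: `FileEditorStub`, `ToolStub`, and `FileEditorToolStub` are hypothetical stand-ins written for this sketch, not the actual `GitHubFileEditor` or `ComponentTool` classes, and the names and descriptions are made up.

```python
from typing import Any, Dict


class FileEditorStub:
    """Hypothetical stand-in for the GitHubFileEditor component."""

    def __init__(self, repo: str = "owner/repo", branch: str = "main") -> None:
        self.repo = repo
        self.branch = branch

    def run(self, command: str) -> Dict[str, Any]:
        # A real component would call the GitHub API here.
        return {"result": f"{command} on {self.repo}@{self.branch}"}


class ToolStub:
    """Hypothetical stand-in for ComponentTool: wraps a component's run method."""

    def __init__(self, component: Any, name: str, description: str) -> None:
        self.component = component
        self.name = name
        self.description = description

    def invoke(self, **kwargs: Any) -> Dict[str, Any]:
        return self.component.run(**kwargs)


# Option A: a pre-assembled, module-level instance that users import as-is.
file_editor_tool = ToolStub(FileEditorStub(), name="file_editor", description="Edit files")


# Option B: a subclass whose constructor exposes the component's settings.
class FileEditorToolStub(ToolStub):
    def __init__(self, repo: str, branch: str = "main") -> None:
        super().__init__(
            FileEditorStub(repo=repo, branch=branch),
            name="file_editor",
            description="Edit files",
        )
```

Option A is the shortest path to a working tool, while Option B lets each user pick the repository and branch at construction time, which is the customizability mentioned in the comment.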
    def to_dict(self) -> Dict[str, Any]:
        """
        Serializes the tool to a dictionary.

        :returns:
            Dictionary with serialized data.
        """
        return default_to_dict(
            self,
            name=self.name,
            description=self.description,
            parameters=self.parameters,
            github_token=self.github_token.to_dict(),
            repo=self.repo,
            branch=self.branch,
            raise_on_failure=self.raise_on_failure,
        )
If we go this route of inheriting from ComponentTool, I don't think we'll be able to use this to_dict method. At least, we use different serde methods for Tools, with this dict structure:
{"type": generate_qualified_class_name(type(self)), "data": serialized}
so we'd probably need to follow that as well, right?
When deserializing this in a pipeline, I believe we will eventually call deserialize_tools_or_toolset_inplace. Do these methods work in that case?
@julian-risch Okay, interesting that this works, as the pipeline serialization test you added shows.
For consistency, I still wonder whether we should rename init_parameters to data in the serialized dict, since that appears to be the pattern we use in our other tools and expect in deserialize_tools_or_toolset_inplace.
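The two dictionary layouts discussed in this thread can be compared with a small stdlib-only sketch. `EditorToolSketch` and `qualified_name` are hypothetical names used only for illustration; the sketch does not call Haystack, and the lenient `from_dict` that accepts either key is an assumption of this example, not a description of how `deserialize_tools_or_toolset_inplace` behaves.

```python
from typing import Any, Dict


def qualified_name(obj: Any) -> str:
    """Return 'module.ClassName' for an object (hypothetical helper)."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__name__}"


class EditorToolSketch:
    """Hypothetical tool used only to compare the two serialized layouts."""

    def __init__(self, repo: str, branch: str = "main") -> None:
        self.repo = repo
        self.branch = branch

    def _params(self) -> Dict[str, Any]:
        return {"repo": self.repo, "branch": self.branch}

    def to_dict_component_style(self) -> Dict[str, Any]:
        # Layout with the parameters nested under "init_parameters".
        return {"type": qualified_name(self), "init_parameters": self._params()}

    def to_dict_tool_style(self) -> Dict[str, Any]:
        # Layout quoted in the review comment: parameters nested under "data".
        return {"type": qualified_name(self), "data": self._params()}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "EditorToolSketch":
        # Assumption of this sketch: accept either key when deserializing.
        params = data.get("data", data.get("init_parameters", {}))
        return cls(**params)
```

Both layouts carry the same information; the question raised in the review is only which key the shared deserialization code expects.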
@sjrl I added a test called https://colab.research.google.com/drive/1ktlwQ-CDLGDs2uZXvzgG8XspfjPidYqZ?usp=sharing
@sjrl If GitHubFileEditorTool looks good to you, I will add tools for all other components and probably update the directory structure a bit.
@julian-risch This is related to the change we made to tools to have a new variable called ...
outputs_to_state={
    # "message": {"source": "documents", "handler": message_handler}, TODO
    "documents": {"source": "documents"},
},
outputs_to_string={"source": "documents", "handler": message_handler}
...
Related Issues
Related to Add GitHub integration with agent_prompts and github_components #1650
Related to Move Agent from experimental to main haystack-experimental#250
@sjrl brought up that one way to keep example files from the experimental Agent around is an integration; see chore: Remove Agent after Haystack 2.12 release haystack-experimental#263 (comment)
Proposed Changes:
The idea is to enable users to run the example notebook (or a version with updated imports) after installing this new integration.
How did you test it?
I added new unit tests and ran all usage examples successfully against a test repo.
I haven't tested it with the notebook yet, which we would need to update first (tracked by deepset-ai/haystack-cookbook#183).
Notes for the reviewer
github_token parameter to api_key for consistency with many other integrations.
branch parameter in the run method, which could also be named ref to make more clear it can also be a tag or commit hash. I prefer keeping the parameter name branch.
Some components have github_token: Optional[Secret] = None, because they can work without any token while others use Secret.from_env_var("GITHUB_TOKEN"). I suggest we use Secret.from_env_var("GITHUB_TOKEN", strict=False) where we currently have None as the default.
The internal implementation of the components differs in how they use _get_headers or _get_request_headers or define headers inline. We could refactor that.
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.
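The token default and the header refactoring raised in the notes for the reviewer can be illustrated with a minimal, stdlib-only sketch. `resolve_token` and `build_headers` are hypothetical helpers invented for this example; they only mimic the idea of a strict versus non-strict environment lookup and a single shared header builder, and they are not the implementation of `Secret` or of the components in this PR.

```python
import os
from typing import Dict, Optional


def resolve_token(env_var: str = "GITHUB_TOKEN", strict: bool = True) -> Optional[str]:
    """Read a token from the environment (hypothetical helper).

    With strict=True a missing variable is an error; with strict=False the
    function returns None so that unauthenticated use stays possible.
    """
    value = os.environ.get(env_var)
    if value is None and strict:
        raise ValueError(f"Environment variable {env_var} is not set")
    return value


def build_headers(token: Optional[str]) -> Dict[str, str]:
    """Build request headers in one place (hypothetical shared helper)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Only authenticate when a token is actually available.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

A non-strict lookup gives components that work without a token the same default as the others, and a single header helper would replace the three variants mentioned in the notes.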