[Discussion] The selection of Agentic/Taskflow frame #183
Comments
Pydantic-AI could have been an excellent option for us. However, since it's still in beta, adopting it would require updating our codebase, which would increase our workload. Therefore, I suggest going with Agno. That said, Agno's manually controlled workflow has its own pros and cons, so we will need to make some compromises with that approach as well. |
@imbajin sir, I went through all the repositories for the agentic frameworks you mentioned above (sorry, it took quite a while). I believe we should be looking at CrewAI or Agno only. |
@chiruu12 It would be best if we could do this and not be bound by the unnecessary design of a framework; it will also be easier to switch and iterate over time (in fact the |
@imbajin yes sir, and since most of these frameworks are open source and already lightweight, it is easy to go through the whole code, though it will take some time. We can reuse their code in the places where we need it. I already have an understanding of the Agno framework; once we decide which framework we will be using, I will go through its whole codebase. For the time being I will try to understand the code for hugegraph-ai: I went through the docs and will be trying it out. |
Hey @imbajin, for finding the answer to a user's question, I used LangGraph. It was really lightweight, fast, and simple to implement. There were agents -
It was a fixed agentic system that worked on graphs of one kind only. I have a question regarding what the requirement actually is: do we need to create an interface where a user can create their own agentic system for their graph, just using drag-and-drop or prompts (without code)? Here is the project :- |
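For reference, a minimal sketch of such a fixed LangGraph pipeline might look like the following; the state fields, node logic, and the `llm()`/`run_gremlin()` stubs are hypothetical placeholders, only the `langgraph` imports are the library's real API:

```python
# Minimal sketch of a fixed LangGraph pipeline; llm() and run_gremlin() are
# placeholder stubs, not real hugegraph-ai or LangGraph APIs.
from typing import TypedDict
from langgraph.graph import StateGraph, END


class QAState(TypedDict):
    question: str
    gremlin: str
    answer: str


def llm(prompt: str) -> str:
    # Placeholder for a real chat-completion call.
    return "g.V().has('person', 'name', 'Alice').out('knows').values('name')"


def run_gremlin(query: str) -> list:
    # Placeholder for a HugeGraph Gremlin call.
    return ["Bob"]


def write_query(state: QAState) -> dict:
    # One agent writes the graph query from the user's question.
    return {"gremlin": llm(f"Write a Gremlin query for: {state['question']}")}


def run_query(state: QAState) -> dict:
    # Another agent executes the query and formats the answer.
    return {"answer": str(run_gremlin(state["gremlin"]))}


builder = StateGraph(QAState)
builder.add_node("write_query", write_query)
builder.add_node("run_query", run_query)
builder.set_entry_point("write_query")
builder.add_edge("write_query", "run_query")
builder.add_edge("run_query", END)
graph = builder.compile()

# result = graph.invoke({"question": "Who does Alice know?", "gremlin": "", "answer": ""})
```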
I believe for our use case we will also have to train a model to get us the text-to-Cypher. Since we won't know what kind of input the user is going to give us, we will have to fine-tune a model for that, as the agent might need to write the Cypher query on its own.
|
@chiruu12 yes, that's a good idea. If we need to create a system where users can create different agents, each agent will have its own job, meaning its own Cypher query to extract information from the graph. Some agents might do other work, like an information validator. |
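A plain-Python sketch of that idea (all names hypothetical, no framework involved): each agent owns a query template, and a validator agent checks the combined results.

```python
# Sketch of "each agent owns its own graph query" plus a validator agent.
# All names are hypothetical; a real version would call a HugeGraph client.
from dataclasses import dataclass
from typing import Callable


@dataclass
class QueryAgent:
    name: str
    query_template: str  # e.g. a Gremlin/Cypher template with {placeholders}

    def run(self, execute: Callable[[str], list], **params) -> list:
        return execute(self.query_template.format(**params))


@dataclass
class ValidatorAgent:
    name: str

    def run(self, results: dict[str, list]) -> dict[str, list]:
        # Drop empty result sets; a real validator could check schema/consistency.
        return {agent: rows for agent, rows in results.items() if rows}


def fake_execute(query: str) -> list:
    # Stand-in for actual graph query execution.
    return [f"row-from: {query}"]


agents = [
    QueryAgent("neighbors", "g.V().has('name','{name}').out().values('name')"),
    QueryAgent("properties", "g.V().has('name','{name}').valueMap()"),
]
results = {a.name: a.run(fake_execute, name="Alice") for a in agents}
results = ValidatorAgent("validator").run(results)
```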
@Aryankb I think most people won't be proficient enough to write their own queries. I have worked quite a bit with graph RAG in my internship, and at first even I had a bit of trouble writing those. |
Also @imbajin sir, I just wanted to know: what is the actual requirement, and where are we currently? |
@Aryankb Thanks for your feedback. At present, whether we use
The main concern about LangGraph comes from feedback that its performance is poor and its resource consumption is high (like |
@chiruu12 @Aryankb Here is a brief description of the actual situation. Our earlier implementation and approach was to use both model fine-tuning and user templates simultaneously (see it ↓ By default, we use the GQL query template to optimize the effect of text2gql.)
General encoder model fine-tuning for |
Good question. In fact, we need to provide both of these abilities at the same time, but with a focus on the second point.

We can understand that the first one is mainly aimed at novice users (novice here means: people who are not completely unfamiliar with property graphs, and they don't expect a way where simply throwing a

Our core focus is on devs with a basic vector-db (vector-rag/naive/basic-rag) or Agent system. Assuming they already have Vector-RAG, how can we better integrate GraphRAG to provide more operability and ease of orchestration? This is also why we provide separate

For example, suppose the user is already using

They (devs) can directly modify our pipeline/workflow code, instead of requiring us to provide a fixed "local/global" mode like Microsoft GraphRAG that is not easy to adjust. Our agentization/selection is about how to better and faster achieve this goal.

I'm not sure if the explanation is clear, but if there are any questions, please continue to reply :) BTW, we are currently in a transition from 1 to 2 |
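To make the integration point concrete, here is a rough, hypothetical sketch (none of these function names are real hugegraph-llm APIs) of a dev keeping an existing Vector-RAG retriever and chaining a graph-expansion step onto it:

```python
# Hypothetical sketch: chaining a GraphRAG expansion step onto an existing
# Vector-RAG retriever. None of these names are real hugegraph-llm APIs.
from typing import Callable


def vector_retrieve(question: str) -> list[str]:
    # Stand-in for the dev's existing vector-DB retriever.
    return ["chunk mentioning Alice", "chunk mentioning Bob"]


def graph_expand(seed_texts: list[str], execute_gremlin: Callable[[str], list]) -> list[str]:
    # Stand-in for graph-side retrieval: extract entities from the vector hits,
    # then pull their neighborhoods from HugeGraph for extra structured context.
    entities = ["Alice", "Bob"]  # entity extraction would happen here
    facts = []
    for e in entities:
        facts += execute_gremlin(f"g.V().has('name','{e}').outE().inV().path()")
    return seed_texts + [str(f) for f in facts]


def answer(question: str, execute_gremlin: Callable[[str], list]) -> str:
    context = graph_expand(vector_retrieve(question), execute_gremlin)
    # The final LLM call is omitted; the point is that the pipeline stays plain
    # Python that devs can reorder or replace, rather than a fixed local/global mode.
    return "\n".join(context)
```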
@imbajin I would have to suggest either CrewAI or Agno. CrewAI may have a higher memory overhead compared to baseline systems. CrewAI's NLP pipeline with HugeGraph's domain-specific embeddings can improve dynamic intent recognition. Agno's parallel processing capabilities are beneficial for handling high-volume L1 requests efficiently. I'm researching this in more detail, especially looking at the project's upcoming goals.

Simply put:
CrewAI:
Cons:
Agno claims performance improvements of up to 1000x over traditional frameworks, but the integration may require custom adapters for HugeGraph's Gremlin/Cypher hybrid interface. This seems like a solid investment in performance.

Adding to the above, LlamaIndex provides a graph-aware retrieval system with a recursive mechanism for hierarchical caching, AND it integrates seamlessly with tools like CrewAI to enhance search-based queries and agentic pipelines. Frameworks with active communities and regular updates, like CrewAI and LlamaIndex, are practical for use, but if we're leaning more towards a DIY approach with a focus on performance, Agno should be prioritized. Also to be noted: LlamaIndex MAY lack native support for HugeGraph's distributed computing module (HugeGraph-Computer); I need to look further into it.

CrewAI seems like the perfect option, memory efficiency aside. With memory efficiency considered, Agno must be looked into. How about hybrid approaches? |
By hybrid approaches, I mean something like: CrewAI's NLP pipeline enhanced with HugeGraph's domain-specific embeddings |
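One possible reading of that hybrid, sketched under assumptions (the `embed()` placeholder and intent labels are made up), is nearest-neighbour intent routing over domain-specific embeddings that any orchestrator could then consume:

```python
# Sketch of domain-embedding-based intent recognition; embed() is a placeholder
# for whatever embedding model the project settles on.
import numpy as np


def embed(text: str) -> np.ndarray:
    # Placeholder pseudo-embedding so the example is self-contained.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(8)


INTENT_EXAMPLES = {
    "simple_lookup": "what are the properties of vertex Alice",
    "multi_hop": "how is Alice connected to Carol through shared projects",
    "graph_compute": "run pagerank over the whole graph",
}


def classify_intent(question: str) -> str:
    q = embed(question)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Route to the intent whose example embedding is closest to the query.
    return max(INTENT_EXAMPLES, key=lambda k: cos(q, embed(INTENT_EXAMPLES[k])))
```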
@imbajin As per my understanding of point 2, we need a way to integrate HugeGraph agents with any other existing framework, and we will provide an HTTP API layer abstraction or a Python SDK for HugeGraph agents. HG_orchestrator will handle the order of agent execution: the user can specify sequential execution in case the next agent needs outputs from a previous agent, and parallel otherwise. If we need this, we need a framework having really good

E.g.: one naive approach can be to do everything using a simple Python while loop; there we don't need to manage different dependency conflicts. If devs use AutoGen, and if we create the HG_agentic SDK using CrewAI, then there might be dependency conflicts. If not, then we can follow the priority order below (see also the sketch after this comment).

Priority suggestion for workflow :-
|
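As referenced above, a minimal, framework-free sketch of such an HG_orchestrator (all names hypothetical, plain asyncio only), executing agents sequentially when they depend on earlier outputs and concurrently otherwise:

```python
# Hypothetical HG_orchestrator sketch: sequential when agents depend on previous
# outputs, concurrent otherwise. Pure asyncio, no agent framework required.
import asyncio
from typing import Awaitable, Callable

Agent = Callable[[dict], Awaitable[dict]]


async def run_sequential(agents: list[Agent], context: dict) -> dict:
    for agent in agents:
        context.update(await agent(context))  # each agent sees previous outputs
    return context


async def run_parallel(agents: list[Agent], context: dict) -> dict:
    results = await asyncio.gather(*(agent(dict(context)) for agent in agents))
    for r in results:
        context.update(r)
    return context


async def extract(ctx: dict) -> dict:
    return {"entities": ["Alice"]}


async def retrieve(ctx: dict) -> dict:
    return {"facts": [f"neighbors of {e}" for e in ctx.get("entities", [])]}


# retrieve needs extract's output, so they run sequentially:
# asyncio.run(run_sequential([extract, retrieve], {"question": "Who knows Alice?"}))
```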
@imbajin sir instead of developing a dedicated HG-agentic library, I propose that we make a retriever service similar to what Pinecone provides. Our approach would encompass two distinct modes: |
@imbajin , I agree with @Aryankb's understanding, but as for the workflow priority, here are my two cents:

```mermaid
graph TD
    A[User Query] --> B(Agno L1 Processor)
    B -->|Simple Lookup| C[HugeGraph Cache]
    B -->|Complex Query| D{CrewAI Orchestrator}
    D -->|Multi-Hop| E[LlamaIndex Retriever]
    D -->|Computation| F[HugeGraph-Computer]
    F --> G
    E --> G[Result Aggregator]
    G --> H[Pydantic Validator]
    H --> I[Output Formatter]
```
Here, we'll be considering the top four candidate frameworks (CrewAI, Agno, LlamaIndex, Pydantic-AI) against HugeGraph's requirements for implementing an agentic GraphRAG system. Instead of taking one out of the bunch, I'll reiterate my suggestion for the hybrid approach: deploy Agno for L1 queries.

CrewAI's Performance Profile (I asked ChatGPT for an analysis):
So summing it up here is the proposed architecture, kept simple:
My rationale and research, summarized: this proposed architecture is based on what I saw on Apache's JIRA, where the required architecture was provided for the upcoming months of development. I also emailed you additional insights on the architecture, please do check ( @imbajin ) |
@Kryst4lDem0ni4s I liked your idea of caching and Prometheus monitoring, but using different frameworks for different tasks comes with too many dependency issues. Once I was working with CrewAI and the langchain-google-genai library, and it took me two days to fix the dependency conflicts by manually trying which version combination works. So it might give hybrid features, but it's very inconvenient and could lead to |
@Aryankb First of all, good point, and second, this is one of the reasons why I suggested taking the features and writing our own: we can then work with the latest dependencies for our framework, and it will not be lightweight anymore if we end up using so many frameworks (one of the points that sir made at the start). I would suggest using Pydantic (not the agentic framework) and a combination of Agno and CrewAI, in which we will be building our own agent (not exactly our own, because we will just be modifying their code and using it for our framework). |
Currently, I've only reviewed the workflow aspects of
The GraphRAG workflow in
Next steps:
|
Modifying and creating our own implementations of existing frameworks is quite impractical at such a scale, especially when the reasoning is a small performance optimisation. It's better to use existing libraries and the features they offer instead of creating something from scratch; otherwise it's a separate project altogether. It's easy to utilise the best of all their facilities (CrewAI, Agno, LlamaIndex and Pydantic) if we define the modularity from the get-go. So I'd say that from here on out, it's not so much a question of "which is the best?" Let's focus on defining a scalable architecture and not shy away from experimentation when we have such great resources available. |
I looked further into what @chiruu12 suggests about not using off-the-shelf agentic components that can prevent developers from understanding critical behaviors, so that over time the behavior of the service doesn't go out of control. How about combining all of our suggestions into a dual-mode, modular GraphRAG system that integrates LlamaIndex, Pydantic-AI, CrewFlow, and Agno while avoiding the dependency hell warned about by @Aryankb?

As a hybrid GraphRAG system that supports two modes we can include:

Key design principles so that everyone can get a good night's sleep:

Architectural Layers & Components:

Key Features: This component, when abstracted into our own agentic library, will be the base of all performance optimizations.

B. Orchestration Layer – CrewAI for Complex Workflows
Key Features:

C. Validation Layer – Pydantic
Key Features:
Note: This is the general usage of Pydantic, not its agentic tools. Otherwise it is too unpredictable and unsuitable for production.

D. Retrieval Enhancement Layer – LlamaIndex

Summary of the plan with general key points and implementation steps:
```mermaid
graph
    A[User Query_Input] --> B{HTTP API Gateway}
    B --> C[Agno L1 Query Service]
    B --> D[CrewFlow Orchestrator]
    D --> E[Dynamic Agent Creation]
    E --> F[Workflow Execution]
    F --> G[Pydantic Validation Middleware]
    D --> H[Retrieve Request]
    H --> I[LlamaIndex Recursive Retriever]
    I --> J[Hybrid Caching Layer_RocksDB_Shared Memory]
    G & J --> K[Result Aggregator]
    K --> L[HTTP API Gateway_Response]
```
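The validation middleware in the diagram is the least framework-dependent piece; a minimal plain-Pydantic sketch (v2 syntax, not Pydantic-AI) with illustrative field names might be:

```python
# Plain Pydantic (not Pydantic-AI) sketch of the validation middleware; the
# schema fields are illustrative, not a fixed contract.
from pydantic import BaseModel, Field, field_validator


class RetrievedFact(BaseModel):
    source: str                      # e.g. "llamaindex" or "hugegraph-cache"
    content: str
    score: float = Field(ge=0.0, le=1.0)


class AggregatedResult(BaseModel):
    query: str
    facts: list[RetrievedFact]

    @field_validator("facts")
    @classmethod
    def non_empty(cls, v: list[RetrievedFact]) -> list[RetrievedFact]:
        if not v:
            raise ValueError("aggregator returned no facts")
        return v


# AggregatedResult.model_validate(raw_dict) would reject malformed agent output
# before it reaches the output formatter.
```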
What are your thoughts on this approach @imbajin ? Further, I'd also like your thoughts about what I mentioned regarding LN queries and how we'd go about handling them. But I'd still stand by what I said: implementing this is a separate project in itself and would require lots of time and expertise before it can be put into production, due to the added complexities of the architecture. |
@Kryst4lDem0ni4s Glad to know that you got my point there. Actually, I am currently looking at the code for the specs I have in mind, which are a little different, as I would prefer a simpler and easier-to-understand implementation (at least for the prototype I want to make currently). |
I searched and looked at the code for each component mentioned here, but I think it will be wiser to make the architecture simpler, as this is just going to increase complexity and the time it will take. We need to make something that is fast, robust, and also easy to understand: even though devs would be able to understand a complex architecture too, they won't be willing to invest that much time just to integrate an agentic retriever, which is still a small thing in the greater scheme. Also, nowadays the work is so fast-paced that no one will be willing to invest if there are so many dependencies to deal with, and a complex architecture on top of that... |
@Aryankb Yes, I basically agree with your hypothesis and analysis. The best way for us to proceed is to provide an SDK for developers to call more easily, although that is not something that needs to be done immediately. And also, as the #183 (comment) mentioned, we may try a more balanced option first. If we feel that the agent framework is heavy, we can directly reference/use its flow part instead of directly referencing the entire framework (after looking at the workflows of |
@chiruu12 I think it can be done this way. We can eventually implement our own management of agent combinations, but we still need a good workflow library (if we feel that the performance is not good enough, we can consider using Rust to rewrite the underlying layer, similar to pydantic-core). We just need to ensure consistency in the user interface layer. |
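A small sketch of what "consistency in the user interface layer" could mean in practice (the Protocol and names below are hypothetical): the user-facing call stays the same regardless of which workflow engine, or a future Rust-backed core, sits underneath.

```python
# Sketch of keeping the user-facing layer consistent regardless of which
# workflow engine sits underneath; the Protocol and names are hypothetical.
from typing import Any, Protocol


class WorkflowBackend(Protocol):
    def run(self, task: str, inputs: dict[str, Any]) -> dict[str, Any]:
        """Execute one named task and return its outputs."""
        ...


class SimpleBackend:
    # A hand-rolled backend; a CrewAI- or Agno-based one would expose the same run().
    def run(self, task: str, inputs: dict[str, Any]) -> dict[str, Any]:
        return {"task": task, "echo": inputs}


def user_facing_query(backend: WorkflowBackend, question: str) -> dict[str, Any]:
    # The SDK call devs see never changes, even if the backend is swapped or
    # its hot path is later rewritten (e.g. in Rust, pydantic-core style).
    return backend.run("graph_rag_query", {"question": question})


# user_facing_query(SimpleBackend(), "Who knows Alice?")
```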
@Kryst4lDem0ni4s Thanks for your graph, and I have fixed the mermaid UI. And some comments about the graph:
As I mentioned above, it is entirely feasible to mix the strengths of multiple frameworks, but we are unlikely to directly introduce the entire framework (such as
We are likely to first introduce a relatively balanced framework and integrate the good aspects of other designs for transformation. This is currently my thought. |
@chiruu12 Fine, I am usually able to respond to messages in my personal time, but I may not be able to reply in time~ |
Sure sir, I will make a demo for the same as soon as my exams are over. |
I see, so then I think it's feasible to side with CrewAI and build further with a manual implementation of the other libraries. Especially since we have to handle L1-LN queries sooner or later, we'll iteratively improve it using the others (Agno, LlamaIndex, etc.) |
@Kryst4lDem0ni4s As for how to try it out, I think you can follow the general process below (for reference only):
|
Okay, I'll get right to it and let you know! Also, I'll come back afterwards to help document this discussion thoroughly once I have improved the overall understanding of the concepts. Thanks for the reference |
Although I have already started the next steps, I will be working on creating a deeper implementation and extracting the logic from CrewAI's library, based on this implementation of their available code.
Whenever possible, please go through the implementation and provide feedback so that I can catch mistakes in the pipeline @imbajin |
Feature Description
Some Context Here:
Our goal for GraphRAG is to focus on providing better I/O capabilities related to the LLM/RAG system. We need to be agentic, but will not attempt to create a large and comprehensive (agent/RAG) framework, so we tend to choose the highest-performance / most flexible workflow + agentic framework for integration (any suggestions and feedback are welcome).
Below is a simplified summary table assisted by an LLM. Currently, the focus is on the first 4 (CrewAI / Agno / LlamaIndex / Pydantic-AI), with prioritization to be determined. They generally have their own built-in workflow-like designs and have few dependencies (relatively lightweight).
Table:
Heavyweight Agent Frameworks (omitted)
This section is not core; it mainly lists common/well-known frameworks, which can also be studied for good ideas.