Reasonlytics is an intelligent data analysis agent that bridges the gap between raw data and meaningful insights. Built on LangGraph and powered by local LLMs via Ollama, Reasonlytics enables users to interact with their datasets using natural language queries and receive comprehensive analysis with human-readable explanations.
Reasonlytics combines the power of multiple AI agents working in harmony to deliver a seamless data analysis experience:
- 🧠 Intelligent Query Understanding: Automatically classifies whether you want visualizations or data analysis
- 🐍 Dynamic Code Generation: Creates optimized pandas code tailored to your specific dataset and query
- ⚡ Safe Code Execution: Runs analysis in a secure, isolated environment
- 📊 Smart Visualization: Generates charts and plots when requested
- 💡 Contextual Reasoning: Provides clear, business-friendly explanations of results
- 🔍 Instant Dataset Insights: Automatically analyzes new datasets and suggests exploration questions
The system follows a modular, agent-based architecture powered by LangGraph, open-source LLMs, and Pandas for structured, explainable data analysis workflows.
- Orchestrates the multi-step reasoning flow using `MessagesState`
- Executes 8 interconnected nodes covering data ingestion, insight generation, query classification, code synthesis, execution, and result explanation
- Employs custom `@tool` decorators for modular tool execution (e.g., `DataFrameSummaryTool`, `DataInsightAgent`, `CodeExecutionTool`)
- Deterministic routing ensures reproducible outputs (`DATA_INPUT → INSIGHT → QUERY → CODE_GEN → EXECUTION → EXPLANATION`); a minimal workflow sketch follows this list
- Model: Qwen2.5-Coder-7B-Instruct-Q4_K_M (configurable open-source LLM)
- Inference Engine: Ollama / vLLM for local and GPU-accelerated deployments
- Prompt Templates: Modular prompt blocks for summary, query classification, and code generation
- Output Parsing: Structured text extraction with safety filtering for Python and visualization code
- Data Engine: Pandas DataFrame (loaded from CSV, Excel, or SQL sources)
- Schema Summary: Auto-generated dataset overview including types, missing values, and size metadata
- Validation: Pre-execution code checks ensure only safe read-only operations (no file I/O, no external writes)
- Renderer: Matplotlib / Seaborn for chart generation
- Result Display: Inline rendering of visual and textual insights
- Reasoning Layer: Generates natural-language explanations summarizing trends, outliers, and actionable insights
- Integrated logging for every node step with timestamped traces
- Easily extensible to support other LLMs (Mistral, Gemma2, Llama3)
- Can integrate with external data APIs or cloud storage connectors
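To make the orchestration concrete, here is a minimal sketch of how such a LangGraph workflow could be wired up. The node function names and their bodies are illustrative assumptions, not the exact code in `compile_agent.py`; only the node sequence follows the routing described above.

```python
# Minimal sketch of the node sequence described above.
# Node names and bodies are illustrative; the real workflow lives in compile_agent.py.
from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_ollama import ChatOllama

llm = ChatOllama(model="qwen2.5-coder:7b-instruct-q4_K_M", temperature=0)

def data_input(state: MessagesState):
    # Load the dataset and attach a schema summary to the conversation state.
    return {"messages": [("system", "Dataset loaded: 3 columns, 1,200 rows")]}

def insight(state: MessagesState):
    # Ask the LLM for an initial dataset overview and suggested questions.
    return {"messages": [llm.invoke(state["messages"])]}

def query_classification(state: MessagesState):
    # Decide whether the user wants a chart or a tabular/statistical answer.
    return {"messages": [llm.invoke(state["messages"])]}

def code_gen(state: MessagesState):
    # Generate pandas (and optionally matplotlib) code for the query.
    return {"messages": [llm.invoke(state["messages"])]}

def execution(state: MessagesState):
    # Run the generated code in a restricted namespace and capture the result.
    return {"messages": [("system", "Execution result attached")]}

def explanation(state: MessagesState):
    # Turn the raw result into a business-friendly explanation.
    return {"messages": [llm.invoke(state["messages"])]}

graph = StateGraph(MessagesState)
for name, fn in [
    ("data_input", data_input),
    ("insight", insight),
    ("query_classification", query_classification),
    ("code_gen", code_gen),
    ("execution", execution),
    ("explanation", explanation),
]:
    graph.add_node(name, fn)

# Deterministic routing: a straight edge chain gives reproducible runs.
graph.add_edge(START, "data_input")
graph.add_edge("data_input", "insight")
graph.add_edge("insight", "query_classification")
graph.add_edge("query_classification", "code_gen")
graph.add_edge("code_gen", "execution")
graph.add_edge("execution", "explanation")
graph.add_edge("explanation", END)

agent = graph.compile()
```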
- 🗣️ Natural Language Interface: Ask data-driven questions like "Show me sales trends by region" or "What's the correlation between price and sales?" with no coding required. The agent understands intent and translates your query into executable Python or SQL automatically.
- 🤖 Automated Insights & Reasoning: On every dataset upload or query, the agent instantly summarizes key patterns and relationships. It also provides clear explanations of results (e.g., "North region leads with 35% of total sales") using an integrated Reasoning LLM.
- 📊 Multi-Modal Data Analysis: Seamlessly handles both data analytics and visualization requests. The system dynamically decides whether to return a table, chart, or statistical summary based on the query context.
- 💡 Code Transparency & Safe Execution: Every output comes with the generated pandas/matplotlib code for verification and learning. Code execution is sandboxed to prevent unauthorized operations (a sketch of such a pre-execution check follows this list).
- 🔒 Local, Private, and Configurable: Runs entirely on your own infrastructure using Ollama and LangGraph, ensuring full data privacy. Supports multiple open-source LLMs such as Qwen2.5, CodeGemma, Llama 3, and Mistral, with easy configuration for different workflows.
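As an illustration of the kind of pre-execution check described above, the sketch below rejects generated code that imports risky modules or performs writes. It is an assumed, simplified design for illustration, not necessarily the actual validator in `agent_tools.py`.

```python
# Illustrative pre-execution safety check (assumed design, not the project's exact validator).
import ast

BLOCKED_CALLS = {"open", "exec", "eval", "__import__"}
BLOCKED_MODULES = {"os", "sys", "subprocess", "shutil", "pathlib", "socket"}

def is_code_safe(code: str) -> bool:
    """Return True only if the generated code avoids risky imports and calls
    that could read/write files or spawn processes."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = [alias.name.split(".")[0] for alias in node.names]
            if isinstance(node, ast.ImportFrom) and node.module:
                names.append(node.module.split(".")[0])
            if any(n in BLOCKED_MODULES for n in names):
                return False
        if isinstance(node, ast.Call):
            fn = node.func
            if isinstance(fn, ast.Name) and fn.id in BLOCKED_CALLS:
                return False
            # Block DataFrame writes such as df.to_csv(...) / df.to_excel(...)
            if isinstance(fn, ast.Attribute) and fn.attr.startswith("to_"):
                return False
    return True

# Example: a pandas aggregation passes, a file write does not.
assert is_code_safe("result = df.groupby('region')['sales'].sum()")
assert not is_code_safe("df.to_csv('dump.csv')")
```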
```
📦 llm-data-analyst-agent-langgraph-ollama
│
├── configuration.py     # Environment setup and configurations
├── FastAPI.py           # API layer for backend integration
├── compile_agent.py     # Core LangGraph workflow
├── streamlit_app.py     # Streamlit frontend for user interaction
├── agent_tools.py       # Core agent and tools logic
├── README.md
└── LICENSE
```
- 📈 Sales Performance Analysis: “Show me total revenue by product category for the last quarter.”
The agent automatically aggregates the data, generates a bar chart, and explains key insights — such as which regions or products drive the highest revenue.
- 🏪 Retail Demand Forecasting: “Visualize weekly sales trends for top 5 products.”
The agent produces time-series plots, highlights seasonal patterns, and provides a reasoning summary to support inventory or marketing decisions.
- 👩💼 HR Analytics Dashboard: “What’s the average salary by department?” or “Plot employee attrition by age group.”
The agent creates pandas aggregations and visual insights to help HR teams identify trends and optimize workforce planning.
- 💰 Financial Data Insights: “Compare average returns across investment portfolios” or “Show me expense distribution by category.”
It generates precise visual summaries and explains financial performance differences in natural language.
- 🧠 Exploratory Data Analysis (EDA) Assistant: “Give me a quick summary and possible questions to explore.”
The agent detects the schema and missing values, then suggests meaningful questions to explore next (a minimal sketch of such a dataset summary follows below).
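For reference, a dataset overview covering types, missing values, and size can be produced with a few lines of pandas. This is a minimal sketch of that idea; the exact fields reported by `DataFrameSummaryTool` may differ, and `sales.csv` is just a placeholder path.

```python
# Minimal sketch of an auto-generated dataset overview (types, missing values, size).
# The actual DataFrameSummaryTool may report different or additional fields.
import pandas as pd

def summarize_dataframe(df: pd.DataFrame) -> dict:
    return {
        "shape": {"rows": len(df), "columns": df.shape[1]},
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing_values": df.isna().sum().to_dict(),
        "numeric_summary": df.describe(include="number").round(2).to_dict(),
    }

df = pd.read_csv("sales.csv")  # placeholder; CSV, Excel, or SQL sources are supported
print(summarize_dataframe(df))
```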
Streamlit Interface
- Python 3.10+
- CUDA-compatible GPU (optional, for faster processing)
- 8GB+ RAM recommended
```bash
git clone https://github.com/Ginga1402/llm-data-analyst-agent-langgraph-ollama.git
cd llm-data-analyst-agent-langgraph-ollama
pip install -r requirements.txt

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the required model
ollama pull codegemma:7b-instruct-v1.1-q4_K_S
```

Update the paths in configuration.py to match your system:

```python
model_name = "qwen2.5-coder:7b-instruct-q4_K_M"
```

- Start the FastAPI Server:

```bash
python FastAPI.py
```

The API will be available at http://localhost:8000

- Launch the Streamlit Interface:

```bash
streamlit run streamlit_app.py
```

The web interface will open at http://localhost:8501
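Once both services are running, the backend can also be called directly. The endpoint path and payload below are hypothetical placeholders for illustration; check FastAPI.py (or the interactive docs at http://localhost:8000/docs) for the actual route names and request schema.

```python
# Hypothetical example of calling the backend directly; the real route and
# payload are defined in FastAPI.py (see http://localhost:8000/docs).
import requests

resp = requests.post(
    "http://localhost:8000/analyze",  # placeholder endpoint name
    json={"query": "Show me sales trends by region"},
    timeout=120,
)
print(resp.json())
```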
| Technology | Description | Link |
|---|---|---|
| LangChain | Framework for building LLM-driven applications and chains | LangChain |
| LangGraph | State-based agent orchestration for complex LLM workflows | LangGraph |
| Ollama | Local LLM inference engine for privacy-focused AI | Ollama |
| Mistral 7B (Q4_K_M) | Quantized instruction-tuned model for query classification | Mistral AI |
| Qwen2.5-Coder 7B (Q4_K_M) | Specialized code generation model for pandas operations | Qwen Models |
| Qwen2.5 7B (Q4_K_M) | General-purpose reasoning model for data insights | Qwen Models |
| Pandas | Data manipulation and analysis library for Python | Pandas |
| Matplotlib | Comprehensive plotting library for data visualization | Matplotlib |
| Streamlit | Web framework for building interactive data applications | Streamlit |
| PyTorch | Deep learning framework with CUDA support | PyTorch |
| NumPy | Fundamental package for scientific computing | NumPy |
| FastAPI | High-performance API framework for Python | FastAPI |
| Pydantic | Data validation using Python type annotations | pydantic.dev |
Contributions to this project are welcome! If you have ideas for improvements, bug fixes, or new features, feel free to open an issue or submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you find Reasonlytics useful, please consider giving it a star ⭐ on GitHub!