feat(llm):improve some RAG function UT(tests) #192

yanchaomei · 2025-03-05T09:35:33Z

Comprehensive Test Suite Implementation for HugeGraph-LLM

This PR implements a complete test suite for the HugeGraph-LLM project, covering all major components and ensuring code quality and reliability.

Summary of Test Implementation

1. Test Infrastructure

- Created run_tests.py script for easy test execution
- Implemented conftest.py with test configuration and fixtures
- Added test utilities in test_utils.py for common testing functions
- Set up test data directories with sample documents, schemas, and prompts
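
As a rough illustration of the test-data setup described above, the sketch below builds a small directory tree with one sample document, one schema, and one prompt. The function name, directory names, file names, and file contents are all hypothetical and are not taken from this PR.

```python
import json
import tempfile
from pathlib import Path


def create_sample_test_data(root: Path) -> dict:
    """Create a small test-data tree under `root` and return the file paths.

    All directory and file names here are hypothetical placeholders.
    """
    for category in ("documents", "schemas", "prompts"):
        (root / category).mkdir(parents=True, exist_ok=True)

    document = root / "documents" / "sample.txt"
    document.write_text("Alice knows Bob. Bob works at Acme.", encoding="utf-8")

    schema = root / "schemas" / "sample_schema.json"
    schema.write_text(
        json.dumps({"vertices": ["person", "company"], "edges": ["knows", "works_at"]}),
        encoding="utf-8",
    )

    prompt = root / "prompts" / "sample_prompt.txt"
    prompt.write_text("Extract entities from: {text}", encoding="utf-8")

    return {"document": document, "schema": schema, "prompt": prompt}


# Example usage with a throwaway directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    data = create_sample_test_data(Path(tmp))
    print(sorted(data))
```

In a conftest.py, a helper like this would typically be wrapped in a pytest fixture that receives pytest's built-in tmp_path, so each test gets a fresh copy of the data.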

2. Document Processing Tests

- test_document.py: Tests for document module imports and basic functionality
- test_document_splitter.py: Tests for document chunking in different languages
- test_text_loader.py: Tests for loading text files with various encodings
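
To give an idea of what an encoding test can look like, here is a minimal sketch. The `load_text` function is a hypothetical stand-in written for this example, not the project's actual loader, and the encodings chosen (UTF-8 and GBK) are an assumption.

```python
import tempfile
import unittest
from pathlib import Path


def load_text(path, encodings=("utf-8", "gbk")):
    """Hypothetical stand-in loader: try each encoding until one decodes."""
    raw = Path(path).read_bytes()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError(f"could not decode {path} with any of {encodings}")


class TextLoaderEncodingTest(unittest.TestCase):
    def test_round_trip_for_each_encoding(self):
        text = "图数据库 HugeGraph"
        for enc in ("utf-8", "gbk"):
            # subTest reports each encoding separately if one of them fails.
            with self.subTest(encoding=enc), tempfile.TemporaryDirectory() as tmp:
                path = Path(tmp) / "sample.txt"
                path.write_bytes(text.encode(enc))
                self.assertEqual(load_text(path), text)

    def test_undecodable_bytes_raise(self):
        with tempfile.TemporaryDirectory() as tmp:
            path = Path(tmp) / "broken.txt"
            path.write_bytes(b"\xff\xfe\xff")  # not valid UTF-8
            with self.assertRaises(ValueError):
                load_text(path, encodings=("utf-8",))
```

The test class can be collected by either pytest or `python -m unittest`.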

3. Integration Tests

- test_graph_rag_pipeline.py: End-to-end tests for graph-based RAG pipeline
- test_kg_construction.py: Tests for knowledge graph construction from documents
- test_rag_pipeline.py: Tests for standard RAG pipeline functionality

4. Middleware Tests

- test_middleware.py: Tests for FastAPI middleware components

5. Model Tests

LLM Tests:
- test_openai_client.py: Tests for OpenAI API integration
- test_qianfan_client.py: Tests for Baidu Qianfan API integration
- test_ollama_client.py: Tests for Ollama local model integration
Embedding Tests:
- test_openai_embedding.py: Tests for OpenAI embedding functionality
- test_ollama_embedding.py: Tests for Ollama embedding functionality
Reranker Tests:
- test_cohere_reranker.py: Tests for Cohere reranking API
- test_siliconflow_reranker.py: Tests for SiliconFlow reranking API
- test_init_reranker.py: Tests for reranker initialization

6. Operator Tests

Common Operations:
- test_check_schema.py: Tests for schema validation
- test_merge_dedup_rerank.py: Tests for result merging and reranking
- test_nltk_helper.py: Tests for NLP utilities
- test_print_result.py: Tests for result output formatting
Document Operations:
- test_chunk_split.py: Tests for document chunking strategies
- test_word_extract.py: Tests for keyword extraction
HugeGraph Operations:
- test_commit_to_hugegraph.py: Tests for graph data writing
- test_fetch_graph_data.py: Tests for graph data retrieval
- test_graph_rag_query.py: Tests for graph-based RAG queries
- test_schema_manager.py: Tests for graph schema management
Index Operations:
- test_build_gremlin_example_index.py: Tests for Gremlin example indexing
- test_build_semantic_index.py: Tests for semantic indexing
- test_build_vector_index.py: Tests for vector index construction
- test_gremlin_example_index_query.py: Tests for querying Gremlin examples
- test_semantic_id_query.py: Tests for semantic ID queries
- test_vector_index_query.py: Tests for vector index queries
LLM Operations:
- test_gremlin_generate.py: Tests for Gremlin query generation
- test_keyword_extract.py: Tests for LLM-based keyword extraction
- test_property_graph_extract.py: Tests for property graph extraction

Testing Approach

The test suite employs several testing strategies:

- Unit Tests: Testing individual components in isolation
- Integration Tests: Testing interactions between components
- Mock Testing: Using mocks to simulate external dependencies
- Parametrized Tests: Testing with various input combinations
- Exception Testing: Verifying proper error handling
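
The mock, parametrized, and exception strategies listed above can be combined in one small test class. The sketch below is illustrative only: `answer_with_context` and the `generate(prompt=...)` method on the mocked client are hypothetical names made up for this example, not the project's API.

```python
import unittest
from unittest.mock import MagicMock


def answer_with_context(llm, question, chunks):
    """Hypothetical RAG step: build a prompt from chunks and call the LLM."""
    if not question.strip():
        raise ValueError("question must not be empty")
    prompt = "Context:\n" + "\n".join(chunks) + f"\n\nQuestion: {question}"
    return llm.generate(prompt=prompt)


class AnswerWithContextTest(unittest.TestCase):
    def setUp(self):
        # Mock testing: the LLM client is replaced, so no network call happens.
        self.llm = MagicMock()
        self.llm.generate.return_value = "mocked answer"

    def test_llm_is_called_with_context(self):
        result = answer_with_context(self.llm, "Who knows Bob?", ["Alice knows Bob."])
        self.assertEqual(result, "mocked answer")
        prompt = self.llm.generate.call_args.kwargs["prompt"]
        self.assertIn("Alice knows Bob.", prompt)

    def test_various_chunk_counts(self):
        # Parametrized testing: the same check runs for several inputs.
        for chunks in ([], ["a"], ["a", "b", "c"]):
            with self.subTest(chunks=chunks):
                self.llm.generate.reset_mock()
                answer_with_context(self.llm, "q", chunks)
                self.llm.generate.assert_called_once()

    def test_empty_question_raises(self):
        # Exception testing: invalid input fails before the LLM is called.
        with self.assertRaises(ValueError):
            answer_with_context(self.llm, "   ", ["a"])
        self.llm.generate.assert_not_called()
```

With pytest, the `subTest` loop would more commonly be written with `@pytest.mark.parametrize`; the standard-library form is used here only to keep the sketch dependency-free.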

Key Features

- Comprehensive Coverage: Tests for all major modules and components
- External Service Handling: Tests can skip external service dependencies when needed
- Mock Implementations: Provides mock implementations for external services
- Test Data: Includes sample data for consistent test execution
- Isolation: Tests are designed to run independently without side effects
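
One possible way to implement the external-service skipping mentioned above is to gate tests on the SKIP_EXTERNAL_SERVICES environment variable that the CI workflow exports. This is a minimal sketch, assuming that only the value "true" (case-insensitive) disables live tests; the helper and class names are hypothetical.

```python
import os
import unittest


def external_services_disabled() -> bool:
    """Return True when SKIP_EXTERNAL_SERVICES is set to 'true' (assumed convention)."""
    return os.environ.get("SKIP_EXTERNAL_SERVICES", "false").lower() == "true"


class LiveServiceTest(unittest.TestCase):
    def setUp(self):
        # Checked at run time, so the variable can be toggled per test run.
        if external_services_disabled():
            self.skipTest("SKIP_EXTERNAL_SERVICES=true")

    def test_round_trip(self):
        # A real test would talk to HugeGraph or an LLM endpoint here.
        self.assertTrue(True)
```

Checking the variable in `setUp` rather than in a class-level decorator means the decision is made when the test runs, not when the module is imported.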

Results

All tests pass successfully, ensuring the reliability and correctness of the HugeGraph-LLM codebase. The test suite provides a solid foundation for future development and helps maintain code quality as the project evolves.

fix apache#167

imbajin · 2025-03-05T10:41:58Z

hugegraph-llm/run_tests.py

@@ -0,0 +1,106 @@
+#!/usr/bin/env python3


Seems we don't need this file?

Also, please check the other CI checks, thanks~

We should also enable the tests in the related CI file so they run automatically,
e.g. by adding a .github/workflows/graph_rag.yml?

You could refer to:

incubator-hugegraph-ai/.github/workflows/hugegraph-python-client.yml

Line 66 in ca28faf

- name: Test with pytest

yanchaomei · 2025-03-05

Got it~ I will do it soon.

imbajin · 2025-03-06T07:59:59Z

.github/workflows/hugegraph-llm.yml

+        export PYTHONPATH=$(pwd)/hugegraph-llm/src
+        export SKIP_EXTERNAL_SERVICES=true
+        cd hugegraph-llm
+        python -m pytest src/tests/integration/test_graph_rag_pipeline.py -v


Note: each file should end with a newline (EOF line). You can configure this in your IDE's settings.

The same applies to the other files.
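
For anyone who wants to check this outside the IDE, a small script along these lines could list files that lack the final newline. It is only a sketch; the function name is made up for this example, and empty files are treated as fine by assumption.

```python
from pathlib import Path


def missing_final_newline(paths):
    """Return the paths whose last byte is not a newline (empty files pass)."""
    offenders = []
    for path in map(Path, paths):
        data = path.read_bytes()
        if data and not data.endswith(b"\n"):
            offenders.append(path)
    return offenders
```

It could be called with, for example, the result of `Path("hugegraph-llm").rglob("*.py")` to scan the Python sources.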

https://github.com/apache/incubator-hugegraph-ai/actions/runs/13693587346/job/38291894859?pr=192

You can check the CI status at the link above. (You could also submit a PR in your own repo, selecting the upstream branch, e.g. yanchaomei:main, to test it separately.)

Also, it's better not to work directly on main/master: keep it clean so it can sync with upstream easily (one click). If you want to modify some code, check out a new branch from main, such as dev-xx. This avoids many potential conflicts and inconsistencies in the future and keeps your Git usage clear.

feat(llm):improve some RAG function UT(tests)

ba85fbc

fix apache#167

github-actions bot added the llm label Mar 5, 2025

dosubot bot added the size:XXL (this PR changes 1000+ lines, ignoring generated files) and enhancement (new feature or request) labels Mar 5, 2025

imbajin reviewed Mar 5, 2025

imbajin and others added 3 commits March 5, 2025 18:42

Merge branch 'main' into main

aabac09

add hugegraph-llm.yml

a012cb2

Merge branch 'main' of github.com:yanchaomei/incubator-hugegraph-ai

da5b6c0

imbajin reviewed Mar 6, 2025
