feat: Add SerpexWebSearch component for multi-engine web search #9937
+483
−1
Add SerpexWebSearch Component for Multi-Engine Web Search Integration
Description
This PR introduces a new `SerpexWebSearch` component to Haystack's fetchers module, enabling seamless integration with the SERPEX API for fetching organic web search results from multiple search engines.
What does it do?
The `SerpexWebSearch` component:
- Returns results as `Document` objects with rich metadata (title, URL, position, snippet)
- Supports serialization (`to_dict`/`from_dict`)
Why is it needed?
Web search is a critical capability for RAG (Retrieval-Augmented Generation) pipelines and AI applications that need to ground responses in current information. This component:
Changes Made
New Files Added
- `haystack/components/fetchers/serpex.py` (203 lines)
  - `SerpexWebSearch` component class decorated with `@component`
  - `run()` method returning `List[Document]`
  - `to_dict()` and `from_dict()` for serialization
- `test/components/fetchers/test_serpex.py` (280 lines)
  - (… `SERPEX_API_KEY` environment variable)
- `releasenotes/notes/add-serpex-web-search-fetcher-a1b2c3d4e5f6g7h8.yaml`
Modified Files
- `haystack/components/fetchers/__init__.py`
  - Added `SerpexWebSearch` to exports
  - Updated the `_import_structure` dictionary
How did you test it?
Unit Tests
Results:
Integration Tests
Tested with real SERPEX API using provided API key:
Test Scenarios - All Passing ✅
Basic Google Search
Haystack Framework Search
Multi-Engine Support (DuckDuckGo)
Time Range Filtering
Technical Query
Manual Verification
✅ API Endpoint: https://api.serpex.dev/api/search
✅ Authentication: Bearer token correctly formatted
✅ Response Parsing: Correctly handles `results` array
✅ Document Structure: All required metadata fields present
✅ Error Handling: Proper exceptions raised and logged
✅ Resource Cleanup: `__del__` method properly closes HTTP client
Code Quality Checks
✅ All checks passing
✅ No syntax errors
✅ Type hints complete
✅ Code style compliant
Implementation Details
API Integration
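The request and response handling described under Manual Verification (the endpoint, the Bearer token, and the `results` array) can be sketched roughly as follows. This is not the code from `serpex.py`: `Doc` is a hypothetical stand-in for Haystack's `Document`, the query parameter names (`q`, `engine`, `num_results`, `time_range`) are assumptions, and using the snippet as the document content is also an assumption. No network call is made.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Endpoint as stated in the PR description.
SERPEX_URL = "https://api.serpex.dev/api/search"


@dataclass
class Doc:
    """Hypothetical stand-in for haystack's Document (content + meta only)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def build_request(
    api_key: str,
    query: str,
    engine: str = "google",
    num_results: int = 10,
    time_range: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble URL, headers, and query parameters for one search."""
    headers = {"Authorization": f"Bearer {api_key}"}
    # Parameter names below are assumed, not taken from the SERPEX docs.
    params: Dict[str, Any] = {"q": query, "engine": engine, "num_results": num_results}
    if time_range is not None:
        params["time_range"] = time_range
    return {"url": SERPEX_URL, "headers": headers, "params": params}


def parse_results(payload: Dict[str, Any]) -> List[Doc]:
    """Turn the `results` array into Doc objects carrying the four metadata fields."""
    docs: List[Doc] = []
    for index, item in enumerate(payload.get("results", []), start=1):
        snippet = item.get("snippet", "")
        meta = {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            # Fall back to list order if the API omits a position (assumption).
            "position": item.get("position", index),
            "snippet": snippet,
        }
        docs.append(Doc(content=snippet, meta=meta))
    return docs
```

Keeping request construction and response parsing as pure functions makes both easy to unit test without a real API key.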
Pipeline Integration
Component Parameters
Initialization:
- `api_key` (str, required): SERPEX API key from https://serpex.dev
- `engine` (str, optional): Default search engine - "auto", "google", "bing", "duckduckgo", "brave", "yahoo", "yandex" (default: "google")
- `num_results` (int, optional): Number of results (default: 10)
- `timeout` (int, optional): Request timeout in seconds (default: 10)
- `retry_attempts` (int, optional): Retry attempts for failed requests (default: 2)
Run Method:
- `query` (str, required): Search query
- `engine` (str, optional): Override default engine
- `num_results` (int, optional): Override result count
- `time_range` (str, optional): Filter by time - "all", "day", "week", "month", "year"
Output:
- `Dict[str, List[Document]]` with key "documents"
Notes for the Reviewer
Architecture
The component follows Haystack's established patterns:
- `@component` decorator for framework integration
- `to_dict()`/`from_dict()` for serialization
- `@component.output_types()` for output specification
Dependency Analysis
No new external dependencies added:
Testing Coverage
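One property that is cheap to cover is the `to_dict()`/`from_dict()` round trip. The sketch below uses a hypothetical stand-in class, `SearchConfig`, rather than the real component, so it runs without Haystack installed. The `{"type": ..., "init_parameters": ...}` layout mirrors what Haystack's serialization helpers produce, while the exact fields serialized by `SerpexWebSearch` are an assumption based on the initialization parameters listed in this PR.

```python
from typing import Any, Dict


class SearchConfig:
    """Hypothetical stand-in for the component's init parameters (not the real class)."""

    def __init__(
        self,
        api_key: str,
        engine: str = "google",
        num_results: int = 10,
        timeout: int = 10,
        retry_attempts: int = 2,
    ) -> None:
        self.api_key = api_key
        self.engine = engine
        self.num_results = num_results
        self.timeout = timeout
        self.retry_attempts = retry_attempts

    def to_dict(self) -> Dict[str, Any]:
        """Serialize to a type name plus the arguments needed to rebuild the object."""
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {
                "api_key": self.api_key,
                "engine": self.engine,
                "num_results": self.num_results,
                "timeout": self.timeout,
                "retry_attempts": self.retry_attempts,
            },
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SearchConfig":
        """Rebuild an instance from the dictionary produced by to_dict()."""
        return cls(**data["init_parameters"])


def round_trips(config: SearchConfig) -> bool:
    """Return True if serializing, deserializing, and serializing again is stable."""
    first = config.to_dict()
    return SearchConfig.from_dict(first).to_dict() == first
```

A check such as `round_trips(SearchConfig(api_key="test-key", engine="bing"))` exercises non-default values as well as the defaults.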
Performance Considerations
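The `timeout` and `retry_attempts` parameters together bound how long a single search can take. A minimal retry loop is sketched below, assuming that `retry_attempts` counts retries after the first try (so the default of 2 means up to three calls) and that each call is individually limited by `timeout`. Under those assumptions the defaults would give a worst case of roughly 3 × 10 = 30 seconds per query. The helper name `call_with_retries` and the `backoff_seconds` argument are hypothetical and are not taken from the PR.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retry_attempts: int = 2,
    backoff_seconds: float = 0.0,
) -> T:
    """Call fn once, then retry up to retry_attempts more times on any exception."""
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be >= 0")
    last_error: Optional[Exception] = None
    for attempt in range(retry_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # a real client would catch narrower HTTP errors
            last_error = exc
            # Sleep only between attempts, never after the final failure.
            if attempt < retry_attempts and backoff_seconds > 0:
                time.sleep(backoff_seconds)
    assert last_error is not None  # the loop ran at least once and did not return
    raise last_error
```

In practice `fn` would wrap the HTTP request with the configured timeout, so the retry policy stays independent of the transport.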
Security
Backwards Compatibility
✅ No breaking changes to existing Haystack APIs
✅ New component is additive only
✅ Follows existing fetcher patterns (LinkContentFetcher)
Checklist
- … `feat:` … added
Related Issues
Enables the web search integration requested by the community for RAG pipeline support.
Commits
feat: Add SerpexWebSearch component for multi-engine web search (b74f358)
fix: Correct SERPEX API response field names (fab6ed4)
- Field names: `results` instead of `organic_results`, `url` instead of `link`
Screenshots / Demo
Test Results
Example Output
Ready for merge! ✅ All tests passing, fully documented, production-ready.