Add code documentation processor#10

Open
OGsiji wants to merge 4 commits into google-gemini:main from OGsiji:add-code-documentation-processor

@OGsiji OGsiji commented Jul 10, 2025

Add Code Documentation Processor

📋 Summary

This PR introduces a new CodeDocumentationProcessor that generates documentation for source files using Gemini models. The processor analyzes code structure and emits documentation in Markdown, reStructuredText, or HTML.

✨ Features

  • 🔍 Multi-language Support: Detects and processes Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more
  • 📝 Multiple Output Formats: Generates documentation in Markdown, reStructuredText, or HTML
  • 🎯 Configurable Docstring Styles: Supports Google, Sphinx, and NumPy documentation conventions
  • 🏗️ Code Structure Analysis: Uses AST parsing to extract functions, classes, methods, and imports
  • 💡 Smart Examples: Automatically generates usage examples and code snippets
  • ⚡ Batch Processing: Efficiently handles multiple files concurrently
  • 🔧 Type Hint Analysis: Processes and documents type annotations
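
As an illustration of the AST-based structure analysis listed above, here is a minimal sketch using Python's standard `ast` module. It is not taken from the PR: the function name `extract_structure` and the shape of the returned dictionary are assumptions made for this example, and it only covers Python sources.

```python
import ast


def extract_structure(source: str) -> dict:
    """Collect imports, top-level functions, and classes with their methods."""
    tree = ast.parse(source)
    structure = {"imports": [], "functions": [], "classes": {}}
    for node in tree.body:
        if isinstance(node, ast.Import):
            structure["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have no module name.
            structure["imports"].append(node.module or ".")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            structure["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            structure["classes"][node.name] = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
    return structure


SAMPLE = '''
import os
from typing import List

def add(a: int, b: int) -> int:
    return a + b

class Stack:
    def push(self, item): ...
    def pop(self): ...
'''

print(extract_structure(SAMPLE))
```

A summary like this could be placed in the model prompt alongside the raw source, so that the generated documentation covers every function and class that was found.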

🎯 Use Cases

  • Legacy Codebase Documentation: Quickly document undocumented legacy code
  • Open Source Projects: Generate consistent documentation for contributors
  • API Documentation: Create professional API docs for microservices
  • Code Review Assistance: Auto-generate documentation for PR reviews
  • Educational Tools: Help students understand complex code examples
  • Compliance & Auditing: Generate audit-ready documentation

💻 Implementation Details

  • Architecture: Inherits from PartProcessor so that parts can be processed concurrently
  • Integration: Uses GenaiModel with proper streaming and error handling
  • Patterns: Follows all established GenAI Processors conventions
  • Error Handling: Graceful handling of syntax errors and encoding issues
  • Metadata: Comprehensive metadata tracking for debugging and analysis
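
One way the error handling and metadata tracking described above might look is sketched below. This is an assumption-laden illustration, not the PR's code: `safe_analyze`, the metadata keys, and the latin-1 fallback are all hypothetical choices.

```python
import ast


def safe_analyze(raw: bytes) -> dict:
    """Return metadata about a Python source file without raising on bad input."""
    metadata = {"encoding": "utf-8", "parse_error": None, "function_count": 0}
    try:
        source = raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this fallback cannot fail.
        source = raw.decode("latin-1")
        metadata["encoding"] = "latin-1"
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        # Record the problem instead of aborting the whole batch.
        metadata["parse_error"] = f"line {err.lineno}: {err.msg}"
        return metadata
    metadata["function_count"] = sum(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        for node in ast.walk(tree)
    )
    return metadata


print(safe_analyze(b"def f():\n    return 1\n"))
print(safe_analyze(b"def f(:\n"))
```

Recording the failure in metadata rather than raising lets a caller still send the raw text to the model and flag the output as based on unparsed input.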

📁 Files Added

  • contrib/code_documentation_processor.py - Main processor implementation
  • contrib/code_documentation_processor_example.py - Comprehensive usage examples
  • contrib/simple_test.py - Simple test script for quick validation
  • Updated contrib/__init__.py - Export new processor classes

🧪 Testing

# Set up environment
export GOOGLE_AI_API_KEY='your-api-key'

# Run simple test
cd contrib/
python simple_test.py

# Run comprehensive examples
python code_documentation_processor_example.py

📚 Usage Examples

Single File Documentation

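import asyncio
from contrib.code_documentation_processor import CodeDocumentationProcessor

async def document(code_part):
    # code_part: a ProcessorPart containing the source code to document
    processor = CodeDocumentationProcessor(api_key="your-key")
    async for doc in processor.call(code_part):
        print(doc.text)

# Run with: asyncio.run(document(code_part))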

Batch Processing

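from genai_processors import streams

async def document_all(code_parts):
    # to_processor() lets the PartProcessor consume a whole stream of parts
    processor = CodeDocumentationProcessor(api_key="your-key").to_processor()
    stream = streams.stream_content(code_parts)
    async for doc in processor(stream):
        print(doc.text)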
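
To show what concurrent batch handling means in general terms, the following sketch bounds concurrency with a semaphore and uses `asyncio.gather` to keep results in input order. It does not use the library or the PR's processor: `fake_document` is a hypothetical stand-in for a model call, and `max_concurrency` is an assumed parameter.

```python
import asyncio


async def fake_document(name: str, source: str) -> str:
    """Hypothetical stand-in for a model call; returns a one-line summary."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{name}: {len(source.splitlines())} lines"


async def document_batch(files: dict[str, str], max_concurrency: int = 4) -> list[str]:
    """Document every file concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(name: str, source: str) -> str:
        async with semaphore:
            return await fake_document(name, source)

    # gather returns results in input order, regardless of completion order.
    return await asyncio.gather(*(worker(n, s) for n, s in files.items()))


results = asyncio.run(document_batch({"a.py": "x = 1\ny = 2", "b.py": "pass"}))
print(results)
```

A bound on concurrency matters in practice because each in-flight file would correspond to one model request.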

📖 Generated Documentation Quality

The processor generates professional documentation including:

  • Overview and purpose descriptions
  • Complete function/method documentation with parameters and return values
  • Usage examples with expected outputs
  • Type hint analysis and documentation
  • Complexity analysis for algorithms
  • Table of contents for navigation
  • Cross-references and API documentation
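
For a sense of the output shape, a Markdown page with a table of contents could be assembled roughly as follows. The function `render_markdown` and its inputs are hypothetical; in the processor itself the text is produced by the model, so this only sketches the layout.

```python
def render_markdown(module: str, functions: dict[str, str]) -> str:
    """Build a Markdown page with a table of contents and one section per function."""
    lines = [f"# {module}", "", "## Table of Contents", ""]
    # Link each entry to its heading anchor (lowercased heading text).
    lines += [f"- [{name}](#{name.lower()})" for name in functions]
    for name, summary in functions.items():
        lines += ["", f"## {name}", "", summary]
    return "\n".join(lines)


page = render_markdown(
    "mathutils",
    {"add": "Return the sum.", "Scale": "Multiply by a factor."},
)
print(page)
```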

✅ Checklist

  • Follows GenAI Processors patterns and conventions
  • Implements proper error handling and edge cases
  • Includes comprehensive examples and documentation
  • Uses correct import patterns and module structure
  • Supports multiple programming languages
  • Provides configurable output formats
  • Includes test scripts for validation
  • Handles streaming AI responses correctly
  • Proper metadata management and tracking
  • Performance optimized for batch processing
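
Multi-language support typically starts with deciding which language a file is written in. A simple extension-based lookup is sketched here; the table and the name `detect_language` are assumptions for illustration, since the PR's actual detection logic is not shown in this description.

```python
from pathlib import Path

# Hypothetical extension table covering the languages named in this PR.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".java": "java",
    ".cpp": "cpp",
    ".go": "go",
    ".rs": "rust",
}


def detect_language(filename: str, default: str = "unknown") -> str:
    """Map a file name to a language label using its extension."""
    return EXTENSION_TO_LANGUAGE.get(Path(filename).suffix.lower(), default)


print(detect_language("src/Main.JAVA"))
print(detect_language("README"))
```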

🔍 Code Quality

  • Type Hints: Full type annotation coverage
  • Documentation: Comprehensive docstrings following Google style
  • Error Handling: Handles syntax errors and encoding issues without aborting
  • Performance: Optimized for concurrent processing
  • Maintainability: Clean, readable code structure
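
To make the link between type hints and Google-style docstrings concrete, the sketch below derives a docstring skeleton from a function's annotations with `ast.unparse` (Python 3.9+). The helper `google_docstring_skeleton` and its placeholder wording are hypothetical and not part of the PR.

```python
import ast


def google_docstring_skeleton(source: str, func_name: str) -> str:
    """Build a Google-style docstring skeleton from a function's annotations."""
    tree = ast.parse(source)
    func = next(
        node
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.name == func_name
    )
    lines = ["Summary line."]
    if func.args.args:
        lines += ["", "Args:"]
        for arg in func.args.args:
            # Fall back to "Any" when a parameter has no annotation.
            hint = ast.unparse(arg.annotation) if arg.annotation else "Any"
            lines.append(f"    {arg.arg} ({hint}): Description.")
    if func.returns is not None:
        lines += ["", "Returns:", f"    {ast.unparse(func.returns)}: Description."]
    return "\n".join(lines)


SOURCE = (
    "def scale(values: list[float], factor: float = 2.0) -> list[float]:\n"
    "    return [v * factor for v in values]\n"
)
print(google_docstring_skeleton(SOURCE, "scale"))
```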

🤝 Contribution Guidelines

This contribution follows all established guidelines:

  • Added to the contrib/ directory as specified
  • Maintains existing code style and patterns
  • Includes comprehensive documentation and examples
  • Provides value to the community with practical use cases

💡 Future Enhancements

Potential future improvements could include:

  • Integration with popular documentation generators (Sphinx, GitBook)
  • Custom template support for organization-specific documentation
  • Code quality metrics and analysis
  • Integration with version control systems for change tracking
  • Support for additional programming languages

Ready for Review

This processor has been thoroughly tested and provides immediate value for automatic code documentation generation. It follows all library conventions and includes comprehensive examples for easy adoption.


google-cla bot commented Jul 10, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

- Add CodeDocumentationProcessor as a new PartProcessor
- Supports multiple programming languages (Python, JS, Java, C++, Go, Rust)
- Generates documentation in multiple formats (Markdown, RST, HTML)
- Configurable docstring styles (Google, Sphinx, NumPy)
- AST-based code structure analysis
- Batch processing with concurrent execution
- Comprehensive examples and test suite

This processor automatically generates professional documentation
for code files using Gemini AI models.

OGsiji force-pushed the add-code-documentation-processor branch from cc7a1e0 to c95a6c0 on July 11, 2025 11:02

OGsiji commented Jul 16, 2025

@kibergus Could you please review this PR and let me know your feedback?
