refactor: improve tasks code structure and efficiency #280

Draft
wants to merge 17 commits into
base: release/2.1.1
from

Conversation

psyray
Contributor

@psyray psyray commented Feb 28, 2025

This pull request introduces several refactoring improvements to enhance code structure, efficiency, and security. Key changes include reorganizing utility functions, optimizing string building and file handling, and using parameterized commands for enhanced security. Additionally, Celery worker management and task execution have been streamlined for better performance and debugging. Minor updates to form handling and conditional checks further improve code clarity and maintainability.

To help track Celery tasks in debug mode.
This commit introduces significant improvements to scan task management and execution within reNgine-ng. Key changes include refactoring command execution for better streaming and error handling and optimizing Celery task queues for specific scan types. These changes aim to improve performance, reliability, and resource management during scans. Additionally, the commit refines logging and debugging capabilities, and streamlines certain code sections for better maintainability.
This change improves the directory scan logic. Discovered endpoints are now associated with the correct subdomain for more accurate results. Additionally, debugpy now waits for a client with a timeout to prevent indefinite blocking, and warnings are suppressed to reduce clutter in the logs.
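One way to get the timeout behaviour described above is to run the blocking wait in a daemon thread and give up after a deadline. This is only a sketch under assumptions: `wait_with_timeout` and the stand-in blocking call are hypothetical names, and in the real worker the blocking call would presumably be `debugpy.wait_for_client()`.

```python
import threading


def wait_with_timeout(blocking_wait, timeout_seconds):
    """Run blocking_wait in a daemon thread; return True if it finished in time."""
    finished = threading.Event()

    def runner():
        blocking_wait()
        finished.set()

    # A daemon thread does not keep the process alive if the wait never returns.
    threading.Thread(target=runner, daemon=True).start()
    return finished.wait(timeout_seconds)


# Stand-in for a debugger client that never attaches.
never_attached = threading.Event()
attached = wait_with_timeout(never_attached.wait, timeout_seconds=0.1)
print("debugger attached" if attached else "no debugger, continuing")
```

The worker keeps starting even when nobody connects a debugger, which is the point of the timeout.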
This commit introduces several refactoring improvements to enhance code structure, logging, and functionality:

- Modularization: Common functions related to database operations, logging, and utilities are moved to dedicated modules within reNgine.utils. This improves code organization and maintainability.
- Logging Enhancement: The logging system is improved by introducing a custom Logger class for better control and formatting of log messages.
- Command Execution: The run_command function is replaced with run_command_line, which provides better handling of command execution.
- Code Cleanup: Several code sections are simplified and cleaned up for better readability and efficiency. For example, string concatenation in loops is replaced with more efficient join operations.
- Bug Fixes: Minor bug fixes and improvements are included, such as handling missing keys in dictionaries and correcting conditional checks.
- Testing Improvements: Adjusted tests to reflect the refactoring changes and ensure continued functionality.
- API Changes: Minor changes to API endpoints and serializers for consistency and clarity. Removed unused serializers and updated API documentation.
- UI/UX Improvements: Minor updates to UI elements and forms for better user experience. Simplified form handling and improved error messages.
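To illustrate the Code Cleanup and Bug Fixes items above, here is a minimal sketch of the two patterns they mention. The function names and the `severity` key are hypothetical and not taken from the PR.

```python
def build_report(lines):
    # A single join replaces repeated `+=` concatenation inside a loop.
    return "\n".join(lines)


def get_severity(finding):
    # dict.get returns a default instead of raising KeyError on a missing key.
    return finding.get("severity", "unknown")


print(build_report(["a.example.com", "b.example.com"]))
print(get_severity({"name": "open redirect"}))
```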
Improved the handling of task configurations by allowing YAML configuration to be passed as a string or dictionary, and refactored custom header handling for better organization and flexibility. Simplified some error handling and debug logging. Updated several tasks to use the new configuration methods. Minor updates were also made to amass and OneForAll command building and shell usage in port scanning.
Updated config retrieval throughout the codebase to use the more versatile get() method instead of get_value(). Additionally, minor improvements were made to directory/file fuzzing, S3 scanner configuration, and debug functionality.
This commit enhances logging messages with descriptive emojis and prefixes, improves error handling in task execution, and fixes a bug in CMS detection. Additionally, it removes unnecessary logging of available scan engines and cleans up temporary files more reliably. Finally, it ensures that input files exist before processing.
@psyray psyray added the bug Something isn't working label Feb 28, 2025
@psyray psyray self-assigned this Feb 28, 2025
@github-advanced-security github-advanced-security bot left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Contributor

sourcery-ai bot commented Feb 28, 2025

Reviewer's Guide by Sourcery

This pull request refactors the codebase to improve code structure, efficiency, and security. It includes reorganizing utility functions, optimizing string building and file handling, using parameterized commands for enhanced security, and streamlining Celery worker management and task execution. Minor updates to form handling and conditional checks further improve code clarity and maintainability.

Sequence diagram for deleting a scan

sequenceDiagram
    participant User
    participant View
    participant ScanHistory
    participant CommandBuilder
    participant CeleryTask

    User->>View: Initiates delete scan
    View->>ScanHistory: Retrieves ScanHistory object
    View->>CommandBuilder: Creates CommandBuilder with 'rm'
    CommandBuilder->>CommandBuilder: Adds '-rf' option with directory
    View->>CeleryTask: run_command_line(command)
    CeleryTask->>ScanHistory: Deletes ScanHistory object
    ScanHistory-->>View: Success message
    View-->>User: Displays success message

Updated class diagram for CommandBuilder

classDiagram
    class CommandBuilder {
        -command: string
        -options: list
        +CommandBuilder(command: string)
        +add_option(option: string, value: string, condition: bool)
        +add_raw_option(option: string)
        +add_pipe_command(pipe_command: string)
        +add_redirection(symbol: string, file: string)
        +build_list(): list
        +build_string(): string
    }

    note for CommandBuilder "This class is used to build commands in a secure way, avoiding shell injection vulnerabilities."
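
For readers who want something concrete, the class diagram could correspond to a builder roughly like the one below. This is a reduced sketch and not the code in web/reNgine/utils/command_builder.py: it covers only `add_option`, `build_list` and `build_string`, and the method chaining and default arguments are assumptions.

```python
import shlex


class CommandBuilder:
    """Collects a command and its options as separate argv entries."""

    def __init__(self, command):
        self.command = command
        self.options = []

    def add_option(self, option, value=None, condition=True):
        # The option is skipped entirely when condition is false.
        if condition:
            self.options.append(option)
            if value is not None:
                self.options.append(str(value))
        return self  # chaining is an assumption of this sketch

    def build_list(self):
        # Suitable for subprocess calls that do not go through a shell.
        return [self.command, *self.options]

    def build_string(self):
        # shlex.join quotes each entry for display or shell use.
        return shlex.join(self.build_list())


builder = CommandBuilder("rm").add_option("-rf", "/tmp/scan results; id")
print(builder.build_list())
print(builder.build_string())
```

Because the directory stays a single list entry, the `; id` suffix is never interpreted as a second command.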

File-Level Changes

Change Details Files
Refactored utility functions and database interactions for improved code organization and maintainability.
  • Moved utility functions related to database operations to reNgine.utils.db.
  • Moved logging functionality to reNgine.utils.logger.
  • Moved general utility functions to reNgine.utils.utils.
  • Introduced reNgine.utils.command_builder for constructing commands.
  • Replaced direct command execution with run_command_line from reNgine.tasks.command.
web/startScan/views.py
web/reNgine/celery_custom_task.py
web/reNgine/settings.py
web/api/views.py
web/targetApp/views.py
web/api/serializers.py
Enhanced security by using parameterized commands instead of direct string formatting to prevent command injection vulnerabilities.
  • Replaced string formatting with CommandBuilder in delete_scan view.
  • Replaced string formatting with CommandBuilder in get_and_save_dork_results function.
  • Replaced string formatting with CommandBuilder in build_httpx_command function.
  • Replaced string formatting with CommandBuilder in build_url_fetch_commands function.
  • Replaced string formatting with CommandBuilder in build_amass_passive_command function.
  • Replaced string formatting with CommandBuilder in build_amass_active_command function.
  • Replaced string formatting with CommandBuilder in build_sublist3r_command function.
  • Replaced string formatting with CommandBuilder in build_subfinder_command function.
  • Replaced string formatting with CommandBuilder in build_tlsx_command function.
  • Replaced string formatting with CommandBuilder in build_netlas_command function.
  • Replaced string formatting with CommandBuilder in get_nmap_cmd function.
web/startScan/views.py
web/reNgine/utils/db.py
web/reNgine/utils/http.py
web/reNgine/utils/subdomain_tools.py
web/reNgine/utils/command_builder.py
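
The security benefit of parameterized commands comes from passing an argument list to the process instead of a formatted shell string. A minimal sketch, where `run_with_argv` is a hypothetical helper and the child process is just a Python one-liner that echoes its first argument:

```python
import subprocess
import sys


def run_with_argv(user_value):
    # No shell is involved: user_value reaches the child as one argv entry,
    # so shell metacharacters in it are not interpreted.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(run_with_argv("example.com; id"))
```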
Optimized string building and file handling for improved efficiency.
  • Replaced manual string concatenation with join for building the response_body in export_endpoints view.
  • Used contextlib.suppress to handle KeyError exceptions in api_vault_delete view.
  • Used with open(...) to ensure proper file handling in add_wordlist view.
  • Used f-strings for string formatting in add_engine view.
  • Used f-strings for string formatting in delete_target view.
web/startScan/views.py
web/scanEngine/views.py
web/reNgine/utils/db.py
web/targetApp/views.py
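
The `contextlib.suppress` and `with open(...)` changes listed above follow standard library patterns. The sketch below shows both in isolation; `remove_key` and `count_lines` are hypothetical helpers, not the views named in the table.

```python
import contextlib
import os
import tempfile


def remove_key(store, key):
    # Equivalent to try/except KeyError: pass, but shorter.
    with contextlib.suppress(KeyError):
        del store[key]


def count_lines(path):
    # The context manager closes the file even if reading raises.
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


vault = {"netlas": "key"}
remove_key(vault, "netlas")
remove_key(vault, "missing")  # no exception

with tempfile.TemporaryDirectory() as folder:
    wordlist = os.path.join(folder, "wordlist.txt")
    with open(wordlist, "w", encoding="utf-8") as handle:
        handle.write("admin\nlogin\n")
    line_count = count_lines(wordlist)

print(vault, line_count)
```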
Streamlined Celery worker management and task execution for better performance and debugging.
  • Updated Celery configuration to use multiple queues and workers.
  • Added CELERY_DEBUG and CELERY_REMOTE_DEBUG environment variables for debugging.
  • Modified entrypoint.sh to start multiple Celery workers with different configurations.
  • Added RengineTaskFormatter to improve Celery logging.
  • Added get_from_cache to improve Celery task performance.
docker/celery/entrypoint.sh
web/reNgine/celery_custom_task.py
web/reNgine/settings.py
web/reNgine/celery.py
docker/celery/Dockerfile
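
As a rough picture of what routing tasks to multiple queues can look like, the dictionary below has the shape Celery accepts for its `task_routes` setting. Only the `io_queue` name for `nuclei_scan` appears in this PR; the other task and queue names and the `queue_for` helper are hypothetical.

```python
# Shape accepted by Celery's task_routes setting: task name -> route options.
CELERY_TASK_ROUTES = {
    "nuclei_scan": {"queue": "io_queue"},  # queue name mentioned in this PR
    "port_scan": {"queue": "cpu_queue"},   # hypothetical
}
DEFAULT_QUEUE = "main_queue"  # hypothetical


def queue_for(task_name):
    # Illustrative lookup only; Celery resolves routes itself at runtime.
    return CELERY_TASK_ROUTES.get(task_name, {}).get("queue", DEFAULT_QUEUE)


print(queue_for("nuclei_scan"), queue_for("some_other_task"))
```

Each queue can then be served by its own worker with its own concurrency settings, which is what the entrypoint.sh change appears to set up.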
Improved code clarity and maintainability through minor updates to form handling and conditional checks.
  • Simplified conditional checks in start_multiple_scan view by using in operator.
  • Reordered conditional checks in stop_scan view to improve readability.
  • Simplified conditional checks in create_report view by using in operator.
  • Removed unused imports and variables.
web/startScan/views.py
web/scanEngine/views.py
web/reNgine/celery_custom_task.py
web/api/views.py
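
The `in` operator simplification mentioned above usually amounts to the following. The status values and function name are hypothetical and only show the shape of the change.

```python
FINISHED_STATUSES = {"aborted", "failed", "success"}


def is_finished(status):
    # Before: status == "aborted" or status == "failed" or status == "success"
    return status in FINISHED_STATUSES


print(is_finished("failed"), is_finished("running"))
```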

psyray added 3 commits March 1, 2025 04:33
This change enhances the vulnerability scanning pipeline with improved logging, more informative messages, and better task management. It also refactors command execution for increased security and flexibility, and updates the task queue for nuclei_scan to io_queue. Additionally, it includes minor updates to port scanning and task dependencies.
This change introduces a new feature to visualize the scan workflow. It generates a text-based representation of the engine's scan process, including task dependencies and parallelism, using emojis and a tree-like structure. This visualization is displayed in the logs when a scan is initiated. Additionally, minor logging improvements were made in other tasks for better clarity.
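A text-based workflow view of this kind can be sketched as follows. The stage structure, the task names and `render_workflow` are assumptions made for illustration, and the emojis used in the actual logs are left out.

```python
def render_workflow(stages):
    """stages is a list of task-name lists; tasks in one stage run in parallel."""
    lines = ["Scan workflow"]
    for index, tasks in enumerate(stages):
        last_stage = index == len(stages) - 1
        branch = "└── " if last_stage else "├── "
        label = "parallel" if len(tasks) > 1 else "sequential"
        lines.append(f"{branch}Stage {index + 1} ({label})")
        # Keep the vertical bar only while more stages follow.
        indent = "    " if last_stage else "│   "
        for position, task in enumerate(tasks):
            leaf = "└── " if position == len(tasks) - 1 else "├── "
            lines.append(f"{indent}{leaf}{task}")
    return "\n".join(lines)


tree = render_workflow([["subdomain_discovery"], ["port_scan", "fetch_url"]])
print(tree)
```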
This change enhances the command execution module with improved JSON handling, streaming output support, and more robust dry run capabilities. It also refines the command builder for greater flexibility and updates task logging for better clarity. Additionally, it fixes a model import issue in the notification utility and ensures unsupported subdomain tools are handled gracefully.
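Streaming output with JSON handling typically means reading the child process line by line and parsing each line as it arrives. The sketch below assumes a tool that prints one JSON object per line mixed with plain-text output; `stream_json_lines` is a hypothetical name and the child process is a Python one-off standing in for a real scanner.

```python
import json
import subprocess
import sys


def stream_json_lines(argv):
    """Yield one parsed object per JSON line, skipping non-JSON output."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as process:
        for line in process.stdout:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue  # banner or progress text, not a result


# Stand-in for a scanner: two JSON results followed by a plain-text line.
emitter = (
    "import json\n"
    "for host in ('a.example.com', 'b.example.com'):\n"
    "    print(json.dumps({'host': host}))\n"
    "print('scan finished')\n"
)
hosts = [item["host"] for item in stream_json_lines([sys.executable, "-c", emitter])]
print(hosts)
```

Results can be saved as soon as each line is parsed instead of waiting for the tool to exit.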
@psyray psyray changed the title from "refactor: improve code structure and efficiency" to "refactor: improve tasks code structure and efficiency" Mar 2, 2025
psyray added 5 commits March 3, 2025 20:11
This commit refactors the command building process for various security tools, introduces a new CommandBuilder class, and enhances the mocking utilities for dry run testing. The changes improve code organization, security, and testability. Specifically, the command building logic is now more centralized and uses a safer approach for constructing commands, reducing the risk of command injection vulnerabilities. The mocking utilities are expanded to cover more tools and provide more realistic mock data, improving the effectiveness of dry run testing. Additionally, some unused imports and minor code style issues are addressed.
This commit introduces several improvements to logging, security, and dependencies:
- Enhanced Logging: Improved color handling in logs, making them more readable and informative. Logs now include task names and IDs for better tracking.
- Security Enhancements: The Netlas API key is now handled more securely using ephemeral environment variables.
- Dependency Updates: Updated Python to 3.10.16 and several Python packages to their latest versions. Flower is now installed with Poetry, and colorama was added.
- Standardized JSON Serialization: Implemented a utility function for consistent JSON serialization across the project.
- Minor Refactoring: Updated YAML configuration and default scan engine settings for consistency and clarity. Improved handling of null values in JSON serialization. Simplified command building with a new set_env function. Corrected a few minor issues in scan and target views.
- Dockerfile Improvements: Updated the Python installation process in the Dockerfile for efficiency. Removed the flower dependency installation via pipx.
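The "ephemeral environment variables" idea from the Security Enhancements item can be illustrated by passing the secret only to the child process. This is a sketch under assumptions: `run_with_secret` and the `EXAMPLE_API_KEY` variable name are hypothetical and do not reflect the PR's `set_env` function.

```python
import os
import subprocess
import sys


def run_with_secret(argv, name, value):
    # The secret lives in a copy of the environment handed to the child only;
    # it never appears on the command line or in the parent's os.environ.
    child_env = {**os.environ, name: value}
    return subprocess.run(
        argv, env=child_env, capture_output=True, text=True, check=True
    )


check = 'import os; print(os.environ.get("EXAMPLE_API_KEY") == "s3cret")'
result = run_with_secret([sys.executable, "-c", check], "EXAMPLE_API_KEY", "s3cret")
print(result.stdout.strip())
```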
This commit refactors the logging system and task categorization within the application. The changes improve code organization, readability, and provide more context in log messages. Specifically, the ANSI color codes are moved to a dedicated Colors class, and task logging now includes color-coded task categories for better visual distinction. Additionally, several log messages have been adjusted to provide more relevant information and use more appropriate log levels. Finally, the docker-compose file is updated to improve container behavior.
This commit introduces a DRY_RUN mode to reNgine-ng, allowing users to simulate scans and task execution without making actual changes or sending real requests. This is achieved by generating mock data for various tasks, enabling users to test workflows and configurations safely. The implementation includes a new get_mock_for_task function in utils/mock.py to handle mock data creation for different tasks, and modifications to the RengineTask in celery_custom_task.py to manage DRY_RUN mode execution. Several utility functions for generating mock data for specific tasks like subdomain discovery, URL fetching, OSINT, screenshots, WAF detection, directory fuzzing, and vulnerability scanning have also been added. Additionally, the nmap and scan_http_ports tasks have been updated to support DRY_RUN mode and use the new mock data. Finally, the delete_scan and delete_all_screenshots views have been improved to handle cases where results directories do not exist.
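The DRY_RUN idea can be sketched with a small decorator. The PR implements this inside RengineTask with get_mock_for_task from utils/mock.py; the decorator, the mock table, the function signature and the way the mode is toggled through an environment variable below are all assumptions made for illustration.

```python
import functools
import os

# Hypothetical mock table; the PR generates richer mock data per task.
MOCKS = {"subdomain_discovery": lambda: ["www.example.com", "api.example.com"]}


def get_mock_for_task(task_name):
    return MOCKS[task_name]()


def dry_runnable(task_name):
    """Return mock data instead of running the task when DRY_RUN=1."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if os.environ.get("DRY_RUN") == "1":
                return get_mock_for_task(task_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@dry_runnable("subdomain_discovery")
def subdomain_discovery(domain):
    raise RuntimeError("real scan not available in this sketch")


os.environ["DRY_RUN"] = "1"
mocked = subdomain_discovery("example.com")
os.environ.pop("DRY_RUN")
print(mocked)
```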