refactor: improve tasks code structure and efficiency #280

psyray · 2025-02-28T17:47:38Z

This pull request introduces several refactoring improvements to enhance code structure, efficiency, and security. Key changes include reorganizing utility functions, optimizing string building and file handling, and using parameterized commands for enhanced security. Additionally, Celery worker management and task execution have been streamlined for better performance and debugging. Minor updates to form handling and conditional checks further improve code clarity and maintainability.

To help track Celery tasks in debug mode

This commit introduces significant improvements to scan task management and execution within reNgine-ng. Key changes include refactoring command execution for better streaming and error handling, and optimizing Celery task queues for specific scan types. These changes aim to improve performance, reliability, and resource management during scans. Additionally, the commit refines logging and debugging capabilities, and streamlines certain code sections for better maintainability.

This change improves the logic of the directory scan. The update associates discovered endpoints with the correct subdomain for more accurate results. Additionally, debugpy now waits for a client with a timeout to prevent indefinite blocking. Warnings are also now suppressed to reduce clutter in the logs.
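One way to bound a blocking call such as `debugpy.wait_for_client()` is to run it in a daemon thread and wait on an event. The sketch below is only an illustration of that idea: the helper name `wait_with_timeout` is hypothetical, and `time.sleep` stands in for the debugger call so the example runs without debugpy installed. The PR's actual implementation is not shown in this conversation.

```python
import threading
import time


def wait_with_timeout(blocking_call, timeout_seconds):
    """Run blocking_call in a daemon thread; return True if it finished in time."""
    finished = threading.Event()

    def runner():
        blocking_call()
        finished.set()

    # Daemon thread: it will not keep the process alive if the call never returns.
    threading.Thread(target=runner, daemon=True).start()
    return finished.wait(timeout_seconds)


# Stand-ins for a debugger that attaches quickly and one that never attaches.
attached = wait_with_timeout(lambda: time.sleep(0.01), timeout_seconds=1.0)
timed_out = wait_with_timeout(lambda: time.sleep(5), timeout_seconds=0.05)
print(attached, timed_out)
```

In a worker, the blocking call would presumably be the debugger's wait function, and a `False` result would let startup continue without a client.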

This commit introduces several refactoring improvements to enhance code structure, logging, and functionality:

- Modularization: Common functions related to database operations, logging, and utilities are moved to dedicated modules within reNgine.utils. This improves code organization and maintainability.
- Logging Enhancement: The logging system is improved by introducing a custom Logger class for better control and formatting of log messages.
- Command Execution: The run_command function is replaced with run_command_line, which provides better handling of command execution.
- Code Cleanup: Several code sections are simplified and cleaned up for better readability and efficiency. For example, string concatenation in loops is replaced with more efficient join operations.
- Bug Fixes: Minor bug fixes and improvements are included, such as handling missing keys in dictionaries and correcting conditional checks.
- Testing Improvements: Adjusted tests to reflect the refactoring changes and ensure continued functionality.
- API Changes: Minor changes to API endpoints and serializers for consistency and clarity. Removed unused serializers and updated API documentation.
- UI/UX Improvements: Minor updates to UI elements and forms for better user experience. Simplified form handling and improved error messages.
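The join cleanup mentioned in the list above can be illustrated with a small before/after pair. Both function names are hypothetical and are not taken from the reNgine-ng codebase; the point is only that the two forms produce the same string.

```python
def build_body_concat(urls):
    """Before: repeated concatenation creates a new string on every iteration."""
    body = ""
    for url in urls:
        body += url + "\n"
    return body


def build_body_join(urls):
    """After: a single join over a generator builds the string in one pass."""
    return "".join(f"{url}\n" for url in urls)


urls = ["https://example.com/a", "https://example.com/b"]
print(build_body_concat(urls) == build_body_join(urls))
```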

Improved the handling of task configurations by allowing YAML configuration to be passed as a string or dictionary, and refactored custom header handling for better organization and flexibility. Simplified some error handling and debug logging. Updated several tasks to use the new configuration methods. Minor updates were also made to amass and OneForAll command building and shell usage in port scanning.

Updated config retrieval throughout the codebase to use the more versatile get() method instead of get_value(). Additionally, minor improvements were made to directory/file fuzzing, S3 scanner configuration, and debug functionality.

This commit enhances logging messages with descriptive emojis and prefixes, improves error handling in task execution, and fixes a bug in CMS detection. Additionally, it removes unnecessary logging of available scan engines and cleans up temporary files more reliably. Finally, it ensures that input files exist before processing.

github-advanced-security

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

sourcery-ai · 2025-02-28T17:47:42Z

Reviewer's Guide by Sourcery

This pull request refactors the codebase to improve code structure, efficiency, and security. It includes reorganizing utility functions, optimizing string building and file handling, using parameterized commands for enhanced security, and streamlining Celery worker management and task execution. Minor updates to form handling and conditional checks further improve code clarity and maintainability.

Sequence diagram for deleting a scan

sequenceDiagram
    participant User
    participant View
    participant ScanHistory
    participant CommandBuilder
    participant CeleryTask

    User->>View: Initiates delete scan
    View->>ScanHistory: Retrieves ScanHistory object
    View->>CommandBuilder: Creates CommandBuilder with 'rm'
    CommandBuilder->>CommandBuilder: Adds '-rf' option with directory
    View->>CeleryTask: run_command_line(command)
    CeleryTask->>ScanHistory: Deletes ScanHistory object
    ScanHistory-->>View: Success message
    View-->>User: Displays success message

Updated class diagram for CommandBuilder

classDiagram
    class CommandBuilder {
        -command: string
        -options: list
        +CommandBuilder(command: string)
        +add_option(option: string, value: string, condition: bool)
        +add_raw_option(option: string)
        +add_pipe_command(pipe_command: string)
        +add_redirection(symbol: string, file: string)
        +build_list(): list
        +build_string(): string
    }

    note for CommandBuilder "This class is used to build commands in a secure way, avoiding shell injection vulnerabilities."
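
To make the diagram concrete, here is a minimal sketch of what such a builder might look like. It is reconstructed only from the method names in the class diagram and is not the code from `reNgine.utils.command_builder`. The default arguments and the chaining via `return self` are assumptions, and the pipe and redirection methods are omitted.

```python
import shlex


class CommandBuilder:
    """Hypothetical sketch based only on the class diagram above."""

    def __init__(self, command):
        self.command = command
        self.options = []

    def add_option(self, option, value=None, condition=True):
        # Skip the option entirely when the condition is false.
        if condition:
            self.options.append(option)
            if value is not None:
                self.options.append(str(value))
        return self

    def add_raw_option(self, option):
        self.options.append(option)
        return self

    def build_list(self):
        # Argument list suitable for subprocess.run(..., shell=False).
        return [self.command, *self.options]

    def build_string(self):
        # Quote each argument so the string is safe to log or hand to a shell.
        return shlex.join(self.build_list())


# A directory name containing shell metacharacters stays a single argument.
cmd = CommandBuilder("rm").add_option("-rf", "/tmp/scan results; echo pwned")
print(cmd.build_list())
print(cmd.build_string())
```

Because the value is kept as one list element, a shell never gets the chance to interpret the `;` as a command separator, which is the injection risk the note refers to.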

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Refactored utility functions and database interactions for improved code organization and maintainability. | Moved utility functions related to database operations to `reNgine.utils.db`. Moved logging functionality to `reNgine.utils.logger`. Moved general utility functions to `reNgine.utils.utils`. Introduced `reNgine.utils.command_builder` for constructing commands. Replaced direct command execution with `run_command_line` from `reNgine.tasks.command`. | `web/startScan/views.py`, `web/reNgine/celery_custom_task.py`, `web/reNgine/settings.py`, `web/api/views.py`, `web/targetApp/views.py`, `web/api/serializers.py` |
| Enhanced security by using parameterized commands instead of direct string formatting to prevent command injection vulnerabilities. | Replaced string formatting with `CommandBuilder` in the `delete_scan` view and in the functions `get_and_save_dork_results`, `build_httpx_command`, `build_url_fetch_commands`, `build_amass_passive_command`, `build_amass_active_command`, `build_sublist3r_command`, `build_subfinder_command`, `build_tlsx_command`, `build_netlas_command`, and `get_nmap_cmd`. | `web/startScan/views.py`, `web/reNgine/utils/db.py`, `web/reNgine/utils/http.py`, `web/reNgine/utils/subdomain_tools.py`, `web/reNgine/utils/command_builder.py` |
| Optimized string building and file handling for improved efficiency. | Replaced manual string concatenation with `join` for building the `response_body` in the `export_endpoints` view. Used `contextlib.suppress` to handle `KeyError` exceptions in the `api_vault_delete` view. Used `with open(...)` to ensure proper file handling in the `add_wordlist` view. Used f-strings for string formatting in the `add_engine` and `delete_target` views. | `web/startScan/views.py`, `web/scanEngine/views.py`, `web/reNgine/utils/db.py`, `web/targetApp/views.py` |
| Streamlined Celery worker management and task execution for better performance and debugging. | Updated Celery configuration to use multiple queues and workers. Added `CELERY_DEBUG` and `CELERY_REMOTE_DEBUG` environment variables for debugging. Modified `entrypoint.sh` to start multiple Celery workers with different configurations. Added `RengineTaskFormatter` to improve Celery logging. Added `get_from_cache` to improve Celery task performance. | `docker/celery/entrypoint.sh`, `web/reNgine/celery_custom_task.py`, `web/reNgine/settings.py`, `web/reNgine/celery.py`, `docker/celery/Dockerfile` |
| Improved code clarity and maintainability through minor updates to form handling and conditional checks. | Simplified conditional checks in the `start_multiple_scan` and `create_report` views by using the `in` operator. Reordered conditional checks in the `stop_scan` view to improve readability. Removed unused imports and variables. | `web/startScan/views.py`, `web/scanEngine/views.py`, `web/reNgine/celery_custom_task.py`, `web/api/views.py` |
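
Two of the idioms listed in the table, `contextlib.suppress` and membership tests with `in`, can be sketched as follows. The function names and the report type values are hypothetical and chosen only for illustration; they are not taken from the views named above.

```python
from contextlib import suppress


def remove_key(store, key):
    """Delete key from store, ignoring the case where it is already absent."""
    # Equivalent to try/except KeyError: pass, but states the intent in one line.
    with suppress(KeyError):
        del store[key]


def is_known_report_type(report_type):
    """One membership test instead of a chain of == comparisons joined by 'or'."""
    return report_type in ("full", "recon", "vulnerability")


vault = {"netlas": "dummy-key"}
remove_key(vault, "netlas")
remove_key(vault, "missing")  # no exception raised
print(vault, is_known_report_type("recon"), is_known_report_type("other"))
```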


This change enhances the vulnerability scanning pipeline with improved logging, more informative messages, and better task management. It also refactors command execution for increased security and flexibility, and updates the task queue for nuclei_scan to io_queue. Additionally, it includes minor updates to port scanning and task dependencies.

This change introduces a new feature to visualize the scan workflow. It generates a text-based representation of the engine's scan process, including task dependencies and parallelism, using emojis and a tree-like structure. This visualization is displayed in the logs when a scan is initiated. Additionally, minor logging improvements were made in other tasks for better clarity.
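
A text tree of this kind can be produced with a short recursive function. The sketch below is an assumption about the general approach, not the PR's code: the function name `render_workflow`, the nested-dict input shape, and the task names are all illustrative, and the emojis mentioned above are left out.

```python
def render_workflow(tree, prefix=""):
    """Return tree lines for a dict mapping each task to the tasks that follow it."""
    lines = []
    items = list(tree.items())
    for index, (task, children) in enumerate(items):
        last = index == len(items) - 1
        lines.append(f"{prefix}{'└── ' if last else '├── '}{task}")
        # Children are indented under their parent; a bar marks open siblings.
        lines.extend(render_workflow(children, prefix + ("    " if last else "│   ")))
    return lines


workflow = {
    "subdomain_discovery": {
        "port_scan": {},
        "fetch_url": {"vulnerability_scan": {}},
    }
}
print("\n".join(render_workflow(workflow)))
```

Sibling entries under the same parent (here `port_scan` and `fetch_url`) are the ones that could run in parallel in such a representation.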

This change enhances the command execution module with improved JSON handling, streaming output support, and more robust dry run capabilities. It also refines the command builder for greater flexibility and updates task logging for better clarity. Additionally, it fixes a model import issue in the notification utility and ensures unsupported subdomain tools are handled gracefully.
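
As an illustration of streaming output combined with JSON handling, the following sketch reads a child process line by line and decodes each line as JSON when possible. The name `stream_json_lines` is hypothetical and this is not the `run_command_line` implementation; the child command is a small Python one-liner so that the example is self-contained.

```python
import json
import subprocess
import sys


def stream_json_lines(args):
    """Yield each stdout line as it arrives, decoded as JSON when possible."""
    with subprocess.Popen(args, stdout=subprocess.PIPE, text=True) as process:
        for line in process.stdout:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # Not JSON: pass the raw text through instead of failing.
                yield line
    # Leaving the with block waits for the child, so returncode is set here.
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, args)


demo = [sys.executable, "-c", "print('{\"host\": \"example.com\"}'); print('done')"]
for item in stream_json_lines(demo):
    print(type(item).__name__, item)
```

Passing the command as a list keeps it compatible with an argument-list builder and avoids `shell=True`.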

This commit refactors the command building process for various security tools, introduces a new CommandBuilder class, and enhances the mocking utilities for dry run testing. The changes improve code organization, security, and testability. Specifically, the command building logic is now more centralized and uses a safer approach for constructing commands, reducing the risk of command injection vulnerabilities. The mocking utilities are expanded to cover more tools and provide more realistic mock data, improving the effectiveness of dry run testing. Additionally, some unused imports and minor code style issues are addressed.

This commit introduces several improvements to logging, security, and dependencies:

- Enhanced Logging: Improved color handling in logs, making them more readable and informative. Logs now include task names and IDs for better tracking.
- Security Enhancements: The Netlas API key is now handled more securely using ephemeral environment variables.
- Dependency Updates: Updated Python to 3.10.16 and several Python packages to their latest versions. Installed flower with Poetry and added colorama.
- Standardized JSON Serialization: Implemented a utility function for consistent JSON serialization across the project.
- Minor Refactoring: Updated YAML configuration and default scan engine settings for consistency and clarity. Improved handling of null values in JSON serialization. Simplified command building with a new set_env function. Corrected a few minor issues in scan and target views.
- Dockerfile Improvements: Updated the Python installation process in the Dockerfile for efficiency. Removed the flower dependency installation via pipx.
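
The idea of an ephemeral environment variable can be sketched with `subprocess.run` and its `env` argument: the secret is placed in the child's environment only, so it does not appear in the command line or in the parent's environment. The helper name `run_with_secret` and the variable name `NETLAS_API_KEY` are assumptions made for this example, not details confirmed by the commit.

```python
import os
import subprocess
import sys


def run_with_secret(args, name, value):
    """Run args with one extra environment variable visible only to the child."""
    child_env = {**os.environ, name: value}  # copy; os.environ is not modified
    return subprocess.run(
        args, env=child_env, capture_output=True, text=True, check=True
    )


# The child reads the key from its environment; it is never an argument.
result = run_with_secret(
    [sys.executable, "-c", "import os; print(os.environ['NETLAS_API_KEY'])"],
    "NETLAS_API_KEY",
    "dummy-key",
)
print(result.stdout.strip())
```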

This commit refactors the logging system and task categorization within the application. The changes improve code organization, readability, and provide more context in log messages. Specifically, the ANSI color codes are moved to a dedicated Colors class, and task logging now includes color-coded task categories for better visual distinction. Additionally, several log messages have been adjusted to provide more relevant information and use more appropriate log levels. Finally, the docker-compose file is updated to improve container behavior.
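
A dedicated class for ANSI codes and a color-coded task category might look roughly like this. Only the class name `Colors` comes from the commit message; the attribute names, the category names, and the `format_task` helper are illustrative assumptions.

```python
class Colors:
    """Standard ANSI escape sequences; attribute names are illustrative."""

    RESET = "\033[0m"
    YELLOW = "\033[33m"
    CYAN = "\033[36m"


# Hypothetical mapping from task category to color.
CATEGORY_COLORS = {"scan": Colors.CYAN, "io": Colors.YELLOW}


def format_task(category, task_name, task_id):
    """Prefix a log line with a colored category tag, then the task name and ID."""
    color = CATEGORY_COLORS.get(category, Colors.RESET)
    return f"{color}[{category}]{Colors.RESET} {task_name} ({task_id})"


print(format_task("scan", "nmap", "abc123"))
```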

This commit introduces a DRY_RUN mode to reNgine-ng, allowing users to simulate scans and task execution without making actual changes or sending real requests. This is achieved by generating mock data for various tasks, enabling users to test workflows and configurations safely. The implementation includes a new get_mock_for_task function in utils/mock.py to handle mock data creation for different tasks, and modifications to the RengineTask in celery_custom_task.py to manage DRY_RUN mode execution. Several utility functions for generating mock data for specific tasks like subdomain discovery, URL fetching, OSINT, screenshots, WAF detection, directory fuzzing, and vulnerability scanning have also been added. Additionally, the nmap and scan_http_ports tasks have been updated to support DRY_RUN mode and use the new mock data. Finally, the delete_scan and delete_all_screenshots views have been improved to handle cases where results directories do not exist.
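
The dispatch described above can be sketched as a lookup of mock generators keyed by task name. The commit names `get_mock_for_task`, but its real signature and return values are not shown here, so the signature, the mock data, and the `run_task` wrapper below are all assumptions.

```python
# Hypothetical mock generators keyed by task name.
MOCK_GENERATORS = {
    "subdomain_discovery": lambda target: [f"www.{target}", f"api.{target}"],
    "port_scan": lambda target: [{"host": target, "port": 443}],
}


def get_mock_for_task(task_name, target):
    """Return canned data for a task; the signature is assumed, not the PR's."""
    generator = MOCK_GENERATORS.get(task_name)
    if generator is None:
        raise KeyError(f"no mock data defined for task {task_name!r}")
    return generator(target)


def run_task(task_name, target, real_impl, dry_run):
    """Call the real implementation unless dry_run is set."""
    if dry_run:
        return get_mock_for_task(task_name, target)
    return real_impl(target)


def real_subdomain_discovery(target):
    raise RuntimeError("would send real requests")


print(run_task("subdomain_discovery", "example.com", real_subdomain_discovery, dry_run=True))
```

With `dry_run=True` the real function is never called, which is what allows workflows to be exercised without sending traffic.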

psyray added 9 commits February 22, 2025 17:36

fix: add celery flower & refactor celery worker launch

4a1f830

To help track Celery tasks in debug mode

fix: add http_crawl to running tasks exception list

dfe2375

psyray added the bug (Something isn't working) label Feb 28, 2025

psyray self-assigned this Feb 28, 2025

github-advanced-security bot found potential problems Feb 28, 2025

psyray added 3 commits March 1, 2025 04:33

psyray changed the title ~~refactor: improve code structure and efficiency~~ refactor: improve tasks code structure and efficiency Mar 2, 2025

psyray added 5 commits March 3, 2025 20:11

Merge branch 'release/2.1.1' into fix/277-nuclei-scan

a327ef0
