Skip to content

Conversation

@thicham43
Copy link
Contributor

@thicham43 thicham43 commented Jul 31, 2025

Summary by CodeRabbit

  • New Features

    • Introduced asynchronous, task-based export workflows for improved reliability and scalability.
    • Added detailed notification emails for export events, including success and failure scenarios.
    • Enhanced validation for export table configurations and datalab names.
  • Bug Fixes

    • Improved error handling and reporting for export and database operations.
    • Strengthened validation to prevent invalid datalab names containing spaces or hyphens.
  • Refactor

    • Simplified and modularized export logic, moving to Celery task chains for better maintainability.
    • Streamlined authorization handling across cohort and export services.
    • Replaced synchronous export handling with explicit job tracking and custom exceptions in exporters.
  • Tests

    • Added comprehensive tests for export task workflows and retry logic.
    • Updated and removed obsolete tests to align with new asynchronous export processing.
  • Chores

    • Cleaned up unused code, removed deprecated notification and export management modules.

@coderabbitai
Copy link

coderabbitai bot commented Jul 31, 2025

Walkthrough

This update introduces a major refactor and modularization of the export workflow, transitioning from synchronous, inline logic to a robust, asynchronous Celery task chain architecture. Export table creation, cohort subset management, export launching, job tracking, and notification handling are now implemented as discrete, reusable Celery tasks. The export service and exporters are refactored for clearer validation, error handling, and extensibility. Related tests are updated, removed, or added to align with the new workflow, and supporting modules are introduced for common tools and exceptions.

Changes

Cohort / File(s) Change Summary
Cohort Service Refactor
cohort/services/cohort_result.py, cohort/services/dated_measure.py, cohort/services/feasibility_study.py, cohort/views/cohort_result.py, cohort/views/dated_measure.py, cohort/views/feasibility_study.py, cohort/tests/test_service_cohort_result.py, cohort/tests/tests_view_cohort_result.py, cohort/tests/tests_view_dated_measure.py
Refactored service and view methods to pass explicit auth_headers dictionaries instead of HTTP request objects, updated method signatures, and removed dependencies on get_authorization_header in tests and logic.
Exporters Core Refactor
exporters/exporters/base_exporter.py, exporters/exporters/csv_exporter.py, exporters/exporters/hive_exporter.py, exporters/exporters/xlsx_exporter.py, exporters/enums.py, exporters/exceptions.py, exporters/notifications.py
Simplified and modularized exporter classes, removed notification logic and synchronous job handling from exporters, introduced custom exceptions, and updated enum usage and notification formatting.
Exporters Task Removal
exporters/tasks.py, exporters/tests/test_csv_exporter.py, exporters/tests/test_hive_exporter.py, exporters/tests/test_xlsx_exporter.py, exporters/tests/test_notifying_tasks.py
Deleted obsolete notification Celery tasks and related tests, removed notification logic from exporters, updated tests to match new workflow and error handling.
Export Workflow Modularization
exports/services/export.py, exports/services/export_operators.py, exports/tasks.py, exports/models/datalab.py, exports/views/export.py
Replaced synchronous export logic with Celery task chain orchestration, removed dynamic exporter loader, updated datalab name validation, and adapted views for new service and auth pattern.
Export Tasks Implementation
exports/export_tasks/__init__.py, exports/export_tasks/await_cohorts.py, exports/export_tasks/create_cohorts.py, exports/export_tasks/create_tables.py, exports/export_tasks/finalize.py, exports/export_tasks/notify_failed.py, exports/export_tasks/notify_received.py, exports/export_tasks/notify_succeeded.py, exports/export_tasks/prep.py, exports/export_tasks/send.py, exports/export_tasks/track.py
Introduced new Celery tasks for each export workflow step: notification, cohort subset creation, table creation, waiting, preparation, sending, tracking, finalizing, and handling failures.
Export Tools
exports/tools.py
Added memoized utility functions for export and cohort retrieval.
Exporters Loader and Service Tests
exports/tests/test_exporters_loader.py, exports/tests/test_export_manager.py, exports/tests/test_service_export.py, exports/tests/test_export_tasks.py, exports/tests/test_view_datalab.py, exports/tests/test_view_export.py
Added, updated, or removed tests to cover new loader, export task chain, retry logic, and datalab validation.
Export Table and Retry Tests
exports/tests/test_export_tasks.py, exports/tests/test_view_export.py
Added comprehensive tests for export task orchestration, retry logic, and permission enforcement.

Sequence Diagram(s)

High-Level Export Workflow (New)

sequenceDiagram
    participant User
    participant API/View
    participant ExportService
    participant Celery
    participant Exporter
    participant DB

    User->>API/View: Create Export Request
    API/View->>ExportService: validate(data)
    API/View->>Celery: on_commit(proceed_with_export)
    Celery->>ExportService: proceed_with_export(export, tables, auth_headers)
    ExportService->>Celery: Chain of Tasks (notify, create cohorts, create tables, await, prep, send, track, finalize, notify success/failure)
    loop Export Task Chain
        Celery->>DB: Read/Write Export, Cohort, Tables
        Celery->>Exporter: prepare/send/track/finalize
        Exporter->>DB: Update Export Status
    end
    Celery->>API/View: Completion/Failure Callback (async)
Loading

Export Retry Flow

sequenceDiagram
    participant User
    participant API/View
    participant ExportService
    participant Celery

    User->>API/View: Retry Export (as Admin)
    API/View->>ExportService: retry(export, auth_headers)
    ExportService->>Celery: Chain of Tasks (restore tables, relaunch cohorts if needed, continue export chain)
    Celery->>ExportService: proceed_with_export (as above)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~90+ minutes

Poem

A rabbit hopped through fields of code,
Where exports once synchronously strode.
Now Celery tasks in chains align,
Notifications, cohorts, all in line.
With enums neat and errors new,
Async exports swiftly grew!
🥕✨

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b7dceb2 and e2e0a26.

📒 Files selected for processing (44)
  • cohort/services/cohort_result.py (5 hunks)
  • cohort/services/dated_measure.py (2 hunks)
  • cohort/services/feasibility_study.py (2 hunks)
  • cohort/tests/test_service_cohort_result.py (1 hunks)
  • cohort/tests/tests_view_cohort_result.py (1 hunks)
  • cohort/tests/tests_view_dated_measure.py (1 hunks)
  • cohort/views/cohort_result.py (2 hunks)
  • cohort/views/dated_measure.py (2 hunks)
  • cohort/views/feasibility_study.py (2 hunks)
  • exporters/enums.py (2 hunks)
  • exporters/exceptions.py (1 hunks)
  • exporters/exporters/base_exporter.py (3 hunks)
  • exporters/exporters/csv_exporter.py (1 hunks)
  • exporters/exporters/hive_exporter.py (3 hunks)
  • exporters/exporters/xlsx_exporter.py (1 hunks)
  • exporters/notifications.py (4 hunks)
  • exporters/tasks.py (0 hunks)
  • exporters/tests/test_csv_exporter.py (0 hunks)
  • exporters/tests/test_hive_exporter.py (3 hunks)
  • exporters/tests/test_notifying_tasks.py (1 hunks)
  • exporters/tests/test_xlsx_exporter.py (0 hunks)
  • exports/export_tasks/__init__.py (1 hunks)
  • exports/export_tasks/await_cohorts.py (1 hunks)
  • exports/export_tasks/create_cohorts.py (1 hunks)
  • exports/export_tasks/create_tables.py (1 hunks)
  • exports/export_tasks/finalize.py (1 hunks)
  • exports/export_tasks/notify_failed.py (1 hunks)
  • exports/export_tasks/notify_received.py (1 hunks)
  • exports/export_tasks/notify_succeeded.py (1 hunks)
  • exports/export_tasks/prep.py (1 hunks)
  • exports/export_tasks/send.py (1 hunks)
  • exports/export_tasks/track.py (1 hunks)
  • exports/models/datalab.py (1 hunks)
  • exports/services/export.py (1 hunks)
  • exports/services/export_operators.py (0 hunks)
  • exports/tasks.py (1 hunks)
  • exports/tests/test_export_manager.py (0 hunks)
  • exports/tests/test_export_tasks.py (1 hunks)
  • exports/tests/test_exporters_loader.py (1 hunks)
  • exports/tests/test_service_export.py (0 hunks)
  • exports/tests/test_view_datalab.py (1 hunks)
  • exports/tests/test_view_export.py (1 hunks)
  • exports/tools.py (1 hunks)
  • exports/views/export.py (3 hunks)
💤 Files with no reviewable changes (6)
  • exporters/tests/test_csv_exporter.py
  • exporters/tasks.py
  • exports/tests/test_service_export.py
  • exporters/tests/test_xlsx_exporter.py
  • exports/services/export_operators.py
  • exports/tests/test_export_manager.py
✅ Files skipped from review due to trivial changes (4)
  • exporters/exporters/csv_exporter.py
  • exports/tasks.py
  • exports/export_tasks/init.py
  • exports/models/datalab.py
🚧 Files skipped from review as they are similar to previous changes (29)
  • exporters/notifications.py
  • cohort/tests/tests_view_cohort_result.py
  • exports/tests/test_exporters_loader.py
  • exports/export_tasks/await_cohorts.py
  • cohort/views/cohort_result.py
  • exports/views/export.py
  • exports/export_tasks/notify_failed.py
  • cohort/services/dated_measure.py
  • cohort/services/cohort_result.py
  • cohort/tests/tests_view_dated_measure.py
  • exporters/exporters/xlsx_exporter.py
  • exports/tests/test_view_datalab.py
  • cohort/views/dated_measure.py
  • cohort/tests/test_service_cohort_result.py
  • exports/tests/test_view_export.py
  • exporters/tests/test_hive_exporter.py
  • cohort/services/feasibility_study.py
  • exports/export_tasks/finalize.py
  • exporters/exceptions.py
  • exports/export_tasks/notify_succeeded.py
  • cohort/views/feasibility_study.py
  • exporters/enums.py
  • exports/tools.py
  • exports/export_tasks/notify_received.py
  • exports/export_tasks/prep.py
  • exports/export_tasks/create_tables.py
  • exports/export_tasks/track.py
  • exports/export_tasks/send.py
  • exports/export_tasks/create_cohorts.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
exporters/exporters/base_exporter.py (3)
exports/models/export.py (1)
  • Export (16-46)
exporters/enums.py (1)
  • APIJobType (25-27)
admin_cohort/types.py (2)
  • JobStatus (35-53)
  • is_end_state (49-53)
exporters/exporters/hive_exporter.py (5)
exporters/exceptions.py (3)
  • InvalidJobId (1-2)
  • CreateHiveDBException (5-6)
  • HiveDBOwnershipException (9-10)
exporters/enums.py (2)
  • ExportTypes (6-22)
  • APIJobType (25-27)
exporters/exporters/base_exporter.py (2)
  • log_export_task (115-116)
  • track_job (92-112)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exports/models/export.py (1)
  • Export (16-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (21)
exporters/tests/test_notifying_tasks.py (4)

8-8: LGTM: Updated import path reflects new modular architecture.

The import path change from exporters.tasks to exports.export_tasks aligns with the refactoring to modular Celery task structure.


16-22: Improved test specificity with targeted mocking.

The mock path now targets the specific notification submodule (notify_received) rather than a shared path, which provides better test isolation and reflects the new modular task structure.


24-30: Consistent pattern for success notification testing.

The test follows the same improved pattern as the received notification test, with specific module mocking and updated function calls.


48-51: Updated test function reflects new task naming and functionality.

The test now uses mark_export_as_failed which better describes the task's dual responsibility of handling failure notifications and status updates.

exporters/exporters/hive_exporter.py (4)

9-9: LGTM: Improved error handling with custom exceptions.

The import of specific exceptions (InvalidJobId, CreateHiveDBException, HiveDBOwnershipException) provides better error categorization and handling compared to generic RequestException.


21-21: Good: Direct enum member usage.

Using ExportTypes.HIVE directly instead of ExportTypes.HIVE.value is cleaner and more idiomatic.


66-76: Enhanced database creation with proper job tracking.

The method now:

  • Saves the job ID to the export object
  • Uses explicit job tracking instead of synchronous waiting
  • Raises specific CreateHiveDBException for better error handling

This aligns well with the asynchronous task architecture.


80-83: Improved error handling for database ownership changes.

The method now raises HiveDBOwnershipException instead of generic RequestException, providing clearer error context for the calling code.

exporters/exporters/base_exporter.py (2)

31-33: Good business rule validation added.

The validation ensures that export tables with FHIR filters must have a cohort source, which prevents invalid export configurations. This is a sensible constraint for data filtering operations.


92-112: Simplified job tracking aligns with new architecture.

The method was appropriately simplified:

  • Renamed from wait_for_job to track_job for clarity
  • Longer polling interval (30 seconds) reduces API load
  • No longer updates export status during polling (delegated to tasks)
  • Returns job status without side effects

This decoupling supports the asynchronous task chain approach.

exports/tests/test_export_tasks.py (6)

15-41: Well-structured test setup with comprehensive fixtures.

The setUp method creates all necessary test data including users, exports, cohorts, and filters. The use of CELERY_TASK_ALWAYS_EAGER=True ensures synchronous task execution for testing.


42-82: Comprehensive CSV export workflow test.

The test thoroughly validates the task chain execution:

  • Mocks all external dependencies appropriately
  • Verifies notification calls
  • Confirms cohort subset creation count matches table count
  • Asserts successful completion

The extensive mocking ensures isolated testing of the workflow logic.


84-124: Thorough Hive export workflow validation.

This test properly verifies Hive-specific operations:

  • Database creation (mock_create_db)
  • Database ownership changes (called twice as expected)
  • All standard export workflow steps

The assertion self.assertEqual(mock_change_db_ownership.call_count, 2) indicates the ownership is changed for both creation and finalization phases.


157-200: Excellent failure scenario testing.

The test validates proper failure handling:

  • Sets cohort subset to failed status
  • Verifies export workflow stops at appropriate point
  • Confirms failure notifications are sent (owner + admins)
  • Validates export status and error message updates

This ensures robust error handling in the task chain.


226-248: Good retry logic testing for API-side failures.

The test verifies that retries after API submission:

  • Reset export status and error messages
  • Mark export as retried
  • Skip cohort subset recreation
  • Include proper tasks in the chain

The task name assertions using __module__ and __name__ are robust.


307-318: Validates table restoration during retry.

This test ensures that soft-deleted export tables are restored during retry operations, which is crucial for export data integrity.

exports/services/export.py (5)

37-54: Robust exporter loading with comprehensive error handling.

The dynamic exporter loading includes:

  • Proper configuration validation with ImproperlyConfigured exceptions
  • Enum validation for export types
  • Import string resolution with error handling
  • Fallback warning for misconfigured exporters

This provides a solid foundation for the service.


57-73: Clean service initialization and validation.

The service properly initializes exporters and provides validation through the appropriate exporter instance with proper exception handling.


89-101: Well-structured task chain with clear documentation.

The task chain is properly organized with:

  • Clear documentation about task chaining (.s() vs .si())
  • Logical workflow order
  • Proper error propagation to failure handling task
  • Asynchronous execution

The warning about modification caution is appropriate given the complexity.


122-129: Retry method has similar safety issues.

The retry method has the same assert statement and potential IndexError issues as the main export method.

Apply the same fixes as suggested for the main export method:

         source_cohorts_ids = [t.cohort_result_source_id for t in export.export_tables.all()]
-        assert source_cohorts_ids, "No `cohort_result_source` was provided for all tables"
+        if not source_cohorts_ids:
+            raise ValueError("No source cohorts found for export tables")
         main_cohort = CohortResult.objects.filter(uuid=source_cohorts_ids[0]).only("group_id", "name").first()
+        if not main_cohort:
+            raise ValueError(f"Main cohort not found for ID: {source_cohorts_ids[0]}")

134-163: Comprehensive retry logic with proper failure scenario handling.

The retry method properly handles:

  • Table restoration for soft-deleted records
  • Different failure points (API-side vs pre-API)
  • Conditional cohort subset relaunching
  • Proper task chain construction
  • Export status updates

The logic is well-structured and handles edge cases appropriately.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 3065_rearrange_export_tasks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

🧹 Nitpick comments (15)
exports/tests/test_view_datalab.py (1)

46-52: LGTM! Test correctly validates the new naming constraints.

The test appropriately verifies that datalab creation fails with HTTP 400 for names containing spaces or hyphens, which aligns with the updated model validation.

Consider adding a test case with a valid name to ensure the validation doesn't have false positives:

 def test_create_datalab_with_invalid_names(self):
     create_url = reverse(viewname=self.viewname_list)
     for name in ["name-with-hyphens", "name with spaces"]:
         self.check_test_create_view(request_user=self.datalabs_manager_user,
                                     create_url=create_url,
                                     request_data={"name": name, "infrastructure_provider": self.infra_provider_aphp.uuid},
                                     expected_resp_status=status.HTTP_400_BAD_REQUEST)
+    
+    # Verify valid names still work
+    self.check_test_create_view(request_user=self.datalabs_manager_user,
+                                create_url=create_url,
+                                request_data={"name": "valid_name_123", "infrastructure_provider": self.infra_provider_aphp.uuid},
+                                expected_resp_status=status.HTTP_201_CREATED)
exports/export_tasks/logs.py (1)

9-10: LGTM! Improved error handling with helper function.

Using get_export_by_id instead of direct database query provides better error handling with a descriptive error message for missing exports.

Consider adding logging for consistency with other export tasks:

+import logging
+
+_logger = logging.getLogger(__name__)
+
 @shared_task
 def get_logs(export_id: str) -> dict:
+    _logger.info(f"Getting logs for export {export_id}")
     export = get_export_by_id(export_id)
     return BaseExporter().export_api.get_export_logs(job_id=export.request_job_id)
exporters/exceptions.py (1)

1-10: LGTM: Well-defined custom exceptions.

The custom exception classes are properly implemented and provide better error specificity than generic exceptions. The naming is clear and descriptive.

Consider adding docstrings or additional context attributes in the future to make these exceptions more informative for debugging purposes.

exports/export_tasks/notify_received.py (2)

9-9: Consider using standard logger naming convention.

Using 'info' as the logger name is unconventional. Consider using __name__ or the module name for better logging hierarchy and debugging.

-_logger = logging.getLogger('info')
+_logger = logging.getLogger(__name__)

13-13: Consider making cohort parameters optional.

The cohort_id and cohort_name parameters might not always be available depending on the export context. Consider making them optional with default values.

-def notify_export_received(export_id: str, cohort_id: str, cohort_name: str) -> None:
+def notify_export_received(export_id: str, cohort_id: str = None, cohort_name: str = None) -> None:
exports/export_tasks/create_tables.py (1)

32-32: Remove unnecessary explicit return.

The explicit return None is unnecessary as Python functions return None by default when no return statement is present.

-    return None
exports/export_tasks/notify_succeeded.py (3)

24-25: Use more explicit conditional expressions for cohort data.

The current pattern cohort and cohort.group_id or None could be clearer and more explicit.

-                             cohort_id=cohort and cohort.group_id or None,
-                             cohort_name=cohort and cohort.name or None,
+                             cohort_id=cohort.group_id if cohort else None,
+                             cohort_name=cohort.name if cohort else None,

32-33: Consider broader exception handling for email notifications.

Only catching OSError might be too narrow. Email sending could fail due to various reasons including network issues, authentication problems, or template errors.

-    except OSError:
-        _logger.error(f"[Export {export_id}] Error sending export success email")
+    except (OSError, Exception) as e:
+        _logger.error(f"[Export {export_id}] Error sending export success email: {e}")

31-31: Fix logging inconsistency between export.uuid and export_id.

Line 31 uses export.uuid while the rest of the function uses export_id for logging context.

-        _logger.info(f"Export[{export.uuid}] Export success email sent")
+        _logger.info(f"Export[{export_id}] Export success email sent")
exports/export_tasks/send.py (1)

61-61: Set failure_reason to None explicitly on success path.

The function returns failure_reason but it's not explicitly set to None on the success path, relying on the input parameter value.

     _logger.info(f"Export[{export_id}] Request sent, job `{job_id}` is now {JobStatus.pending}")
+    failure_reason = None
     return failure_reason
cohort/tests/tests_view_dated_measure.py (1)

120-124: Remove unused mock_header_called parameter from DMCreateCase.

The mock_header_called parameter is no longer used since the authorization header mocking was removed, but it's still defined in the class constructor and passed in test cases.

 class DMCreateCase(CreateCase):
-    def __init__(self, mock_header_called: bool, mock_count_task_called: bool, mock_cancel_task_called: bool, **kwargs):
+    def __init__(self, mock_count_task_called: bool, mock_cancel_task_called: bool, **kwargs):
         super(DMCreateCase, self).__init__(**kwargs)
-        self.mock_header_called = mock_header_called
         self.mock_count_task_called = mock_count_task_called
         self.mock_cancel_task_called = mock_cancel_task_called

You'll also need to update the test case definitions that pass mock_header_called=True/False.

exports/export_tasks/await_cohorts.py (1)

31-35: Consider using JobStatus.is_end_state property.

The code could leverage the is_end_state property from the JobStatus enum to handle terminal states more comprehensively.

 for c in cohort_subsets:
-    if c.request_job_status == JobStatus.failed:
+    if c.request_job_status.is_end_state:
+        if c.request_job_status == JobStatus.failed:
+            failure_reason = "One or more cohort subsets have failed"
+            _logger.info(f"Export[{export_id}] Aborting export... reason: {failure_reason}")
+            return failure_reason
+        elif c.request_job_status == JobStatus.finished:
+            finished_cohorts += 1
+        # Handle other end states like cancelled/unknown as failures
+        elif c.request_job_status in [JobStatus.cancelled, JobStatus.unknown]:
             failure_reason = "One or more cohort subsets have failed"
             _logger.info(f"Export[{export_id}] Aborting export... reason: {failure_reason}")
             return failure_reason
-    if c.request_job_status == JobStatus.finished:
-        finished_cohorts += 1
exports/export_tasks/notify_failed.py (1)

15-27: Good notification logic with minor improvement opportunity.

The function properly handles notification data construction and error handling. However, consider handling broader exceptions during email operations.

 try:
     push_email_notification(base_notification=export_failed_notif_for_owner, **notification_data)
-except OSError:
+except (OSError, Exception) as e:
-    _logger.error(f"Export[{export.pk}] Error sending export failure notification")
+    _logger.error(f"Export[{export.pk}] Error sending export failure notification: {e}")
 else:
     export.is_user_notified = True
     export.save()
cohort/tests/tests_view_cohort_result.py (1)

237-237: Fix type annotation inconsistency.

The type annotation other_view: any should use Any (capitalized) to be consistent with the import from typing on line 3.

-                                    other_view: any, view_kwargs: dict):
+                                    other_view: Any, view_kwargs: dict):
exports/services/export.py (1)

37-54: Simplify export type handling.

Since ExportTypes inherits from StrEnum, the .value conversion is unnecessary.

-            export_type = ExportTypes(export_type).value
+            export_type = ExportTypes(export_type)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5567056 and 1ede3be.

📒 Files selected for processing (45)
  • cohort/services/cohort_result.py (5 hunks)
  • cohort/services/dated_measure.py (2 hunks)
  • cohort/services/feasibility_study.py (2 hunks)
  • cohort/tests/test_service_cohort_result.py (1 hunks)
  • cohort/tests/tests_view_cohort_result.py (1 hunks)
  • cohort/tests/tests_view_dated_measure.py (1 hunks)
  • cohort/views/cohort_result.py (2 hunks)
  • cohort/views/dated_measure.py (2 hunks)
  • cohort/views/feasibility_study.py (2 hunks)
  • exporters/enums.py (2 hunks)
  • exporters/exceptions.py (1 hunks)
  • exporters/exporters/base_exporter.py (3 hunks)
  • exporters/exporters/csv_exporter.py (1 hunks)
  • exporters/exporters/hive_exporter.py (3 hunks)
  • exporters/exporters/xlsx_exporter.py (1 hunks)
  • exporters/notifications.py (4 hunks)
  • exporters/tasks.py (0 hunks)
  • exporters/tests/test_csv_exporter.py (0 hunks)
  • exporters/tests/test_hive_exporter.py (2 hunks)
  • exporters/tests/test_notifying_tasks.py (1 hunks)
  • exporters/tests/test_xlsx_exporter.py (0 hunks)
  • exports/export_tasks/__init__.py (1 hunks)
  • exports/export_tasks/await_cohorts.py (1 hunks)
  • exports/export_tasks/create_cohorts.py (1 hunks)
  • exports/export_tasks/create_tables.py (1 hunks)
  • exports/export_tasks/finalize.py (1 hunks)
  • exports/export_tasks/logs.py (1 hunks)
  • exports/export_tasks/notify_failed.py (1 hunks)
  • exports/export_tasks/notify_received.py (1 hunks)
  • exports/export_tasks/notify_succeeded.py (1 hunks)
  • exports/export_tasks/prep.py (1 hunks)
  • exports/export_tasks/send.py (1 hunks)
  • exports/export_tasks/track.py (1 hunks)
  • exports/models/datalab.py (1 hunks)
  • exports/services/export.py (1 hunks)
  • exports/services/export_operators.py (0 hunks)
  • exports/tasks.py (1 hunks)
  • exports/tests/test_export_manager.py (0 hunks)
  • exports/tests/test_export_tasks.py (1 hunks)
  • exports/tests/test_exporters_loader.py (1 hunks)
  • exports/tests/test_service_export.py (0 hunks)
  • exports/tests/test_view_datalab.py (1 hunks)
  • exports/tests/test_view_export.py (1 hunks)
  • exports/tools.py (1 hunks)
  • exports/views/export.py (3 hunks)
💤 Files with no reviewable changes (6)
  • exporters/tests/test_csv_exporter.py
  • exports/services/export_operators.py
  • exporters/tests/test_xlsx_exporter.py
  • exports/tests/test_service_export.py
  • exporters/tasks.py
  • exports/tests/test_export_manager.py
🧰 Additional context used
🧬 Code Graph Analysis (27)
exports/tasks.py (1)
exports/services/export_operators.py (1)
  • ExportCleaner (70-96)
exporters/exporters/xlsx_exporter.py (1)
exporters/enums.py (1)
  • ExportTypes (6-22)
cohort/views/feasibility_study.py (2)
cohort/services/utils.py (1)
  • get_authorization_header (68-72)
cohort/services/feasibility_study.py (1)
  • handle_feasibility_study_count (52-61)
exports/tests/test_view_datalab.py (1)
exports/tests/base_test.py (1)
  • check_test_create_view (92-95)
exporters/tests/test_hive_exporter.py (1)
exporters/exceptions.py (1)
  • HiveDBOwnershipException (9-10)
cohort/views/cohort_result.py (2)
cohort/services/utils.py (1)
  • get_authorization_header (68-72)
cohort/services/cohort_result.py (1)
  • handle_cohort_creation (82-105)
exports/export_tasks/logs.py (4)
exporters/exporters/base_exporter.py (1)
  • BaseExporter (22-116)
exports/tools.py (1)
  • get_export_by_id (7-11)
exports/tasks.py (1)
  • get_logs (20-22)
exporters/apis/base.py (1)
  • get_export_logs (20-28)
exports/export_tasks/notify_received.py (3)
exports/emails.py (1)
  • push_email_notification (17-19)
exports/tools.py (2)
  • get_export_by_id (7-11)
  • get_selected_tables (21-22)
admin_cohort/models/user.py (1)
  • display_name (22-24)
cohort/tests/test_service_cohort_result.py (1)
cohort/services/cohort_result.py (1)
  • create_cohort_subset (39-71)
cohort/views/dated_measure.py (2)
cohort/services/utils.py (2)
  • await_celery_task (22-35)
  • get_authorization_header (68-72)
cohort/services/dated_measure.py (1)
  • handle_count (14-26)
exports/export_tasks/notify_succeeded.py (3)
exports/emails.py (1)
  • push_email_notification (17-19)
exports/tools.py (3)
  • get_export_by_id (7-11)
  • get_cohort (15-17)
  • get_selected_tables (21-22)
admin_cohort/models/user.py (1)
  • display_name (22-24)
exporters/exporters/csv_exporter.py (1)
exporters/enums.py (1)
  • ExportTypes (6-22)
exports/export_tasks/finalize.py (4)
exporters/enums.py (1)
  • ExportTypes (6-22)
exporters/exceptions.py (1)
  • HiveDBOwnershipException (9-10)
exporters/exporters/hive_exporter.py (1)
  • change_db_ownership (78-83)
exports/tools.py (1)
  • get_export_by_id (7-11)
exporters/enums.py (1)
admin_cohort/types.py (2)
  • StrEnum (24-32)
  • JobStatus (35-53)
cohort/services/feasibility_study.py (1)
cohort/services/utils.py (1)
  • ServerError (18-19)
exports/export_tasks/notify_failed.py (6)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exporters/notifications.py (2)
  • export_failed_notif_for_owner (48-59)
  • export_failed_notif_for_admins (62-72)
exports/emails.py (1)
  • push_email_notification (17-19)
exports/models/export.py (1)
  • Export (16-46)
exports/tools.py (2)
  • get_export_by_id (7-11)
  • get_cohort (15-17)
admin_cohort/models/user.py (1)
  • display_name (22-24)
exports/export_tasks/track.py (3)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exporters/exporters/base_exporter.py (2)
  • BaseExporter (22-116)
  • track_job (92-112)
exports/tools.py (1)
  • get_export_by_id (7-11)
exports/export_tasks/send.py (4)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exporters/exporters/base_exporter.py (1)
  • send_export (82-90)
exports/tools.py (1)
  • get_export_by_id (7-11)
exports/models/export.py (1)
  • target_full_path (36-39)
exports/export_tasks/create_tables.py (3)
cohort/models/cohort_result.py (1)
  • CohortResult (9-21)
exports/tools.py (1)
  • get_export_by_id (7-11)
exports/models/export_table.py (1)
  • ExportTable (9-22)
exports/tools.py (1)
exports/models/export.py (1)
  • Export (16-46)
exports/export_tasks/await_cohorts.py (2)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exports/tools.py (1)
  • get_export_by_id (7-11)
cohort/tests/tests_view_dated_measure.py (2)
cohort/tests/tests_view_cohort_result.py (1)
  • check_create_case_with_mock (236-253)
cohort/tests/tests_view_rqs.py (1)
  • check_create_case_with_mock (214-217)
cohort/services/dated_measure.py (3)
cohort/services/utils.py (1)
  • ServerError (18-19)
cohort/models/cohort_result.py (1)
  • CohortResult (9-21)
cohort/views/dated_measure.py (1)
  • create (45-52)
cohort/services/cohort_result.py (2)
cohort/services/utils.py (1)
  • ServerError (18-19)
cohort/models/cohort_result.py (1)
  • CohortResult (9-21)
exports/export_tasks/create_cohorts.py (5)
cohort/models/cohort_result.py (1)
  • CohortResult (9-21)
cohort/models/fhir_filter.py (1)
  • FhirFilter (9-26)
exports/tools.py (1)
  • get_export_by_id (7-11)
exports/models/export.py (1)
  • Export (16-46)
cohort/services/cohort_result.py (2)
  • create_cohort_subset (39-71)
  • handle_cohort_creation (82-105)
exporters/exporters/hive_exporter.py (5)
exporters/exceptions.py (3)
  • InvalidJobId (1-2)
  • CreateHiveDBException (5-6)
  • HiveDBOwnershipException (9-10)
exporters/enums.py (2)
  • ExportTypes (6-22)
  • APIJobType (25-27)
exporters/exporters/base_exporter.py (2)
  • log_export_task (115-116)
  • track_job (92-112)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exports/models/export.py (1)
  • Export (16-46)
exporters/exporters/base_exporter.py (3)
exports/models/export.py (1)
  • Export (16-46)
exporters/enums.py (1)
  • APIJobType (25-27)
admin_cohort/types.py (2)
  • JobStatus (35-53)
  • is_end_state (49-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (61)
exports/tasks.py (1)

8-8: No remaining references to launch_export_task found

I ran rg "launch_export_task" and an ast-grep search across all Python files—no hits. It appears the old launch_export_task call has been fully removed.

Please manually verify:

  • That all orchestration or CI workflows have been updated to use the new exports/export_tasks modules.
  • There are no hard-coded strings, doc comments, or config entries still referring to launch_export_task.
exporters/exporters/xlsx_exporter.py (1)

9-9: All exporters use StrEnum correctly – no .value usage remains

I’ve confirmed that in all three exporter classes the self.type assignments now use ExportTypes.<TYPE> directly without .value:

  • exporters/exporters/csv_exporter.py: self.type = ExportTypes.CSV
  • exporters/exporters/hive_exporter.py: self.type = ExportTypes.HIVE
  • exporters/exporters/xlsx_exporter.py: self.type = ExportTypes.XLSX

No further updates are required here.

cohort/views/feasibility_study.py (2)

15-15: LGTM! Consistent import addition for authorization header extraction.

The import of get_authorization_header aligns with the broader refactoring to decouple service methods from HTTP request objects.


52-54: LGTM! Authorization headers properly extracted and passed to service.

The refactoring correctly extracts authorization headers using get_authorization_header(request) and passes them explicitly to feasibility_study_service.handle_feasibility_study_count. This eliminates the service's dependency on the full HTTP request object, improving modularity and testability.

cohort/tests/test_service_cohort_result.py (1)

33-39: LGTM! Test simplified by removing unnecessary mocking.

The test now directly passes an empty auth_headers dictionary instead of mocking get_authorization_header, which aligns with the updated service method signature. This simplification improves test clarity while maintaining coverage of the core functionality.

cohort/views/cohort_result.py (2)

19-19: LGTM! Consistent import addition for authorization header handling.

The import of get_authorization_header follows the established pattern for decoupling service methods from HTTP request objects.


119-122: LGTM! Authorization headers properly extracted and passed to cohort service.

The refactoring correctly extracts authorization headers and passes them explicitly to cohort_service.handle_cohort_creation. This maintains the same functionality while improving service layer decoupling from HTTP request objects.

cohort/views/dated_measure.py (2)

13-13: LGTM! Authorization header utility imported consistently.

The addition of get_authorization_header to the existing import statement follows the established refactoring pattern.


47-51: LGTM! Authorization headers and stage details properly extracted and passed.

The refactoring correctly extracts authorization headers and stage details from the request, passing them explicitly to dm_service.handle_count. This eliminates the service's dependency on the HTTP request object while preserving all necessary data for processing.

exporters/notifications.py (4)

8-10: LGTM! Cleaner enum usage by using members directly.

Using ExportTypes.CSV instead of ExportTypes.CSV.value is more type-safe and cleaner. The enum members can be used directly in most contexts where their string values were previously needed.


31-31: LGTM! Consistent formatting with double quotes.

The switch from backticks to double quotes around the cohort name provides consistent formatting across notification messages.


49-49: LGTM! Consistent formatting applied.

The formatting change maintains consistency with other notification messages by using double quotes around the cohort name.


76-76: LGTM! Consistent string formatting.

The use of double quotes around the cohort name maintains formatting consistency across all notification functions.

exporters/tests/test_hive_exporter.py (2)

5-5: LGTM: Proper import of custom exception.

The import of HiveDBOwnershipException is correctly added to support the updated test expectations.


81-81: Excellent: More specific exception handling.

The test now expects HiveDBOwnershipException instead of the generic RequestException, which provides better error specificity and aligns with the new custom exception defined in exporters/exceptions.py.

exporters/exporters/csv_exporter.py (1)

14-14: LGTM: Consistent enum usage.

The change from ExportTypes.CSV.value to ExportTypes.CSV aligns with the enum refactoring where ExportTypes now inherits from StrEnum, making the direct enum member usage appropriate.

cohort/services/feasibility_study.py (2)

18-18: LGTM: Proper import of exception class.

Adding the ServerError import aligns with the error handling in the service method.


52-57: Excellent: Decoupling from HTTP request objects.

The refactoring to accept explicit auth_headers instead of the full request object improves separation of concerns and makes the service method more testable and reusable. The parameter usage is consistent throughout the method.

exporters/enums.py (3)

1-1: LGTM: Simplified import.

Removing the unused Enum import and keeping only StrEnum is appropriate for the refactoring.


6-6: LGTM: Consistent enum inheritance.

Changing to StrEnum inheritance enables direct enum member usage as strings throughout the codebase.


44-55: LGTM: Consistent enum member usage in mapper.

Using APIJobStatus enum members directly as keys instead of their .value strings is consistent with the StrEnum refactoring and improves code clarity.

exports/export_tasks/prep.py (1)

17-31: LGTM! Excellent error handling and early return pattern.

The implementation demonstrates good practices:

  • Early return when failure_reason is already set prevents unnecessary work
  • Specific exception handling for Hive operations
  • Appropriate logging throughout the process
  • Clean separation of concerns for Hive-specific preparation
exports/export_tasks/__init__.py (1)

1-25: LGTM! Well-structured package initialization.

The package API is cleanly organized with descriptive function names that clearly represent the export workflow steps. The __all__ list properly matches the imports, ensuring a clean public interface.

exports/export_tasks/track.py (1)

15-27: LGTM! Clean and straightforward job tracking implementation.

The task follows the established pattern with early failure detection, proper logging, and clear success/failure determination. The logic correctly treats any non-finished status as a failure condition.

cohort/tests/tests_view_dated_measure.py (1)

171-172: LGTM! Authorization header mocking removal aligns with refactoring.

The removal of get_authorization_header mocking is consistent with the broader refactoring where authorization header extraction is moved to the view layer and services now accept explicit auth_headers parameters.

exports/export_tasks/await_cohorts.py (1)

15-18: LGTM! Good early return pattern.

The early return when failure_reason is not None is appropriate for the task chain pattern, allowing failed upstream tasks to short-circuit subsequent operations.

exports/views/export.py (4)

21-21: LGTM! Good import for header extraction utility.

The import of get_authorization_header aligns with the refactoring to decouple service logic from HTTP request objects.


131-134: Excellent refactoring to use authorization headers.

The change to extract and pass authorization headers instead of the full request object is a good architectural improvement that:

  1. Decouples service logic from HTTP request objects
  2. Makes the authorization context explicit
  3. Improves testability and modularity

155-156: Consistent authorization header handling in retry.

Good consistency in applying the same authorization header extraction pattern to the retry operation.


125-125: Method name change confirmed consistent across the codebase

I’ve searched for any remaining uses of validate_export_data and found none. The switch to export_service.validate(...) is applied everywhere—approving the change.

exports/tests/test_view_export.py (3)

180-184: Good permission testing.

Proper test coverage for the permission enforcement on retry operations. The test correctly expects a 403 response for non-admin users.


186-194: Excellent service-level mocking.

The test properly mocks the export_service.retry method rather than internal task implementation details. This makes the tests more maintainable and focused on the view's responsibilities.


196-203: Good edge case coverage.

Testing the scenario where a user tries to retry a finished export ensures proper validation and error handling in the view layer.

exports/export_tasks/notify_failed.py (1)

29-41: Admin notification function looks good.

The function properly constructs comprehensive notification data for administrators including export metadata and system context.

exports/tests/test_exporters_loader.py (3)

21-27: Good error condition testing.

The test properly verifies that an ImproperlyConfigured exception is raised for invalid export types, ensuring robust validation in the loader.


29-32: Appropriate empty configuration testing.

Testing the empty exporters list scenario ensures the system fails fast with proper error messaging when misconfigured.


34-39: Comprehensive positive testing.

The test effectively verifies both the presence of expected export type keys and the correct loading of exporter classes, providing good coverage of the loading mechanism.

cohort/services/dated_measure.py (2)

14-26: LGTM! Clean refactoring to decouple from request object.

The refactoring to accept explicit auth_headers instead of the request object improves testability and modularity. The method correctly passes the auth headers to the Celery task and maintains proper error handling.


28-42: LGTM! Consistent with the refactoring pattern.

The method signature change aligns with the broader refactoring effort. The implementation correctly uses the cohort's owner and request_query_snapshot fields, and properly passes auth_headers to the task.

exporters/tests/test_notifying_tasks.py (2)

8-8: LGTM! Import updated to reflect new module structure.

The import statement correctly references the new export tasks module structure that replaces the old exporters.tasks module.


16-55: LGTM! Test methods properly updated for new task structure.

All test methods have been correctly updated to:

  • Use new module paths in @mock.patch decorators
  • Call the renamed tasks (e.g., mark_export_as_failed instead of notify_export_failed)
  • Maintain the same test logic and assertions

The refactoring maintains test coverage while adapting to the new modular task structure.

cohort/tests/tests_view_cohort_result.py (1)

236-253: LGTM! Test simplification aligns with refactoring.

The removal of get_authorization_header mocking is consistent with the broader refactoring where authorization headers are now passed explicitly rather than extracted from request objects. The test logic remains intact and focused on the core functionality.

cohort/services/cohort_result.py (2)

39-71: LGTM! Well-executed refactoring to decouple from request object.

The method signature change from request to auth_headers: dict is clean and consistent with the broader refactoring. The implementation correctly passes the auth headers to handle_cohort_creation and maintains all existing functionality.


82-105: LGTM! Good improvement with default parameter.

The refactoring correctly:

  • Accepts explicit auth_headers instead of request object
  • Adds a sensible default value for global_estimate=False
  • Passes auth headers to both dm_service.handle_global_count and the create_cohort.s task
  • Maintains proper error handling
exports/export_tasks/create_cohorts.py (5)

14-19: LGTM! Well-defined constants for table categorization.

The constants clearly define which tables require sub-cohorts and which should be excluded from processing. This makes the logic explicit and maintainable.


25-32: LGTM! Clean implementation of DocumentReference filter encoding.

The function properly URL-encodes the filter values and uses a dictionary-based approach that's easy to understand and maintain. The constant mapping provides clear association between table names and their filters.


35-41: LGTM! Auto-generated filter creation with proper naming.

The function correctly creates auto-generated FHIR filters with descriptive names that include the export UUID prefix and table name. The use of the first 8 characters of the UUID provides a reasonable balance between uniqueness and readability.


44-64: LGTM! Well-structured task with proper conditionals.

The task correctly:

  • Handles table processing with appropriate conditions
  • Auto-generates filters when needed for specific tables
  • Excludes inappropriate tables from processing
  • Uses proper logging for monitoring
  • Returns a useful mapping of created cohort subsets

67-78: LGTM! Robust retry logic with proper error handling.

The task implements good retry logic:

  • Iterates through failed cohort subsets
  • Catches specific ServerError exceptions
  • Logs both successes and failures appropriately
  • Returns failure reason for upstream handling
  • Breaks on first failure to avoid cascading issues
exporters/exporters/hive_exporter.py (4)

9-9: LGTM!

Good use of custom exceptions for better error handling and clarity.


21-21: LGTM!

Better to use the enum member directly since ExportTypes inherits from StrEnum and handles string conversion automatically.


78-83: LGTM!

Good improvement using a specific exception type for better error categorization.


73-73: Fix parameter passing to track_job method.

The job_type should be passed as a keyword argument, not positional.

-        job_status = self.track_job(export=export, job_type=APIJobType.HIVE_DB_CREATE)
+        job_status = self.track_job(export, job_type=APIJobType.HIVE_DB_CREATE)

Likely an incorrect or invalid review comment.

exporters/exporters/base_exporter.py (2)

31-33: LGTM!

Good validation to ensure FHIR filters are only applied when a cohort source is provided.


92-112: LGTM!

Good refactoring to simplify the method's responsibility. The polling interval and error handling are appropriate.

exports/tests/test_export_tasks.py (3)

15-41: Well-structured test setup!

Good use of CELERY_TASK_ALWAYS_EAGER for synchronous task execution in tests and comprehensive model setup.


194-194: LGTM!

Clear assertion with good documentation of the expected behavior.


203-319: Excellent test coverage for retry logic!

Comprehensive coverage of retry scenarios including API-side failures, failed cohort subsets, and soft-deleted table restoration.

exports/services/export.py (3)

77-101: Excellent task chain implementation!

Well-documented with clear warnings about modification. Proper use of .s() and .si() for task chaining.


108-162: Robust retry implementation!

Excellent handling of different failure scenarios with proper restoration of soft-deleted tables and conditional cohort subset relaunching.


165-171: LGTM!

Good error handling with appropriate logging before re-raising exceptions.

@thicham43 thicham43 force-pushed the 3065_rearrange_export_tasks branch from 1ede3be to efcbcb7 Compare July 31, 2025 13:14
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
exports/services/export.py (1)

127-128: Add safety check for empty export tables list.

Similar to the previous comment, accessing the first element without checking if the list is empty could cause an IndexError.

         source_cohorts_ids = [t.cohort_result_source_id for t in export.export_tables.all()]
+        if not source_cohorts_ids:
+            raise ValueError("No source cohorts found for export tables")
         main_cohort = CohortResult.objects.filter(uuid=source_cohorts_ids[0]).only("group_id", "name").first()
🧹 Nitpick comments (3)
exports/export_tasks/create_cohorts.py (1)

25-30: Consider URL-encoding the keys as well for robustness.

While the current keys don't contain special characters, it's safer to encode both keys and values to handle any future changes.

 def get_encoded_doc_ref_filter() -> str:
     filter_values = {"type:not": "https://terminology.eds.aphp.fr/aphp-orbis-document-textuel-hospitalier|doc-impor",
                      "contenttype": "text/plain"
                      }
-    return "&".join([f"{key}={quote_plus(val)}" for key, val in filter_values.items()])
+    return "&".join([f"{quote_plus(key)}={quote_plus(val)}" for key, val in filter_values.items()])
exports/tests/test_export_tasks.py (1)

194-194: Consider making the notification count assertion more explicit.

The comment mentions notifications for owner and admins, but the assertion just checks count=2. Consider asserting the specific recipients or adding constants for clarity.

-        self.assertEqual(mock_push_failure_notif.call_count, 2) # notif for owner and for admins
+        # Notifications should be sent to both owner and admins
+        EXPECTED_FAILURE_NOTIFICATIONS = 2  # 1 for owner, 1 for admins
+        self.assertEqual(mock_push_failure_notif.call_count, EXPECTED_FAILURE_NOTIFICATIONS)
exports/services/export.py (1)

68-74: Consider preserving specific exception types in validation.

The broad Exception catch might hide important error details. Consider catching and re-raising ValidationError while wrapping other exceptions.

     def validate(self, data: Dict, **kwargs) -> None:
         exporter = self._get_exporter(data.get("output_format"))
         try:
             exporter().validate(export_data=data, **kwargs)
+        except ValidationError:
+            raise
         except Exception as e:
             raise ValidationError(f'Invalid export data: {e}')
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ede3be and efcbcb7.

📒 Files selected for processing (45)
  • cohort/services/cohort_result.py (5 hunks)
  • cohort/services/dated_measure.py (2 hunks)
  • cohort/services/feasibility_study.py (2 hunks)
  • cohort/tests/test_service_cohort_result.py (1 hunks)
  • cohort/tests/tests_view_cohort_result.py (1 hunks)
  • cohort/tests/tests_view_dated_measure.py (1 hunks)
  • cohort/views/cohort_result.py (2 hunks)
  • cohort/views/dated_measure.py (2 hunks)
  • cohort/views/feasibility_study.py (2 hunks)
  • exporters/enums.py (2 hunks)
  • exporters/exceptions.py (1 hunks)
  • exporters/exporters/base_exporter.py (3 hunks)
  • exporters/exporters/csv_exporter.py (1 hunks)
  • exporters/exporters/hive_exporter.py (3 hunks)
  • exporters/exporters/xlsx_exporter.py (1 hunks)
  • exporters/notifications.py (4 hunks)
  • exporters/tasks.py (0 hunks)
  • exporters/tests/test_csv_exporter.py (0 hunks)
  • exporters/tests/test_hive_exporter.py (3 hunks)
  • exporters/tests/test_notifying_tasks.py (1 hunks)
  • exporters/tests/test_xlsx_exporter.py (0 hunks)
  • exports/export_tasks/__init__.py (1 hunks)
  • exports/export_tasks/await_cohorts.py (1 hunks)
  • exports/export_tasks/create_cohorts.py (1 hunks)
  • exports/export_tasks/create_tables.py (1 hunks)
  • exports/export_tasks/finalize.py (1 hunks)
  • exports/export_tasks/logs.py (1 hunks)
  • exports/export_tasks/notify_failed.py (1 hunks)
  • exports/export_tasks/notify_received.py (1 hunks)
  • exports/export_tasks/notify_succeeded.py (1 hunks)
  • exports/export_tasks/prep.py (1 hunks)
  • exports/export_tasks/send.py (1 hunks)
  • exports/export_tasks/track.py (1 hunks)
  • exports/models/datalab.py (1 hunks)
  • exports/services/export.py (1 hunks)
  • exports/services/export_operators.py (0 hunks)
  • exports/tasks.py (1 hunks)
  • exports/tests/test_export_manager.py (0 hunks)
  • exports/tests/test_export_tasks.py (1 hunks)
  • exports/tests/test_exporters_loader.py (1 hunks)
  • exports/tests/test_service_export.py (0 hunks)
  • exports/tests/test_view_datalab.py (1 hunks)
  • exports/tests/test_view_export.py (1 hunks)
  • exports/tools.py (1 hunks)
  • exports/views/export.py (3 hunks)
💤 Files with no reviewable changes (6)
  • exporters/tests/test_csv_exporter.py
  • exports/services/export_operators.py
  • exporters/tests/test_xlsx_exporter.py
  • exports/tests/test_service_export.py
  • exporters/tasks.py
  • exports/tests/test_export_manager.py
✅ Files skipped from review due to trivial changes (5)
  • exports/tasks.py
  • cohort/views/feasibility_study.py
  • exporters/exceptions.py
  • exports/export_tasks/init.py
  • exports/tools.py
🚧 Files skipped from review as they are similar to previous changes (29)
  • cohort/tests/test_service_cohort_result.py
  • exporters/exporters/xlsx_exporter.py
  • exporters/notifications.py
  • exporters/exporters/csv_exporter.py
  • exports/models/datalab.py
  • cohort/views/cohort_result.py
  • exports/export_tasks/prep.py
  • cohort/views/dated_measure.py
  • exports/views/export.py
  • exports/export_tasks/notify_received.py
  • exports/tests/test_view_export.py
  • exports/export_tasks/track.py
  • exports/tests/test_exporters_loader.py
  • exports/export_tasks/send.py
  • exports/export_tasks/finalize.py
  • exports/tests/test_view_datalab.py
  • exporters/enums.py
  • exports/export_tasks/notify_failed.py
  • exports/export_tasks/logs.py
  • cohort/services/feasibility_study.py
  • exports/export_tasks/create_tables.py
  • cohort/tests/tests_view_dated_measure.py
  • cohort/tests/tests_view_cohort_result.py
  • exports/export_tasks/notify_succeeded.py
  • exports/export_tasks/await_cohorts.py
  • exporters/tests/test_notifying_tasks.py
  • exporters/exporters/base_exporter.py
  • cohort/services/dated_measure.py
  • cohort/services/cohort_result.py
🧰 Additional context used
🧬 Code Graph Analysis (3)
exporters/tests/test_hive_exporter.py (1)
exporters/exceptions.py (2)
  • HiveDBOwnershipException (9-10)
  • CreateHiveDBException (5-6)
exporters/exporters/hive_exporter.py (5)
exporters/exceptions.py (3)
  • InvalidJobId (1-2)
  • CreateHiveDBException (5-6)
  • HiveDBOwnershipException (9-10)
exporters/enums.py (2)
  • ExportTypes (6-22)
  • APIJobType (25-27)
exporters/exporters/base_exporter.py (2)
  • log_export_task (115-116)
  • track_job (92-112)
admin_cohort/types.py (1)
  • JobStatus (35-53)
exports/models/export.py (1)
  • Export (16-46)
exports/export_tasks/create_cohorts.py (5)
cohort/models/cohort_result.py (1)
  • CohortResult (9-21)
cohort/models/fhir_filter.py (1)
  • FhirFilter (9-26)
exports/tools.py (1)
  • get_export_by_id (7-11)
exports/models/export.py (1)
  • Export (16-46)
cohort/services/cohort_result.py (2)
  • create_cohort_subset (39-71)
  • handle_cohort_creation (82-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test
🔇 Additional comments (2)
exporters/tests/test_hive_exporter.py (1)

5-5: Good improvement in error handling specificity!

The changes from generic RequestException to specific exceptions (CreateHiveDBException and HiveDBOwnershipException) improve error handling clarity and make the tests more precise about what type of failures they're testing.

Also applies to: 71-72, 81-82

exporters/exporters/hive_exporter.py (1)

66-76: Excellent improvements to error handling and job tracking!

The changes properly save the job ID for tracking, provide specific exception handling, and ensure the Hive DB is successfully created before proceeding. The error messages are also descriptive and helpful for debugging.

@thicham43 thicham43 force-pushed the 3065_rearrange_export_tasks branch 2 times, most recently from b7dceb2 to e2e0a26 Compare July 31, 2025 14:19
@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants