-
Notifications
You must be signed in to change notification settings - Fork 1
refactor(exports): rearrange code and chain tasks #492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis update introduces a major refactor and modularization of the export workflow, transitioning from synchronous, inline logic to a robust, asynchronous Celery task chain architecture. Export table creation, cohort subset management, export launching, job tracking, and notification handling are now implemented as discrete, reusable Celery tasks. The export service and exporters are refactored for clearer validation, error handling, and extensibility. Related tests are updated, removed, or added to align with the new workflow, and supporting modules are introduced for common tools and exceptions. Changes
Sequence Diagram(s)High-Level Export Workflow (New)sequenceDiagram
participant User
participant API/View
participant ExportService
participant Celery
participant Exporter
participant DB
User->>API/View: Create Export Request
API/View->>ExportService: validate(data)
API/View->>Celery: on_commit(proceed_with_export)
Celery->>ExportService: proceed_with_export(export, tables, auth_headers)
ExportService->>Celery: Chain of Tasks (notify, create cohorts, create tables, await, prep, send, track, finalize, notify success/failure)
loop Export Task Chain
Celery->>DB: Read/Write Export, Cohort, Tables
Celery->>Exporter: prepare/send/track/finalize
Exporter->>DB: Update Export Status
end
Celery->>API/View: Completion/Failure Callback (async)
Export Retry FlowsequenceDiagram
participant User
participant API/View
participant ExportService
participant Celery
User->>API/View: Retry Export (as Admin)
API/View->>ExportService: retry(export, auth_headers)
ExportService->>Celery: Chain of Tasks (restore tables, relaunch cohorts if needed, continue export chain)
Celery->>ExportService: proceed_with_export (as above)
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90+ minutes Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (44)
💤 Files with no reviewable changes (6)
✅ Files skipped from review due to trivial changes (4)
🚧 Files skipped from review as they are similar to previous changes (29)
🧰 Additional context used🧬 Code Graph Analysis (2)exporters/exporters/base_exporter.py (3)
exporters/exporters/hive_exporter.py (5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (21)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
🧹 Nitpick comments (15)
exports/tests/test_view_datalab.py (1)
46-52: LGTM! Test correctly validates the new naming constraints.The test appropriately verifies that datalab creation fails with HTTP 400 for names containing spaces or hyphens, which aligns with the updated model validation.
Consider adding a test case with a valid name to ensure the validation doesn't have false positives:
def test_create_datalab_with_invalid_names(self): create_url = reverse(viewname=self.viewname_list) for name in ["name-with-hyphens", "name with spaces"]: self.check_test_create_view(request_user=self.datalabs_manager_user, create_url=create_url, request_data={"name": name, "infrastructure_provider": self.infra_provider_aphp.uuid}, expected_resp_status=status.HTTP_400_BAD_REQUEST) + + # Verify valid names still work + self.check_test_create_view(request_user=self.datalabs_manager_user, + create_url=create_url, + request_data={"name": "valid_name_123", "infrastructure_provider": self.infra_provider_aphp.uuid}, + expected_resp_status=status.HTTP_201_CREATED)exports/export_tasks/logs.py (1)
9-10: LGTM! Improved error handling with helper function.Using
get_export_by_idinstead of direct database query provides better error handling with a descriptive error message for missing exports.Consider adding logging for consistency with other export tasks:
+import logging + +_logger = logging.getLogger(__name__) + @shared_task def get_logs(export_id: str) -> dict: + _logger.info(f"Getting logs for export {export_id}") export = get_export_by_id(export_id) return BaseExporter().export_api.get_export_logs(job_id=export.request_job_id)exporters/exceptions.py (1)
1-10: LGTM: Well-defined custom exceptions.The custom exception classes are properly implemented and provide better error specificity than generic exceptions. The naming is clear and descriptive.
Consider adding docstrings or additional context attributes in the future to make these exceptions more informative for debugging purposes.
exports/export_tasks/notify_received.py (2)
9-9: Consider using standard logger naming convention.Using
'info'as the logger name is unconventional. Consider using__name__or the module name for better logging hierarchy and debugging.-_logger = logging.getLogger('info') +_logger = logging.getLogger(__name__)
13-13: Consider making cohort parameters optional.The
cohort_idandcohort_nameparameters might not always be available depending on the export context. Consider making them optional with default values.-def notify_export_received(export_id: str, cohort_id: str, cohort_name: str) -> None: +def notify_export_received(export_id: str, cohort_id: str = None, cohort_name: str = None) -> None:exports/export_tasks/create_tables.py (1)
32-32: Remove unnecessary explicit return.The explicit
return Noneis unnecessary as Python functions returnNoneby default when no return statement is present.- return Noneexports/export_tasks/notify_succeeded.py (3)
24-25: Use more explicit conditional expressions for cohort data.The current pattern
cohort and cohort.group_id or Nonecould be clearer and more explicit.- cohort_id=cohort and cohort.group_id or None, - cohort_name=cohort and cohort.name or None, + cohort_id=cohort.group_id if cohort else None, + cohort_name=cohort.name if cohort else None,
32-33: Consider broader exception handling for email notifications.Only catching
OSErrormight be too narrow. Email sending could fail due to various reasons including network issues, authentication problems, or template errors.- except OSError: - _logger.error(f"[Export {export_id}] Error sending export success email") + except (OSError, Exception) as e: + _logger.error(f"[Export {export_id}] Error sending export success email: {e}")
31-31: Fix logging inconsistency between export.uuid and export_id.Line 31 uses
export.uuidwhile the rest of the function usesexport_idfor logging context.- _logger.info(f"Export[{export.uuid}] Export success email sent") + _logger.info(f"Export[{export_id}] Export success email sent")exports/export_tasks/send.py (1)
61-61: Set failure_reason to None explicitly on success path.The function returns
failure_reasonbut it's not explicitly set toNoneon the success path, relying on the input parameter value._logger.info(f"Export[{export_id}] Request sent, job `{job_id}` is now {JobStatus.pending}") + failure_reason = None return failure_reasoncohort/tests/tests_view_dated_measure.py (1)
120-124: Remove unused mock_header_called parameter from DMCreateCase.The
mock_header_calledparameter is no longer used since the authorization header mocking was removed, but it's still defined in the class constructor and passed in test cases.class DMCreateCase(CreateCase): - def __init__(self, mock_header_called: bool, mock_count_task_called: bool, mock_cancel_task_called: bool, **kwargs): + def __init__(self, mock_count_task_called: bool, mock_cancel_task_called: bool, **kwargs): super(DMCreateCase, self).__init__(**kwargs) - self.mock_header_called = mock_header_called self.mock_count_task_called = mock_count_task_called self.mock_cancel_task_called = mock_cancel_task_calledYou'll also need to update the test case definitions that pass
mock_header_called=True/False.exports/export_tasks/await_cohorts.py (1)
31-35: Consider using JobStatus.is_end_state property.The code could leverage the
is_end_stateproperty from theJobStatusenum to handle terminal states more comprehensively.for c in cohort_subsets: - if c.request_job_status == JobStatus.failed: + if c.request_job_status.is_end_state: + if c.request_job_status == JobStatus.failed: + failure_reason = "One or more cohort subsets have failed" + _logger.info(f"Export[{export_id}] Aborting export... reason: {failure_reason}") + return failure_reason + elif c.request_job_status == JobStatus.finished: + finished_cohorts += 1 + # Handle other end states like cancelled/unknown as failures + elif c.request_job_status in [JobStatus.cancelled, JobStatus.unknown]: failure_reason = "One or more cohort subsets have failed" _logger.info(f"Export[{export_id}] Aborting export... reason: {failure_reason}") return failure_reason - if c.request_job_status == JobStatus.finished: - finished_cohorts += 1exports/export_tasks/notify_failed.py (1)
15-27: Good notification logic with minor improvement opportunity.The function properly handles notification data construction and error handling. However, consider handling broader exceptions during email operations.
try: push_email_notification(base_notification=export_failed_notif_for_owner, **notification_data) -except OSError: +except (OSError, Exception) as e: - _logger.error(f"Export[{export.pk}] Error sending export failure notification") + _logger.error(f"Export[{export.pk}] Error sending export failure notification: {e}") else: export.is_user_notified = True export.save()cohort/tests/tests_view_cohort_result.py (1)
237-237: Fix type annotation inconsistency.The type annotation
other_view: anyshould useAny(capitalized) to be consistent with the import fromtypingon line 3.- other_view: any, view_kwargs: dict): + other_view: Any, view_kwargs: dict):exports/services/export.py (1)
37-54: Simplify export type handling.Since
ExportTypesinherits fromStrEnum, the.valueconversion is unnecessary.- export_type = ExportTypes(export_type).value + export_type = ExportTypes(export_type)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (45)
cohort/services/cohort_result.py(5 hunks)cohort/services/dated_measure.py(2 hunks)cohort/services/feasibility_study.py(2 hunks)cohort/tests/test_service_cohort_result.py(1 hunks)cohort/tests/tests_view_cohort_result.py(1 hunks)cohort/tests/tests_view_dated_measure.py(1 hunks)cohort/views/cohort_result.py(2 hunks)cohort/views/dated_measure.py(2 hunks)cohort/views/feasibility_study.py(2 hunks)exporters/enums.py(2 hunks)exporters/exceptions.py(1 hunks)exporters/exporters/base_exporter.py(3 hunks)exporters/exporters/csv_exporter.py(1 hunks)exporters/exporters/hive_exporter.py(3 hunks)exporters/exporters/xlsx_exporter.py(1 hunks)exporters/notifications.py(4 hunks)exporters/tasks.py(0 hunks)exporters/tests/test_csv_exporter.py(0 hunks)exporters/tests/test_hive_exporter.py(2 hunks)exporters/tests/test_notifying_tasks.py(1 hunks)exporters/tests/test_xlsx_exporter.py(0 hunks)exports/export_tasks/__init__.py(1 hunks)exports/export_tasks/await_cohorts.py(1 hunks)exports/export_tasks/create_cohorts.py(1 hunks)exports/export_tasks/create_tables.py(1 hunks)exports/export_tasks/finalize.py(1 hunks)exports/export_tasks/logs.py(1 hunks)exports/export_tasks/notify_failed.py(1 hunks)exports/export_tasks/notify_received.py(1 hunks)exports/export_tasks/notify_succeeded.py(1 hunks)exports/export_tasks/prep.py(1 hunks)exports/export_tasks/send.py(1 hunks)exports/export_tasks/track.py(1 hunks)exports/models/datalab.py(1 hunks)exports/services/export.py(1 hunks)exports/services/export_operators.py(0 hunks)exports/tasks.py(1 hunks)exports/tests/test_export_manager.py(0 hunks)exports/tests/test_export_tasks.py(1 hunks)exports/tests/test_exporters_loader.py(1 hunks)exports/tests/test_service_export.py(0 hunks)exports/tests/test_view_datalab.py(1 hunks)exports/tests/test_view_export.py(1 hunks)exports/tools.py(1 hunks)exports/views/export.py(3 hunks)
💤 Files with no reviewable changes (6)
- exporters/tests/test_csv_exporter.py
- exports/services/export_operators.py
- exporters/tests/test_xlsx_exporter.py
- exports/tests/test_service_export.py
- exporters/tasks.py
- exports/tests/test_export_manager.py
🧰 Additional context used
🧬 Code Graph Analysis (27)
exports/tasks.py (1)
exports/services/export_operators.py (1)
ExportCleaner(70-96)
exporters/exporters/xlsx_exporter.py (1)
exporters/enums.py (1)
ExportTypes(6-22)
cohort/views/feasibility_study.py (2)
cohort/services/utils.py (1)
get_authorization_header(68-72)cohort/services/feasibility_study.py (1)
handle_feasibility_study_count(52-61)
exports/tests/test_view_datalab.py (1)
exports/tests/base_test.py (1)
check_test_create_view(92-95)
exporters/tests/test_hive_exporter.py (1)
exporters/exceptions.py (1)
HiveDBOwnershipException(9-10)
cohort/views/cohort_result.py (2)
cohort/services/utils.py (1)
get_authorization_header(68-72)cohort/services/cohort_result.py (1)
handle_cohort_creation(82-105)
exports/export_tasks/logs.py (4)
exporters/exporters/base_exporter.py (1)
BaseExporter(22-116)exports/tools.py (1)
get_export_by_id(7-11)exports/tasks.py (1)
get_logs(20-22)exporters/apis/base.py (1)
get_export_logs(20-28)
exports/export_tasks/notify_received.py (3)
exports/emails.py (1)
push_email_notification(17-19)exports/tools.py (2)
get_export_by_id(7-11)get_selected_tables(21-22)admin_cohort/models/user.py (1)
display_name(22-24)
cohort/tests/test_service_cohort_result.py (1)
cohort/services/cohort_result.py (1)
create_cohort_subset(39-71)
cohort/views/dated_measure.py (2)
cohort/services/utils.py (2)
await_celery_task(22-35)get_authorization_header(68-72)cohort/services/dated_measure.py (1)
handle_count(14-26)
exports/export_tasks/notify_succeeded.py (3)
exports/emails.py (1)
push_email_notification(17-19)exports/tools.py (3)
get_export_by_id(7-11)get_cohort(15-17)get_selected_tables(21-22)admin_cohort/models/user.py (1)
display_name(22-24)
exporters/exporters/csv_exporter.py (1)
exporters/enums.py (1)
ExportTypes(6-22)
exports/export_tasks/finalize.py (4)
exporters/enums.py (1)
ExportTypes(6-22)exporters/exceptions.py (1)
HiveDBOwnershipException(9-10)exporters/exporters/hive_exporter.py (1)
change_db_ownership(78-83)exports/tools.py (1)
get_export_by_id(7-11)
exporters/enums.py (1)
admin_cohort/types.py (2)
StrEnum(24-32)JobStatus(35-53)
cohort/services/feasibility_study.py (1)
cohort/services/utils.py (1)
ServerError(18-19)
exports/export_tasks/notify_failed.py (6)
admin_cohort/types.py (1)
JobStatus(35-53)exporters/notifications.py (2)
export_failed_notif_for_owner(48-59)export_failed_notif_for_admins(62-72)exports/emails.py (1)
push_email_notification(17-19)exports/models/export.py (1)
Export(16-46)exports/tools.py (2)
get_export_by_id(7-11)get_cohort(15-17)admin_cohort/models/user.py (1)
display_name(22-24)
exports/export_tasks/track.py (3)
admin_cohort/types.py (1)
JobStatus(35-53)exporters/exporters/base_exporter.py (2)
BaseExporter(22-116)track_job(92-112)exports/tools.py (1)
get_export_by_id(7-11)
exports/export_tasks/send.py (4)
admin_cohort/types.py (1)
JobStatus(35-53)exporters/exporters/base_exporter.py (1)
send_export(82-90)exports/tools.py (1)
get_export_by_id(7-11)exports/models/export.py (1)
target_full_path(36-39)
exports/export_tasks/create_tables.py (3)
cohort/models/cohort_result.py (1)
CohortResult(9-21)exports/tools.py (1)
get_export_by_id(7-11)exports/models/export_table.py (1)
ExportTable(9-22)
exports/tools.py (1)
exports/models/export.py (1)
Export(16-46)
exports/export_tasks/await_cohorts.py (2)
admin_cohort/types.py (1)
JobStatus(35-53)exports/tools.py (1)
get_export_by_id(7-11)
cohort/tests/tests_view_dated_measure.py (2)
cohort/tests/tests_view_cohort_result.py (1)
check_create_case_with_mock(236-253)cohort/tests/tests_view_rqs.py (1)
check_create_case_with_mock(214-217)
cohort/services/dated_measure.py (3)
cohort/services/utils.py (1)
ServerError(18-19)cohort/models/cohort_result.py (1)
CohortResult(9-21)cohort/views/dated_measure.py (1)
create(45-52)
cohort/services/cohort_result.py (2)
cohort/services/utils.py (1)
ServerError(18-19)cohort/models/cohort_result.py (1)
CohortResult(9-21)
exports/export_tasks/create_cohorts.py (5)
cohort/models/cohort_result.py (1)
CohortResult(9-21)cohort/models/fhir_filter.py (1)
FhirFilter(9-26)exports/tools.py (1)
get_export_by_id(7-11)exports/models/export.py (1)
Export(16-46)cohort/services/cohort_result.py (2)
create_cohort_subset(39-71)handle_cohort_creation(82-105)
exporters/exporters/hive_exporter.py (5)
exporters/exceptions.py (3)
InvalidJobId(1-2)CreateHiveDBException(5-6)HiveDBOwnershipException(9-10)exporters/enums.py (2)
ExportTypes(6-22)APIJobType(25-27)exporters/exporters/base_exporter.py (2)
log_export_task(115-116)track_job(92-112)admin_cohort/types.py (1)
JobStatus(35-53)exports/models/export.py (1)
Export(16-46)
exporters/exporters/base_exporter.py (3)
exports/models/export.py (1)
Export(16-46)exporters/enums.py (1)
APIJobType(25-27)admin_cohort/types.py (2)
JobStatus(35-53)is_end_state(49-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (61)
exports/tasks.py (1)
8-8: No remaining references tolaunch_export_taskfoundI ran
rg "launch_export_task"and anast-grepsearch across all Python files—no hits. It appears the oldlaunch_export_taskcall has been fully removed.Please manually verify:
- That all orchestration or CI workflows have been updated to use the new
exports/export_tasksmodules.- There are no hard-coded strings, doc comments, or config entries still referring to
launch_export_task.exporters/exporters/xlsx_exporter.py (1)
9-9: All exporters use StrEnum correctly – no.valueusage remainsI’ve confirmed that in all three exporter classes the
self.typeassignments now useExportTypes.<TYPE>directly without.value:
- exporters/exporters/csv_exporter.py:
self.type = ExportTypes.CSV- exporters/exporters/hive_exporter.py:
self.type = ExportTypes.HIVE- exporters/exporters/xlsx_exporter.py:
self.type = ExportTypes.XLSXNo further updates are required here.
cohort/views/feasibility_study.py (2)
15-15: LGTM! Consistent import addition for authorization header extraction.The import of
get_authorization_headeraligns with the broader refactoring to decouple service methods from HTTP request objects.
52-54: LGTM! Authorization headers properly extracted and passed to service.The refactoring correctly extracts authorization headers using
get_authorization_header(request)and passes them explicitly tofeasibility_study_service.handle_feasibility_study_count. This eliminates the service's dependency on the full HTTP request object, improving modularity and testability.cohort/tests/test_service_cohort_result.py (1)
33-39: LGTM! Test simplified by removing unnecessary mocking.The test now directly passes an empty
auth_headersdictionary instead of mockingget_authorization_header, which aligns with the updated service method signature. This simplification improves test clarity while maintaining coverage of the core functionality.cohort/views/cohort_result.py (2)
19-19: LGTM! Consistent import addition for authorization header handling.The import of
get_authorization_headerfollows the established pattern for decoupling service methods from HTTP request objects.
119-122: LGTM! Authorization headers properly extracted and passed to cohort service.The refactoring correctly extracts authorization headers and passes them explicitly to
cohort_service.handle_cohort_creation. This maintains the same functionality while improving service layer decoupling from HTTP request objects.cohort/views/dated_measure.py (2)
13-13: LGTM! Authorization header utility imported consistently.The addition of
get_authorization_headerto the existing import statement follows the established refactoring pattern.
47-51: LGTM! Authorization headers and stage details properly extracted and passed.The refactoring correctly extracts authorization headers and stage details from the request, passing them explicitly to
dm_service.handle_count. This eliminates the service's dependency on the HTTP request object while preserving all necessary data for processing.exporters/notifications.py (4)
8-10: LGTM! Cleaner enum usage by using members directly.Using
ExportTypes.CSVinstead ofExportTypes.CSV.valueis more type-safe and cleaner. The enum members can be used directly in most contexts where their string values were previously needed.
31-31: LGTM! Consistent formatting with double quotes.The switch from backticks to double quotes around the cohort name provides consistent formatting across notification messages.
49-49: LGTM! Consistent formatting applied.The formatting change maintains consistency with other notification messages by using double quotes around the cohort name.
76-76: LGTM! Consistent string formatting.The use of double quotes around the cohort name maintains formatting consistency across all notification functions.
exporters/tests/test_hive_exporter.py (2)
5-5: LGTM: Proper import of custom exception.The import of
HiveDBOwnershipExceptionis correctly added to support the updated test expectations.
81-81: Excellent: More specific exception handling.The test now expects
HiveDBOwnershipExceptioninstead of the genericRequestException, which provides better error specificity and aligns with the new custom exception defined inexporters/exceptions.py.exporters/exporters/csv_exporter.py (1)
14-14: LGTM: Consistent enum usage.The change from
ExportTypes.CSV.valuetoExportTypes.CSValigns with the enum refactoring whereExportTypesnow inherits fromStrEnum, making the direct enum member usage appropriate.cohort/services/feasibility_study.py (2)
18-18: LGTM: Proper import of exception class.Adding the
ServerErrorimport aligns with the error handling in the service method.
52-57: Excellent: Decoupling from HTTP request objects.The refactoring to accept explicit
auth_headersinstead of the fullrequestobject improves separation of concerns and makes the service method more testable and reusable. The parameter usage is consistent throughout the method.exporters/enums.py (3)
1-1: LGTM: Simplified import.Removing the unused
Enumimport and keeping onlyStrEnumis appropriate for the refactoring.
6-6: LGTM: Consistent enum inheritance.Changing to
StrEnuminheritance enables direct enum member usage as strings throughout the codebase.
44-55: LGTM: Consistent enum member usage in mapper.Using
APIJobStatusenum members directly as keys instead of their.valuestrings is consistent with theStrEnumrefactoring and improves code clarity.exports/export_tasks/prep.py (1)
17-31: LGTM! Excellent error handling and early return pattern.The implementation demonstrates good practices:
- Early return when failure_reason is already set prevents unnecessary work
- Specific exception handling for Hive operations
- Appropriate logging throughout the process
- Clean separation of concerns for Hive-specific preparation
exports/export_tasks/__init__.py (1)
1-25: LGTM! Well-structured package initialization.The package API is cleanly organized with descriptive function names that clearly represent the export workflow steps. The
__all__list properly matches the imports, ensuring a clean public interface.exports/export_tasks/track.py (1)
15-27: LGTM! Clean and straightforward job tracking implementation.The task follows the established pattern with early failure detection, proper logging, and clear success/failure determination. The logic correctly treats any non-finished status as a failure condition.
cohort/tests/tests_view_dated_measure.py (1)
171-172: LGTM! Authorization header mocking removal aligns with refactoring.The removal of
get_authorization_headermocking is consistent with the broader refactoring where authorization header extraction is moved to the view layer and services now accept explicitauth_headersparameters.exports/export_tasks/await_cohorts.py (1)
15-18: LGTM! Good early return pattern.The early return when
failure_reasonis not None is appropriate for the task chain pattern, allowing failed upstream tasks to short-circuit subsequent operations.exports/views/export.py (4)
21-21: LGTM! Good import for header extraction utility.The import of
get_authorization_headeraligns with the refactoring to decouple service logic from HTTP request objects.
131-134: Excellent refactoring to use authorization headers.The change to extract and pass authorization headers instead of the full request object is a good architectural improvement that:
- Decouples service logic from HTTP request objects
- Makes the authorization context explicit
- Improves testability and modularity
155-156: Consistent authorization header handling in retry.Good consistency in applying the same authorization header extraction pattern to the retry operation.
125-125: Method name change confirmed consistent across the codebaseI’ve searched for any remaining uses of
validate_export_dataand found none. The switch toexport_service.validate(...)is applied everywhere—approving the change.exports/tests/test_view_export.py (3)
180-184: Good permission testing.Proper test coverage for the permission enforcement on retry operations. The test correctly expects a 403 response for non-admin users.
186-194: Excellent service-level mocking.The test properly mocks the
export_service.retrymethod rather than internal task implementation details. This makes the tests more maintainable and focused on the view's responsibilities.
196-203: Good edge case coverage.Testing the scenario where a user tries to retry a finished export ensures proper validation and error handling in the view layer.
exports/export_tasks/notify_failed.py (1)
29-41: Admin notification function looks good.The function properly constructs comprehensive notification data for administrators including export metadata and system context.
exports/tests/test_exporters_loader.py (3)
21-27: Good error condition testing.The test properly verifies that an
ImproperlyConfiguredexception is raised for invalid export types, ensuring robust validation in the loader.
29-32: Appropriate empty configuration testing.Testing the empty exporters list scenario ensures the system fails fast with proper error messaging when misconfigured.
34-39: Comprehensive positive testing.The test effectively verifies both the presence of expected export type keys and the correct loading of exporter classes, providing good coverage of the loading mechanism.
cohort/services/dated_measure.py (2)
14-26: LGTM! Clean refactoring to decouple from request object.The refactoring to accept explicit
auth_headersinstead of therequestobject improves testability and modularity. The method correctly passes the auth headers to the Celery task and maintains proper error handling.
28-42: LGTM! Consistent with the refactoring pattern.The method signature change aligns with the broader refactoring effort. The implementation correctly uses the cohort's owner and request_query_snapshot fields, and properly passes auth_headers to the task.
exporters/tests/test_notifying_tasks.py (2)
8-8: LGTM! Import updated to reflect new module structure.The import statement correctly references the new export tasks module structure that replaces the old
exporters.tasksmodule.
16-55: LGTM! Test methods properly updated for new task structure.All test methods have been correctly updated to:
- Use new module paths in
@mock.patchdecorators- Call the renamed tasks (e.g.,
mark_export_as_failedinstead ofnotify_export_failed)- Maintain the same test logic and assertions
The refactoring maintains test coverage while adapting to the new modular task structure.
cohort/tests/tests_view_cohort_result.py (1)
236-253: LGTM! Test simplification aligns with refactoring.The removal of
get_authorization_headermocking is consistent with the broader refactoring where authorization headers are now passed explicitly rather than extracted from request objects. The test logic remains intact and focused on the core functionality.cohort/services/cohort_result.py (2)
39-71: LGTM! Well-executed refactoring to decouple from request object.The method signature change from
requesttoauth_headers: dictis clean and consistent with the broader refactoring. The implementation correctly passes the auth headers tohandle_cohort_creationand maintains all existing functionality.
82-105: LGTM! Good improvement with default parameter.The refactoring correctly:
- Accepts explicit
auth_headersinstead of request object- Adds a sensible default value for
global_estimate=False- Passes auth headers to both
dm_service.handle_global_countand thecreate_cohort.stask- Maintains proper error handling
exports/export_tasks/create_cohorts.py (5)
14-19: LGTM! Well-defined constants for table categorization.The constants clearly define which tables require sub-cohorts and which should be excluded from processing. This makes the logic explicit and maintainable.
25-32: LGTM! Clean implementation of DocumentReference filter encoding.The function properly URL-encodes the filter values and uses a dictionary-based approach that's easy to understand and maintain. The constant mapping provides clear association between table names and their filters.
35-41: LGTM! Auto-generated filter creation with proper naming.The function correctly creates auto-generated FHIR filters with descriptive names that include the export UUID prefix and table name. The use of the first 8 characters of the UUID provides a reasonable balance between uniqueness and readability.
44-64: LGTM! Well-structured task with proper conditionals.The task correctly:
- Handles table processing with appropriate conditions
- Auto-generates filters when needed for specific tables
- Excludes inappropriate tables from processing
- Uses proper logging for monitoring
- Returns a useful mapping of created cohort subsets
67-78: LGTM! Robust retry logic with proper error handling.The task implements good retry logic:
- Iterates through failed cohort subsets
- Catches specific
ServerErrorexceptions- Logs both successes and failures appropriately
- Returns failure reason for upstream handling
- Breaks on first failure to avoid cascading issues
exporters/exporters/hive_exporter.py (4)
9-9: LGTM!Good use of custom exceptions for better error handling and clarity.
21-21: LGTM!Better to use the enum member directly since
ExportTypesinherits fromStrEnumand handles string conversion automatically.
78-83: LGTM!Good improvement using a specific exception type for better error categorization.
73-73: Fix parameter passing to track_job method.The
job_typeshould be passed as a keyword argument, not positional.- job_status = self.track_job(export=export, job_type=APIJobType.HIVE_DB_CREATE) + job_status = self.track_job(export, job_type=APIJobType.HIVE_DB_CREATE)Likely an incorrect or invalid review comment.
exporters/exporters/base_exporter.py (2)
31-33: LGTM!Good validation to ensure FHIR filters are only applied when a cohort source is provided.
92-112: LGTM!Good refactoring to simplify the method's responsibility. The polling interval and error handling are appropriate.
exports/tests/test_export_tasks.py (3)
15-41: Well-structured test setup!Good use of
CELERY_TASK_ALWAYS_EAGERfor synchronous task execution in tests and comprehensive model setup.
194-194: LGTM!Clear assertion with good documentation of the expected behavior.
203-319: Excellent test coverage for retry logic!Comprehensive coverage of retry scenarios including API-side failures, failed cohort subsets, and soft-deleted table restoration.
exports/services/export.py (3)
77-101: Excellent task chain implementation!Well-documented with clear warnings about modification. Proper use of
.s()and.si()for task chaining.
108-162: Robust retry implementation!Excellent handling of different failure scenarios with proper restoration of soft-deleted tables and conditional cohort subset relaunching.
165-171: LGTM!Good error handling with appropriate logging before re-raising exceptions.
1ede3be to
efcbcb7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
exports/services/export.py (1)
127-128: Add safety check for empty export tables list.Similar to the previous comment, accessing the first element without checking if the list is empty could cause an IndexError.
source_cohorts_ids = [t.cohort_result_source_id for t in export.export_tables.all()] + if not source_cohorts_ids: + raise ValueError("No source cohorts found for export tables") main_cohort = CohortResult.objects.filter(uuid=source_cohorts_ids[0]).only("group_id", "name").first()
🧹 Nitpick comments (3)
exports/export_tasks/create_cohorts.py (1)
25-30: Consider URL-encoding the keys as well for robustness.While the current keys don't contain special characters, it's safer to encode both keys and values to handle any future changes.
def get_encoded_doc_ref_filter() -> str: filter_values = {"type:not": "https://terminology.eds.aphp.fr/aphp-orbis-document-textuel-hospitalier|doc-impor", "contenttype": "text/plain" } - return "&".join([f"{key}={quote_plus(val)}" for key, val in filter_values.items()]) + return "&".join([f"{quote_plus(key)}={quote_plus(val)}" for key, val in filter_values.items()])exports/tests/test_export_tasks.py (1)
194-194: Consider making the notification count assertion more explicit.The comment mentions notifications for owner and admins, but the assertion just checks count=2. Consider asserting the specific recipients or adding constants for clarity.
- self.assertEqual(mock_push_failure_notif.call_count, 2) # notif for owner and for admins + # Notifications should be sent to both owner and admins + EXPECTED_FAILURE_NOTIFICATIONS = 2 # 1 for owner, 1 for admins + self.assertEqual(mock_push_failure_notif.call_count, EXPECTED_FAILURE_NOTIFICATIONS)exports/services/export.py (1)
68-74: Consider preserving specific exception types in validation.The broad
Exceptioncatch might hide important error details. Consider catching and re-raisingValidationErrorwhile wrapping other exceptions.def validate(self, data: Dict, **kwargs) -> None: exporter = self._get_exporter(data.get("output_format")) try: exporter().validate(export_data=data, **kwargs) + except ValidationError: + raise except Exception as e: raise ValidationError(f'Invalid export data: {e}')
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (45)
cohort/services/cohort_result.py(5 hunks)cohort/services/dated_measure.py(2 hunks)cohort/services/feasibility_study.py(2 hunks)cohort/tests/test_service_cohort_result.py(1 hunks)cohort/tests/tests_view_cohort_result.py(1 hunks)cohort/tests/tests_view_dated_measure.py(1 hunks)cohort/views/cohort_result.py(2 hunks)cohort/views/dated_measure.py(2 hunks)cohort/views/feasibility_study.py(2 hunks)exporters/enums.py(2 hunks)exporters/exceptions.py(1 hunks)exporters/exporters/base_exporter.py(3 hunks)exporters/exporters/csv_exporter.py(1 hunks)exporters/exporters/hive_exporter.py(3 hunks)exporters/exporters/xlsx_exporter.py(1 hunks)exporters/notifications.py(4 hunks)exporters/tasks.py(0 hunks)exporters/tests/test_csv_exporter.py(0 hunks)exporters/tests/test_hive_exporter.py(3 hunks)exporters/tests/test_notifying_tasks.py(1 hunks)exporters/tests/test_xlsx_exporter.py(0 hunks)exports/export_tasks/__init__.py(1 hunks)exports/export_tasks/await_cohorts.py(1 hunks)exports/export_tasks/create_cohorts.py(1 hunks)exports/export_tasks/create_tables.py(1 hunks)exports/export_tasks/finalize.py(1 hunks)exports/export_tasks/logs.py(1 hunks)exports/export_tasks/notify_failed.py(1 hunks)exports/export_tasks/notify_received.py(1 hunks)exports/export_tasks/notify_succeeded.py(1 hunks)exports/export_tasks/prep.py(1 hunks)exports/export_tasks/send.py(1 hunks)exports/export_tasks/track.py(1 hunks)exports/models/datalab.py(1 hunks)exports/services/export.py(1 hunks)exports/services/export_operators.py(0 hunks)exports/tasks.py(1 hunks)exports/tests/test_export_manager.py(0 hunks)exports/tests/test_export_tasks.py(1 hunks)exports/tests/test_exporters_loader.py(1 hunks)exports/tests/test_service_export.py(0 hunks)exports/tests/test_view_datalab.py(1 hunks)exports/tests/test_view_export.py(1 hunks)exports/tools.py(1 hunks)exports/views/export.py(3 hunks)
💤 Files with no reviewable changes (6)
- exporters/tests/test_csv_exporter.py
- exports/services/export_operators.py
- exporters/tests/test_xlsx_exporter.py
- exports/tests/test_service_export.py
- exporters/tasks.py
- exports/tests/test_export_manager.py
✅ Files skipped from review due to trivial changes (5)
- exports/tasks.py
- cohort/views/feasibility_study.py
- exporters/exceptions.py
- exports/export_tasks/init.py
- exports/tools.py
🚧 Files skipped from review as they are similar to previous changes (29)
- cohort/tests/test_service_cohort_result.py
- exporters/exporters/xlsx_exporter.py
- exporters/notifications.py
- exporters/exporters/csv_exporter.py
- exports/models/datalab.py
- cohort/views/cohort_result.py
- exports/export_tasks/prep.py
- cohort/views/dated_measure.py
- exports/views/export.py
- exports/export_tasks/notify_received.py
- exports/tests/test_view_export.py
- exports/export_tasks/track.py
- exports/tests/test_exporters_loader.py
- exports/export_tasks/send.py
- exports/export_tasks/finalize.py
- exports/tests/test_view_datalab.py
- exporters/enums.py
- exports/export_tasks/notify_failed.py
- exports/export_tasks/logs.py
- cohort/services/feasibility_study.py
- exports/export_tasks/create_tables.py
- cohort/tests/tests_view_dated_measure.py
- cohort/tests/tests_view_cohort_result.py
- exports/export_tasks/notify_succeeded.py
- exports/export_tasks/await_cohorts.py
- exporters/tests/test_notifying_tasks.py
- exporters/exporters/base_exporter.py
- cohort/services/dated_measure.py
- cohort/services/cohort_result.py
🧰 Additional context used
🧬 Code Graph Analysis (3)
exporters/tests/test_hive_exporter.py (1)
exporters/exceptions.py (2)
HiveDBOwnershipException(9-10)CreateHiveDBException(5-6)
exporters/exporters/hive_exporter.py (5)
exporters/exceptions.py (3)
InvalidJobId(1-2)CreateHiveDBException(5-6)HiveDBOwnershipException(9-10)exporters/enums.py (2)
ExportTypes(6-22)APIJobType(25-27)exporters/exporters/base_exporter.py (2)
log_export_task(115-116)track_job(92-112)admin_cohort/types.py (1)
JobStatus(35-53)exports/models/export.py (1)
Export(16-46)
exports/export_tasks/create_cohorts.py (5)
cohort/models/cohort_result.py (1)
CohortResult(9-21)cohort/models/fhir_filter.py (1)
FhirFilter(9-26)exports/tools.py (1)
get_export_by_id(7-11)exports/models/export.py (1)
Export(16-46)cohort/services/cohort_result.py (2)
create_cohort_subset(39-71)handle_cohort_creation(82-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (2)
exporters/tests/test_hive_exporter.py (1)
5-5: Good improvement in error handling specificity!The changes from generic
RequestExceptionto specific exceptions (CreateHiveDBExceptionandHiveDBOwnershipException) improve error handling clarity and make the tests more precise about what type of failures they're testing.Also applies to: 71-72, 81-82
exporters/exporters/hive_exporter.py (1)
66-76: Excellent improvements to error handling and job tracking!The changes properly save the job ID for tracking, provide specific exception handling, and ensure the Hive DB is successfully created before proceeding. The error messages are also descriptive and helpful for debugging.
b7dceb2 to
e2e0a26
Compare
|



Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests
Chores