Corrected: Improved AQUA Error Messages for Authorization and Tag-Related Uses #1141

Merged
merged 10 commits into from
Apr 3, 2025

Conversation

Member

@elizjo elizjo commented Mar 31, 2025

Currently, the AQUA UI throws generic errors when auth issues arise, without informing the user of potential solutions or suggested policy additions that would fix the error.

Added a 'troubleshooting_tips' section (see the example JSON below).

  • Links to the specific section of our AQUA troubleshooting page, chosen by the operation name reported in the OCI Service Error.
    Result: the user gets a specific action to fix their auth issue.

Distinguished between auth errors and tag-related errors.

  • Certain operations can fail in two ways: auth related and tag related. We use the OCI Service Error to single out 'Invalid tag' errors for the create_model and update_model operations.
    Result: the user gets a different fix for a tag-related error than for an auth error.

Refactored write_error, used in base_handler.py and the WS handlers, into utils.py (the code common to these handlers now lives in a single file).

Passed all test cases.

{
  "status": 404,
  "troubleshooting_tips": "Unable to list model deployments. See tips for troubleshooting: https://github.com/oracle-samples/oci-data-science-ai-samples/blob/main/ai-quick-actions/troubleshooting-tips.md#list-model-deployments",
  "message": "Authorization Failed: The resource you're looking for isn't accessible. Operation Name: list_model_deployments.",
  "service_payload": {
    "target_service": "data_science",
    "status": 404,
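As a rough illustration of the two behaviours described above, the sketch below builds a troubleshooting link from the failed operation name and returns a different tip for 'Invalid tag' failures on create_model and update_model. The function names, the `TIPS` table, the wording of the tag tip, and the way a tag error is detected are assumptions for illustration, not the code merged in this PR. Only the base URL and the list_model_deployments message come from the example above.

```python
from typing import Optional

BASE_URL = (
    "https://github.com/oracle-samples/oci-data-science-ai-samples/"
    "blob/main/ai-quick-actions/troubleshooting-tips.md"
)

# Hypothetical subset of an operation -> tip prefix table.
TIPS = {
    "list_model_deployments": "Unable to list model deployments.",
    "create_model": "Unable to register model.",
    "update_model": "Unable to update model.",
}

# Operations that can fail for either an auth or a tag reason.
TAG_SENSITIVE_OPERATIONS = {"create_model", "update_model"}


def build_tip_link(operation_name: str) -> str:
    """Link to the page section named after the operation (underscores become hyphens)."""
    return f"{BASE_URL}#{operation_name.replace('_', '-')}"


def get_tip(operation_name: str, service_message: str = "") -> Optional[str]:
    """Return a troubleshooting tip, or None when the operation is unknown."""
    prefix = TIPS.get(operation_name)
    if prefix is None:
        return None
    # Assumption: a tag failure is recognised by text in the service error message.
    is_tag_error = "invalid tag" in service_message.lower()
    if operation_name in TAG_SENSITIVE_OPERATIONS and is_tag_error:
        return f"{prefix} The request contains an invalid tag; check the tags on the model."
    return f"{prefix} See tips for troubleshooting: {build_tip_link(operation_name)}"
```

With this shape, an unknown operation simply yields no tip, so the generic error can still be returned unchanged.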

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Mar 31, 2025
Copy link

📌 Cov diff with main:

Coverage-0%

📌 Overall coverage:

Coverage-19.29%

Copy link
Member

@VipulMascarenhas VipulMascarenhas left a comment

There's one test failing in test_model_handler.py, can you please fix it?


if operation_name:
    if operation_name.startswith("create"):
        return f"{messages['create']} Operation Name: {operation_name}."
Member

We need to check the status code before deciding that this is an auth failure. Create can also fail when we pass wrong input parameters. A wrong parameter usually happens because the user has selected a shape that is not GA in the chosen region.

Member Author

Addressed
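One way to read the review comment above: the operation-specific authorization message should be used only when the status code actually points at an auth problem, so that a create call rejected for bad input (for example, an unavailable shape) is not mislabelled. The sketch below assumes that 401 and 404 are the codes treated as authorization failures; the function name, the contents of `messages`, and the fallback text are illustrative assumptions, not the merged implementation.

```python
# Assumption: these status codes are the ones treated as authorization failures.
AUTH_STATUS_CODES = {401, 404}

# Hypothetical message table keyed by operation prefix.
messages = {
    "create": "Authorization Failed: Could not create resource.",
    "default": "Authorization Failed: The resource you're looking for isn't accessible.",
}


def get_error_message(
    status_code: int,
    operation_name: str = "",
    fallback: str = "Request failed.",
) -> str:
    """Use an auth-specific message only when the status code indicates an auth failure."""
    if status_code not in AUTH_STATUS_CODES:
        # For example, a 400 caused by a shape that is unavailable in the region.
        return fallback
    if operation_name:
        if operation_name.startswith("create"):
            return f"{messages['create']} Operation Name: {operation_name}."
        return f"{messages['default']} Operation Name: {operation_name}."
    return messages["default"]
```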

return tip


def construct_error(status_code, **kwargs) -> tuple[dict[str, int], str, str]:
Member

status_code is missing a type annotation.

Member Author

Addressed

return f"https://github.com/oracle-samples/oci-data-science-ai-samples/blob/main/ai-quick-actions/troubleshooting-tips.md#{github_header}"


def get_troubleshooting_tips(service_payload: str,
Member

payload is Dict?

Member Author

Addressed

"message": message,
"service_payload": service_payload,
"reason": reason,
"request_id": str(uuid.uuid4()),
Member

How does the request id help here?

@mrDzurb @VipulMascarenhas we should consider tracing our APIs so that we can use this request id for checking in lumberjack.

This is for later. For this PR we can leave this as is

Member

Not sure whether this request_id is used anywhere. Maybe we use it somehow on the UI side?

Member

The request id is logged in lumberjack, and we print it in the local notebook logs for errors or when the CLI is used, but it is not shown in the UI yet. We'll need a UI change to show it alongside the error message.
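A minimal sketch of the behaviour described in this thread, assuming a plain `logging` logger stands in for the real logging setup: a fresh request id is generated for each error, written to the log, and returned in the reply so that a later UI change could display it. `build_error_reply` and the logger name are hypothetical.

```python
import logging
import uuid

# Hypothetical logger name; the project's actual logging setup is not shown here.
logger = logging.getLogger("aqua.example")


def build_error_reply(status_code: int, message: str, reason: str = "") -> dict:
    """Attach a unique request id to an error reply and log it for later lookup."""
    request_id = str(uuid.uuid4())
    logger.error(
        "Error Request ID: %s, status: %s, message: %s",
        request_id,
        status_code,
        message,
    )
    return {
        "status": status_code,
        "message": message,
        "reason": reason,
        "request_id": request_id,
    }
```

Because the same id appears in both the log line and the reply, a user-reported id can be matched to its log entry.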

github-actions bot commented Apr 2, 2025

📌 Cov diff with main:

Coverage-97%

📌 Overall coverage:

Coverage-58.82%

"put_object": "Unable to access or find Object Storage Bucket. See tips for troubleshooting: ",
"list_model_version_sets": "Unable to create or fetch model version set. See tips for troubleshooting:",
"update_model": "Unable to update model. See tips for troubleshooting: ",
"list_data_science_private_endpoints": "Unable to access specified Object Storage Bucket. See tips for troubleshooting: ",
Member

Is this the correct message for this key?

Member Author

Addressed

"list_model_version_sets": "Unable to create or fetch model version set. See tips for troubleshooting:",
"update_model": "Unable to update model. See tips for troubleshooting: ",
"list_data_science_private_endpoints": "Unable to access specified Object Storage Bucket. See tips for troubleshooting: ",
"create_model" : "Unable to register or create model. See tips for troubleshooting: ",
Member

I think we should stick with “register” here, using “create” might imply that registering and creating are two separate operations, which could be confusing.

Member Author

Addressed

"create_job_run": "Unable to create job run. See tips for troubleshooting: ",
}

ERROR_MESSAGES = {
Member

NIT: ERROR_MESSAGES might be too generic a name for this. How about STATUS_CODE_MESSAGE?

Member Author

Addressed

return tip


def construct_error(status_code: int, **kwargs) -> tuple[dict[str, int], str, str]:
Member

I think we could consider using a pydantic class instead of returning a tuple.

Something like:

class ErrorResponse(Serializable):
    """Structured error response returned to the client."""
    status: int = Field(..., description="HTTP status code.")
    message: str = Field(..., description="Error message.")
    request_id: str = Field(..., description="Unique ID for tracking the error.")
    reason: Optional[str] = Field(None, description="Reason for the error.")
    service_payload: Optional[Dict[str, Any]] = Field(default_factory=dict)
    troubleshooting_tips: Optional[List[str]] = Field(default_factory=list)

And

def construct_error(status_code: int, **kwargs) -> ErrorResponse:
    ...

Member Author

Added ReplyDetails Pydantic class
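To illustrate the suggestion above without depending on the project's `Serializable` base class or on Pydantic, the sketch below uses a standard-library dataclass as a stand-in. The class name `ErrorReply`, the field defaults, and the body of `construct_error` are assumptions; the `ReplyDetails` class added in this PR is not shown in the thread and may differ.

```python
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ErrorReply:
    """Stand-in for a structured error response (hypothetical, not ReplyDetails)."""

    status: int
    message: str
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    reason: Optional[str] = None
    service_payload: Dict[str, Any] = field(default_factory=dict)
    troubleshooting_tips: Optional[str] = None


def construct_error(status_code: int, **kwargs: Any) -> ErrorReply:
    """Return named fields instead of a positional tuple."""
    return ErrorReply(
        status=status_code,
        message=kwargs.get("message", "Unknown HTTP Error."),
        reason=kwargs.get("reason"),
        service_payload=kwargs.get("service_payload") or {},
    )
```

Callers can then read `reply.message` or `reply.service_payload` by name rather than unpacking by position, and `asdict(reply)` gives a dictionary that is ready to serialise as JSON.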

github-actions bot commented Apr 2, 2025

📌 Cov diff with main:

Coverage-97%

📌 Overall coverage:

Coverage-58.82%

github-actions bot commented Apr 2, 2025

📌 Cov diff with main:

Coverage-97%

📌 Overall coverage:

Coverage-58.83%

mrDzurb previously approved these changes Apr 2, 2025
- message (str, optional): A custom error message, from error raised from failed AQUA methods calling OCI SDK methods
- exc_info (tuple, optional): Exception information (e.g., from `sys.exc_info()`), used for logging.

Returns:
Member

NIT: The docstring I guess needs to be changed

Member Author

Addressed

github-actions bot commented Apr 2, 2025

📌 Cov diff with main:

Coverage-98%

📌 Overall coverage:

Coverage-56.96%

github-actions bot commented Apr 3, 2025

📌 Cov diff with main:

Coverage-98%

📌 Overall coverage:

Coverage-58.72%

@mrDzurb mrDzurb merged commit 0ada1cc into main Apr 3, 2025
22 of 23 checks passed
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
4 participants