DecodeError retrieving data from monitor api #40228

Open
rcampos87 opened this issue Mar 26, 2025 · 13 comments
Assignees
Labels
  • customer-reported: Issues that are reported by GitHub users external to the Azure organization.
  • Mgmt: This issue is related to a management-plane library.
  • Monitor: Monitor, Monitor Ingestion, Monitor Query
  • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team
  • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
  • Service Attention: Workflow: This issue is responsible by Azure service team.

Comments

@rcampos87

  • Package Name: azure-mgmt-monitor
  • Package Version: 6.0.2
  • Operating System: Debian bullseye
  • Python Version: 3.10

Describe the bug
In the past few days I have been seeing a lot of DecodeError exceptions from the SDK when trying to ingest monitor data from VMs.

As far as I know, it happens randomly when the request has a high cost.

JSON is invalid: Unterminated string starting at: line 1 column 23743 (char 23742)
Content: {"cost":968,"timespan":"2025-03-21T00:05:00Z/2025-03-21T04:08:14Z","interval":"PT1M","value":[{"id":"/subscriptions/5b7974bc-b401-4d16-98f8-90ba97ca232d/resourceGroups/USP-APP-RG/providers/Microsoft.Compute/virtualMachines/VR2VDHCP1/providers/Microsoft.Insights/metrics/Percentage CPU","type":"Microsoft.Insights/metrics","name":{"value":"Percentage CPU","localizedValue":"Percentage CPU"},"displayDescription":"The percentage of allocated compute units that are currently in use by the Virtual Machine(s)","unit":"Percent","timeseries":[{"metadatavalues":[],"data":[{"timeStamp":"2025-03-21T00:05:00Z","average":4.325},{"timeStamp":"2025-03-21T00:06:00Z","average":2.805},{"timeStamp":"2025-03-21T00:07:00Z","average":15.38},{"timeStamp":"2025-03-21T00:08:00Z","average":2.905},{"timeStamp":"2025-03-21T00:09:00Z","average":3.6},{"timeStamp":"2025-03-21T00:10:00Z","average":5.1},{"timeStamp":"2025-03-21T00:11:00Z","average":3.26},{"timeStamp":"2025-03-21T00:12:00Z","average":2.51},{"timeStamp":"2025-03-21T00:13:00Z","average":2.675},{"timeStamp":"2025-03-21T00:14:00Z","average":2.38},{"timeStamp":"2025-03-21T00:15:00Z","average":7.205},{"timeStamp":"2025-03-21T00:16:00Z","average":3.89},{"timeStamp":"2025-03-21T00:17:00Z","average":10.82},{"timeStamp":"2025-03-21T00:18:00Z","average":4.015},{"timeStamp":"2025-03-21T00:19:00Z","average":7.37},{"timeStamp":"2025-03-21T00:20:00Z","average":11.765},{"timeStamp":"2025-03-21T00:21:00Z","average":3.125},{"timeStamp":"2025-03-21T00:22:00Z","average":3.055},{"timeStamp":"2025-03-21T00:23:00Z","average":3.025},{"timeStamp":"2025-03-21T00:24:00Z","average":2.115},{"timeStamp":"2025-03-21T00:25:00Z","average":3.775},{"timeStamp":"2025-03-21T00:26:00Z","average":5.24},{"timeStamp":"2025-03-21T00:27:00Z","average":9.745},{"timeStamp":"2025-03-21T00:28:00Z","average":3.11},{"timeStamp":"2025-03-21T00:29:00Z","average":3.075},{"timeStamp":"2025-03-21T00:30:00Z","average":3.825},{"timeStamp":"2025-03

Other "JSON is invalid" errors include:

JSON is invalid: Expecting ':' delimiter: line 1 column 23750 (char 23749)
JSON is invalid: Expecting value: line 1 column 23749 (char 23748)
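
All three messages are what Python's standard json module reports when a document is cut off before its end; which one appears depends only on where the cut falls. The sketch below illustrates this with a made-up miniature payload (not the real service response). It assumes truncation is the cause, which nothing in this thread has confirmed.

```python
import json

# Hypothetical miniature payload; the real response is far larger.
FULL = '{"cost": 968, "value": [{"timeStamp": "2025-03-21T00:05:00Z", "average": 4.325}]}'


def decode_error_for(text):
    """Return the json module's error message for text, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.msg
    return None


# Cut the same document at three different places.
key_end = FULL.index('"average"') + len('"average"')
cut_in_string = FULL[:FULL.index("2025") + 4]  # ends inside a string value
cut_after_key = FULL[:key_end]                 # ends right after a key
cut_after_colon = FULL[:key_end] + ":"         # ends right after the colon

for label, text in [
    ("inside a string", cut_in_string),
    ("after a key", cut_after_key),
    ("after a colon", cut_after_colon),
]:
    print(f"{label}: {decode_error_for(text)}")
```

If the real failures have the same cause, the reported character offsets (23742 to 23749) would point to roughly where each response stopped.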

To Reproduce
It occurs randomly.

Expected behavior
Either return a proper response or none at all.

Additional context
This error was caught by New Relic. It happens at:

…usr/local/lib/python3.10/site-packages/azure/mgmt/monitor/v2021_05_01/operations/_metrics_operations.py, line 801, in list
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_base.py, line 240, in run
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_base.py, line 96, in send
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_base.py, line 96, in send
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_base.py, line 96, in send
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_base.py, line 96, in send
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_base.py, line 100, in send
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/_tools.py, line 49, in await_result
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/policies/_universal.py, line 714, in on_response
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/policies/_universal.py, line 681, in deserialize_from_http_generics
File /usr/local/lib/python3.10/site-packages/azure/core/pipeline/policies/_universal.py, line 611, in deserialize_from_text
@github-actions github-actions bot added customer-reported Issues that are reported by GitHub users external to the Azure organization. Mgmt This issue is related to a management-plane library. Monitor Monitor, Monitor Ingestion, Monitor Query labels Mar 26, 2025

Hello @rcampos87. I'm an AI assistant for the azure-sdk-for-python repository. I have some suggestions that you can try out while the team gets back to you.

• Ensure you’re using the latest version of the library in case this is a bug that’s already been addressed. See the azure-mgmt-monitor README for upgrade instructions.

• If the error occurs when cost is high (resulting in larger payloads), consider adding retry logic or inspecting network conditions—malformed or truncated JSON may indicate issues with the service response.

• Compare this behavior with similar issues in the repository (for example, issues #11271 and #29570) to see if there’s a pattern or if any workarounds have been suggested.

• If the issue persists, consider capturing complete request/response traces to help with debugging or file a new issue with detailed steps and logs.

The team will get back to you shortly. Hopefully this helps in the meantime.
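
The retry suggestion above could be sketched as follows. This is a minimal illustration, not an SDK feature: `call_with_retry`, its parameters, and `flaky_fetch` are hypothetical names, and the demo uses `json.JSONDecodeError` as a stand-in so the snippet runs without the Azure packages installed.

```python
import json
import time


def call_with_retry(fetch, retry_on, attempts=3, delay=1.0):
    """Call fetch() and retry when it raises one of the retry_on exceptions.

    fetch    -- zero-argument callable that performs the request
    retry_on -- exception class (or tuple of classes) worth retrying
    attempts -- total number of tries, including the first
    delay    -- base sleep in seconds, multiplied by the attempt number
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay * attempt)  # simple linear backoff


# Demo with a stand-in fetch that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise json.JSONDecodeError("Unterminated string starting at", "{", 0)
    return {"cost": 968}


result = call_with_retry(flaky_fetch, retry_on=json.JSONDecodeError, delay=0)
print(result, "after", calls["count"], "calls")
```

With the SDK, `fetch` would wrap the `client.metrics.list(...)` call and `retry_on` would be `DecodeError` from `azure.core.exceptions`. A retry only hides the symptom; it does not explain why the response was incomplete.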

@github-actions github-actions bot added needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Mar 26, 2025
@pvaneck pvaneck assigned msyyc and unassigned pvaneck Mar 26, 2025
@pvaneck pvaneck added the Service Attention Workflow: This issue is responsible by Azure service team. label Mar 26, 2025

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @gulopesd @Haiying-MSFT @jairmyree @joshfree @KarishmaGhiya @KevinBlasko @kurtzeborn @nisha-bhatia @pvaneck @scottaddie @srnagar @ToddKingMSFT.

@msyyc
Member

msyyc commented Mar 27, 2025

@ChenxiJiang333 Please help investigate this issue.

@ChenxiJiang333
Member

Got it.

@ChenxiJiang333
Member

Hi @rcampos87, it seems that some of the JSON data returned by the service doesn't conform to the expected schema.

Since the DecodeError happens often, could you share the part of the log where these errors occur, to help us locate the invalid JSON data?

You can save the SDK's full output to a .log file by adding the following code to your .py file:

import logging, os
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
                    filename=f"{os.path.splitext(os.path.basename(__file__))[0]}.log",
                    filemode="w")

Then set logging_enable=True when calling the operation:

client.operation_group.operation(..., logging_enable=True)

Please also remember to redact any sensitive information in your log before sharing it.
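
One way to redact a log before posting it is a small regex pass such as the sketch below. The `redact` helper and its two patterns are illustrative assumptions: they cover GUIDs (subscription, tenant, and request IDs) and Authorization header values in the log layout shown in this thread, and nothing else. Resource group names, VM names, and host details would need patterns of their own.

```python
import re

# Matches GUIDs such as subscription, tenant, and request IDs.
GUID = re.compile(r"\b[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\b")
# Matches the value of an 'Authorization': '...' header line.
AUTH = re.compile(r"('Authorization':\s*')[^']*(')")


def redact(text):
    """Replace Authorization header values and GUIDs with placeholders."""
    text = AUTH.sub(r"\1REDACTED\2", text)
    return GUID.sub("<guid>", text)


# Made-up sample in the same layout as the SDK's log output.
sample = (
    "Request URL: 'https://management.azure.com/subscriptions/"
    "00000000-1111-2222-3333-444444444444/resourceGroups/example-rg'\n"
    "    'Authorization': 'Bearer abc.def'"
)
cleaned = redact(sample)
print(cleaned)
```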

@rcampos87
Author

@ChenxiJiang333 just to confirm, is logging_enable=True passed to client.metrics.list? I enabled it, but I don't see any actual logs from Azure in the file.

@ChenxiJiang333
Member

> @ChenxiJiang333 just to confirm, is logging_enable=True passed to client.metrics.list? I enabled it, but I don't see any actual logs from Azure in the file.

Hi @rcampos87, yes. Even if logging_enable=True is not passed, you should still see log output, but with sanitized contents. Could you update the code as below and check whether the logs show up in the console?

import logging
import os

# Share one formatter between the file and console handlers.
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
# Write to "<script name>.log" in the current working directory.
file_handler = logging.FileHandler(f"{os.path.splitext(os.path.basename(__file__))[0]}.log")
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)
console_handler.setFormatter(formatter)
logging.basicConfig(level=logging.DEBUG, handlers=[file_handler, console_handler])
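
One caveat with this setup: `logging.basicConfig` does nothing when the root logger already has handlers (unless `force=True` is passed, available since Python 3.8), which can be the case when a framework or task runner configures logging before your code runs. A possible alternative, sketched below, is to attach a handler directly to the `azure` logger. The helper name `enable_azure_debug_logging` is hypothetical, and the `azure.demo` logger only simulates an SDK module for the demo.

```python
import io
import logging
import sys


def enable_azure_debug_logging(stream=None):
    """Attach a DEBUG-level stream handler directly to the "azure" logger."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(
        logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    )
    azure_logger = logging.getLogger("azure")
    azure_logger.setLevel(logging.DEBUG)  # let DEBUG records through
    azure_logger.addHandler(handler)
    return handler


# Demo: capture output in memory instead of writing to the console.
buffer = io.StringIO()
enable_azure_debug_logging(buffer)
# Child loggers such as "azure.demo" propagate to the "azure" logger.
logging.getLogger("azure.demo").debug("response body would appear here")
print(buffer.getvalue(), end="")
```

Because the handler sits on the `azure` logger itself, it receives SDK records regardless of how the root logger was configured.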

@rcampos87
Author

OK, I will update the code. It might be a few days before it errors again, so let's see. Thanks @ChenxiJiang333, I will get back to you.

@ChenxiJiang333
Member

> OK, I will update the code. It might be a few days before it errors again, so let's see. Thanks @ChenxiJiang333, I will get back to you.

I think you can quickly check whether logs are printed with these settings. There is no need to wait for the error to occur.

@rcampos87
Author

Yep. Here is an example of what I can see in the logs so far:

[2025-04-01 12:13:51,560: INFO/ForkPoolWorker-9] Request URL: 'https://login.microsoftonline.com/a3ecbd48-a972-4e71-bae3-1339ccdfdd34/oauth2/v2.0/token'
Request method: 'POST'
Request headers:
    'Accept': 'application/json'
    'x-client-sku': 'REDACTED'
    'x-client-ver': 'REDACTED'
    'x-client-os': 'REDACTED'
    'x-ms-lib-capability': 'REDACTED'
    'client-request-id': 'REDACTED'
    'x-client-current-telemetry': 'REDACTED'
    'x-client-last-telemetry': 'REDACTED'
    'User-Agent': 'azsdk-python-identity/1.13.0 Python/3.10.16 (Linux-6.8.0-1015-aws-x86_64-with-glibc2.31)'
A body is sent with the request
[2025-04-01 12:13:51,766: INFO/ForkPoolWorker-9] Response status: 200
Response headers:
    'Cache-Control': 'no-store, no-cache'
    'Pragma': 'no-cache'
    'Content-Type': 'application/json; charset=utf-8'
    'Expires': '-1'
    'Strict-Transport-Security': 'REDACTED'
    'X-Content-Type-Options': 'REDACTED'
    'P3P': 'REDACTED'
    'client-request-id': 'REDACTED'
    'x-ms-request-id': '1d62fb74-83c3-44ec-b45c-b38a019f0500'
    'x-ms-ests-server': 'REDACTED'
    'x-ms-clitelem': 'REDACTED'
    'x-ms-srs': 'REDACTED'
    'Content-Security-Policy-Report-Only': 'REDACTED'
    'X-XSS-Protection': 'REDACTED'
    'Set-Cookie': 'REDACTED'
    'Date': 'Tue, 01 Apr 2025 12:13:51 GMT'
    'Content-Length': '1397'
[2025-04-01 12:13:51,767: INFO/ForkPoolWorker-9] ClientSecretCredential.get_token succeeded
[2025-04-01 12:13:51,768: INFO/ForkPoolWorker-9] Request URL: 'https://management.azure.com/subscriptions/02d6e9ea-7639-442f-9f40-fea7a38050b4/resourceGroups/USZE1TOPSP-RG/providers/Microsoft.Compute/virtualMachines/usze1topsp01/providers/Microsoft.Insights/metrics?timespan=REDACTED&interval=REDACTED&metricnames=REDACTED&aggregation=REDACTED&orderby=REDACTED&api-version=REDACTED&metricnamespace=REDACTED'
Request method: 'GET'
Request headers:
    'Accept': 'application/json'
    'x-ms-client-request-id': 'c5c63000-0ef2-11f0-a51b-0242ac150004'
    'User-Agent': 'azsdk-python-azure-mgmt-monitor/6.0.2 Python/3.10.16 (Linux-6.8.0-1015-aws-x86_64-with-glibc2.31)'
    'Authorization': 'REDACTED'
No body was attached to the request
[2025-04-01 12:13:52,086: INFO/ForkPoolWorker-9] Response status: 200
Response headers:
    'Cache-Control': 'no-cache'
    'Pragma': 'no-cache'
    'Content-Length': '60175'
    'Content-Type': 'application/json; charset=utf-8'
    'Expires': '-1'
    'x-ms-ratelimit-remaining-subscription-resource-requests': '999'
    'x-ms-operation-identifier': 'REDACTED'
    'Request-Context': 'REDACTED'
    'x-ms-request-id': 'bb532d61-4049-4abb-be17-35fe69b1bfb8'
    'Strict-Transport-Security': 'REDACTED'
    'x-ms-correlation-request-id': 'REDACTED'
    'x-ms-routing-request-id': 'REDACTED'
    'X-Content-Type-Options': 'REDACTED'
    'X-Cache': 'REDACTED'
    'X-MSEdge-Ref': 'Ref A: 72D03C790DD44EC992DE4194121FD062 Ref B: MWH011020807023 Ref C: 2025-04-01T12:13:51Z'
    'Date': 'Tue, 01 Apr 2025 12:13:52 GMT'

But there are no other debug logs coming from the SDK. Is that expected? Will it show more on error?
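
Every entry in the excerpt above is at INFO level. As far as I can tell, the body logging that logging_enable=True turns on is emitted at DEBUG, so a logger whose effective level is INFO would drop it; treat that as an assumption to verify, not a confirmed diagnosis. A small check like the sketch below, run inside the worker process, shows what level and handlers actually apply. `describe_logger` and the `demo_sdk` logger names are hypothetical.

```python
import logging


def describe_logger(name):
    """Report the effective level and the handlers that apply to a logger."""
    logger = logging.getLogger(name)
    handlers = []
    current = logger
    # Walk up the hierarchy the same way log records propagate.
    while current is not None:
        handlers.extend(type(h).__name__ for h in current.handlers)
        if not current.propagate:
            break
        current = current.parent
    return {
        "effective_level": logging.getLevelName(logger.getEffectiveLevel()),
        "debug_enabled": logger.isEnabledFor(logging.DEBUG),
        "handlers": handlers,
    }


# Demo: a parent capped at INFO silently drops DEBUG records from its children.
parent = logging.getLogger("demo_sdk")
parent.setLevel(logging.INFO)
parent.addHandler(logging.NullHandler())

info = describe_logger("demo_sdk.child")
print(info)
```

Calling `describe_logger("azure")` in the real application would show whether DEBUG records from the SDK can reach any handler at all.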

@ChenxiJiang333
Member

ChenxiJiang333 commented Apr 2, 2025

Hi @rcampos87, I can see in the log that a request was made to get metrics, so the response body should be printed right after the response headers. Do you mean that you didn't find any response body in the log?

[Image]
Also, the request body seems to be sanitized, which is strange. Maybe try passing logging_enable=True to the client to see if that makes any difference:

monitor_client = MonitorManagementClient(credentials, SUBSCRIPTION_ID, logging_enable=True)

@rcampos87
Author

Hi @ChenxiJiang333, I added logging_enable=True to the client, but there was no difference.

@ChenxiJiang333
Member

> Hi @ChenxiJiang333, I added logging_enable=True to the client, but there was no difference.

Can you see the response content in the log? For example:

[Image]

4 participants