Changefeed leases got stuck, no exception was thrown #5025

Open
ShiminSu opened this issue Feb 19, 2025 · 1 comment
Labels
customer-reported Issue created by a customer needs-investigation

Comments

@ShiminSu

We are continuously addressing and improving the SDK; if possible, make sure the problem persists in the latest SDK version.

Describe the bug
Recently I've encountered three cases where different change feed leases were stuck with no exceptions, and rebuilding the code or even redeploying it with a new instanceName did not unstick them. For context, I found out they were stuck by using ChangeFeedEstimator.
Looking at the specific lease in the Cosmos DB leases container, I noticed that whenever I start the code the timestamp does update, and for a few brief seconds the Owner of the lease is set to the correct instance; very soon after, however, the Owner is set to null. It seems to me that the instance lost ownership of the lease without throwing any error?
The program runs as an App Service. To resolve the issue I tried restarting it and redeploying the same code; neither worked. However, deploying the code with even minor changes does resolve the issue, and the leases become unstuck.
I've also checked CPU, memory, and other metrics for the App Service; all look fine and seem to be unrelated.

To Reproduce
I have tried hard to reproduce this but so far no luck. We run multiple leases in the app, and which one gets stuck seems random to me.
The code for the change feed is as follows. For context, it just runs in an IHostedService, nothing homemade.

[Image: screenshot of the change feed code]

Expected behavior
I expected the change feed to pick up where it left off last time, using the same changeFeedProcessName with a new changeFeedInstanceName, and to pull one item at a time and process it through ProcessChanges.

Actual behavior
The ChangeFeedEstimator tells me that there are x events left to process, but the ProcessChanges method is not running.

Environment summary
SDK Version: 3.47.0
OS Version: Windows

Additional context
I've made a separate process that just monitors the active leases by fetching all leases with timestamp > today and Owner=null, since the timestamp does get updated and whenever that happens I expect Owner to be set. From there I use ChangeFeedEstimator and send myself a message if any estimate is greater than 5, since I expect the change feed to be empty most of the time.
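
For anyone who wants to replicate that check, here is a minimal sketch of the rule in Python. It is for illustration only; the actual service is .NET. It assumes the lease documents have already been fetched as dictionaries with `id`, `Owner` and an ISO 8601 `timestamp`, and that the estimator output has been collected into a hypothetical `estimates` mapping keyed by lease id. The function name and the data shapes are assumptions, not part of the SDK.

```python
from datetime import datetime, timezone

ALERT_THRESHOLD = 5  # from the description: alert when the estimate is greater than 5


def parse_ts(value):
    """Parse an ISO 8601 timestamp; values without an offset are assumed to be UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def stuck_leases(leases, estimates, now=None):
    """Return (lease id, backlog) pairs for leases that look stuck.

    A lease is flagged when its timestamp was updated today, its Owner is
    null, and its estimated backlog exceeds ALERT_THRESHOLD.
    `leases` is a list of dicts and `estimates` maps lease id to pending
    changes; both shapes are assumptions made for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    flagged = []
    for lease in leases:
        touched_today = parse_ts(lease["timestamp"]) > start_of_today
        unowned = lease.get("Owner") is None
        backlog = estimates.get(lease["id"], 0)
        if touched_today and unowned and backlog > ALERT_THRESHOLD:
            flagged.append((lease["id"], backlog))
    return flagged
```

Passing `now` explicitly keeps the check deterministic when testing; in the real monitor it would default to the current UTC time.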

@ShiminSu
Author

Just want to clarify: there's nothing wrong with the execution of the elements in the queue; they all processed fine after some redeploy attempts. I'm going to put some lifecycle handlers around lease handling to see if I can get more details.
