ETA tasks lost on worker TERM restart #533
Comments
Do you have any steps to reproduce this? What does the worker log say at shutdown?
I have around 20 instances running as a cluster, consuming from a Redis server located on a separate instance. I use SaltStack to deploy to these 20 instances and restart the workers along with each deploy. Below is the celery log from one instance; the instances do not sync with each other, since the worker restarts happen at the same point in time.

The problem I found is that when we restart a worker during the execution of a task, all the scheduled tasks are lost from the queue. I use supervisor to daemonize the worker process.

For example: Test1 and Test2 are two consumer instances of a queue. I added two tasks, T1 (countdown=60) and T2 (countdown=86400). I then restarted the worker using SaltStack; at that point both tasks were on the Test2 worker:

    Warm shutdown (MainProcess)
    The pickle serializer is a security concern as it may give attackers
    If you depend on pickle then you should set a setting to disable this
    CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
    You must only enable the serializers that you will actually use.
    warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))

I restarted supervisor while task T1 (scheduled at 2015-10-27 11:26:19.881562+00:00) was running. The running task finished its execution, as I can see from my database (it performs some updates on my tables), but task T2 (scheduled to run at 2016-02-04 12:25:58.399643+00:00) is not found in the queue, and was not restored by either Test1 or Test2.
Start the worker with
Same here in 2019.
Having this issue with RabbitMQ as well.
celery and kombu versions?
Also having this issue with a task ETA that is scheduled to occur after a process restart.
@auvipy so just to check, is this solved in some version? Or is it an ongoing issue?
I don't know! We need someone to verify in production using the latest development branch, but most probably not.
I’m observing the same issue in production with a very simple setup: a single worker, a single task, Redis backend, unlimited auto-retry, acks_late. Versions: kombu==5.2.4. kombu said

in the logs, but the Redis DB is empty now.
I can reproduce this very easily in my local dev setup (a Vagrant box) and might be able to help track it down. In another shell I restart the worker service, and the task is gone.
Here is the output of the celery worker service at debug log level: systemctl-debug.txt
Here is the service file used to launch the worker: logcrm-event-bus.service. Maybe that’s where the problem lies.
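A warm shutdown depends on the worker actually receiving SIGTERM and being given enough time to wind down before systemd escalates to SIGKILL, so the unit file is worth checking. A unit along these lines is one way to rule that out (all paths and names below are placeholders, not the file attached in this comment):

```ini
[Unit]
Description=Celery worker (sketch -- placeholder paths and names)
After=network.target redis.service

[Service]
Type=simple
ExecStart=/opt/app/venv/bin/celery -A myapp worker --loglevel=INFO
# Warm shutdown: deliver SIGTERM and allow the worker time to finish
# running tasks before systemd falls back to SIGKILL.
KillSignal=SIGTERM
TimeoutStopSec=300
Restart=on-failure

[Install]
WantedBy=multi-user.target
```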
I’ve tried reproducing this with the same setup but using RabbitMQ, and the problem did not occur. Note that the service shuts down successfully in this case too (with the same service file).
I have clusters of workers listening to a Redis server. When I restart a worker (using SaltStack), the running tasks complete their execution since it is a TERM signal, but all the scheduled tasks (with an ETA set) are lost.