Mirror of https://github.com/hibiken/asynq.git (synced 2026-04-26 15:35:55 +03:00)
[GH-ISSUE #717] [QUESTION] Could multiple worker instances cause task duplication in the queue due to the recovery mechanism? #1363
Originally created by @WSUFan on GitHub (Aug 6, 2023).
Original GitHub issue: https://github.com/hibiken/asynq/issues/717
Hi @hibiken. I have a follow-up question regarding running multiple instances of workers that serve the same queue. Now that we have a recovery mechanism, could several recovery processes pick up and re-enqueue the same lease-expired task multiple times (so the queue would contain multiple entries with the same task ID)?
@hibiken commented on GitHub (Aug 6, 2023):
@WSUFan thanks for pointing this out.
I believe you're right, this needs to be fixed.
Approaches I can think of:

1. Take a lock around the `listLeaseExpired` command (a lock with a TTL, to avoid an unfortunate situation)
2. Within `listLeaseExpired`, we could move these expired task ids to a "staging" area so that they are not in `asynq:{<qname>}:lease` anymore

Any feedback is appreciated.
@WSUFan commented on GitHub (Aug 6, 2023):
Yeah, maybe a Redis lock is needed when operating on these keys. I would choose approach 1, haha.
@yousifh commented on GitHub (Aug 7, 2023):
While there is a race condition, is there an actual risk of task duplication?

Take 2 workers W1 and W2 and an expired task T1, and consider this timeline of sequential events:

1. W1 calls `ListLeaseExpired` and gets T1
2. W2 calls `ListLeaseExpired` and gets T1
3. W1 runs `retry`. It removes T1 from `asynq:{<qname>}:active` and `asynq:{<qname>}:lease` and moves it to `asynq:{<qname>}:retry`
4. W2 runs `retry`, but it will fail because calling `LREM` on `asynq:{<qname>}:active` returns 0 elements.

Now of course both W1 and W2 having the same expired tasks is not ideal, and there is a race condition where, in the last step above, W2 might `LREM` task T1 after it has been re-enqueued again.

So I think your proposal of a Redis lock around `listLeaseExpired` is a good solution, so that each worker deals with a disjoint set of expired tasks.

@WSUFan commented on GitHub (Aug 7, 2023):
I believe another solution would be to have the client side manage this. We could execute the `listLeaseExpired()` function on the client side and retrieve only the tasks initiated by this client. In this scenario, there would be only one retry (one re-enqueue of the same task) from this client, assuming each task has a unique ID.
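The client-side idea above can be sketched in Go. Everything here is hypothetical (the `task` type, the `ClientID` field, and `recoverOwn` are illustrative, not asynq's API): each client recovers only the tasks it originally enqueued, and an idempotency check on the unique task ID — playing the same role as the `LREM`-returns-0 guard @yousifh described — prevents the same task from being re-enqueued twice.

```go
package main

import "fmt"

// task models only the fields this sketch needs: a unique ID and the
// ID of the client that enqueued it (hypothetical fields, not asynq's
// internal representation).
type task struct {
	ID       string
	ClientID string
}

// recoverOwn re-enqueues only the expired tasks that this client
// originally enqueued, skipping IDs already present in the retry set.
// This sketches the client-side approach, not asynq's actual recovery.
func recoverOwn(clientID string, expired []task, retrySet map[string]bool) []task {
	var recovered []task
	for _, t := range expired {
		if t.ClientID != clientID {
			continue // another client is responsible for this task
		}
		if retrySet[t.ID] {
			continue // already re-enqueued; the unique ID prevents duplication
		}
		retrySet[t.ID] = true
		recovered = append(recovered, t)
	}
	return recovered
}

func main() {
	expired := []task{
		{ID: "t1", ClientID: "c1"},
		{ID: "t2", ClientID: "c2"},
	}
	retry := map[string]bool{}
	fmt.Println(len(recoverOwn("c1", expired, retry))) // 1: c1 recovers only t1
	fmt.Println(len(recoverOwn("c1", expired, retry))) // 0: t1 is already in the retry set
}
```

Because each client's recovery pass touches a disjoint set of tasks, no lock between clients is needed for this scheme; the unique task ID alone makes the re-enqueue idempotent.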