mirror of
https://github.com/hibiken/asynq.git
synced 2026-04-25 23:15:51 +03:00
[GH-ISSUE #329] [Question] Are there TTL set for tasks? How long does it stay in Redis? #2168
Originally created by @seanyu4296 on GitHub (Sep 23, 2021).
Original GitHub issue: https://github.com/hibiken/asynq/issues/329
Originally assigned to: @hibiken on GitHub.
Asking because I cannot find this in the wiki and others might find it useful.
Specifically, how long do archived tasks stay in Redis so we can keep track of them or retry manually?
@hibiken commented on GitHub (Sep 23, 2021):
@seanyu4296 Thank you for the question! I should document this somewhere.
Currently these are defined here as constants:
github.com/hibiken/asynq@b3ef9e91a9/internal/rdb/rdb.go (L528-L531)
In other words, tasks archived 90 days ago (or older) will be deleted. If the archive size reaches 10,000, Asynq will start deleting the oldest tasks to keep the size at 10,000 or smaller (even if the oldest tasks are less than 90 days old).
Let me know if you have more questions on this :)
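The two retention rules described in this reply can be modelled in a few lines of Go. This is only a sketch of the stated policy, not Asynq's implementation (the actual cleanup runs inside Redis). The function `trimArchive` and its representation of tasks as ages in days are hypothetical, and the constants simply restate the values quoted above.

```go
package main

import "sort"

// Limits quoted in the reply above, restated here for illustration.
const (
	archiveMaxSize           = 10000 // maximum number of archived tasks kept
	archivedExpirationInDays = 90    // tasks archived this long ago (or longer) are deleted
)

// trimArchive is a hypothetical model of the retention policy. It takes
// the age in days of every archived task and returns the ages that
// survive, newest first.
func trimArchive(ageDays []int, maxSize, maxAgeDays int) []int {
	kept := make([]int, 0, len(ageDays))
	for _, age := range ageDays {
		// Rule 1: anything archived maxAgeDays ago or earlier is dropped.
		if age < maxAgeDays {
			kept = append(kept, age)
		}
	}
	// Ascending age means newest first, so the oldest sit at the tail.
	sort.Ints(kept)
	// Rule 2: if the archive is still too large, drop the oldest entries,
	// even though they are younger than maxAgeDays.
	if len(kept) > maxSize {
		kept = kept[:maxSize]
	}
	return kept
}
```

Calling `trimArchive(ages, archiveMaxSize, archivedExpirationInDays)` applies the defaults mentioned in the reply. Passing smaller limits shows how a shorter retention period would behave.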
@seanyu4296 commented on GitHub (Sep 24, 2021):
Thank you again so much for the swift and detailed response @hibiken
From the code as it stands, there is no way to override it, right? Correct me if I'm wrong.
@seanyu4296 commented on GitHub (Sep 24, 2021):
Another follow-up question: how often does the worker poll to get new tasks, mainly for "scheduled" tasks? I'm looking through the code now and I see there is a process that moves tasks from scheduled -> pending in forward.go. What's next after that?
Is the execution accurate down to the millisecond?
Let's say I enqueued a task at 00:00:00.000 and scheduled it to process after a minute:
asynq.enqueue(..., asynq.ProcessIn(time.Minute * 1))
Is there a guarantee that it will execute after 00:01:00.000? Or is there a possibility for it to run a bit early because of some polling and zscore comparison?
@hibiken commented on GitHub (Sep 30, 2021):
@seanyu4296 Sorry for the delayed response.
Yes, you're right. Currently, users cannot configure `archiveMaxSize` and `archivedExpirationInDays`. Let me know if you have a need to configure these.
As for question #1, after a task goes to `pending` it's FIFO from there, so all available workers will look at the pending list and start processing in FIFO fashion.
To answer question #2, no it's not. It's at a granularity of seconds. It'll always be no earlier than the time specified by `ProcessIn` or `ProcessAt`. It depends on how backed up the pending list is, and also on the cadence of the forwarder (currently every 5s).
These are all good questions, let me know if you have more!
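To make the forwarder cadence concrete, here is a small Go sketch of when a scheduled task would become pending. It rests on an assumption that is not confirmed in this thread: that the forwarder ticks at fixed 5s intervals counted from server start. The function `forwardedAt` is hypothetical and ignores any backlog in the pending list.

```go
package main

import "time"

// forwardInterval is the forwarder cadence quoted in the reply.
const forwardInterval = 5 * time.Second

// forwardedAt is a hypothetical model. Given the time a task is due
// (as an offset from server start), it returns the offset of the first
// forwarder tick at or after that time, assuming ticks fire at
// forwardInterval, 2*forwardInterval, and so on after server start.
func forwardedAt(due time.Duration) time.Duration {
	if due <= forwardInterval {
		return forwardInterval
	}
	// Round up to the next whole multiple of forwardInterval.
	ticks := (due + forwardInterval - 1) / forwardInterval
	return ticks * forwardInterval
}
```

Under this model, `forwardedAt(3 * time.Second)` returns 5s and `forwardedAt(61 * time.Second)` returns 65s. The task is never forwarded early, and the extra wait before it becomes pending stays below one forwarder interval.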
@seanyu4296 commented on GitHub (Sep 30, 2021):
So far, there is no need but it would be nice to have.
Thanks for answering my questions. Sorry, I also didn't get back here after I looked through the code. I figured it out by starting in rdb.go and going up (e.g. start with the function `dequeue` and check where it is used).
Anyway, I noticed that we now use RPOPLPUSH instead of BRPOPLPUSH. What's the reason behind that? Is it because Lua scripts are blocking by default?
On Task Execution Timing
So, the `forwarder` only polls and forwards every 5 seconds (code) and the `processor` "pulls" every second depending on the pending tasks, based on this code. Correct me if I am wrong, or if you have further thoughts.
Does that mean that when I enqueue a task at 00:00:02.000 to be processed after a second, and I started my asynq server at 00:00:00.000, the task will roughly get processed at 00:00:05.000?
Thanks so much too @hibiken for answering and sharing your knowledge! 🙏
@hibiken commented on GitHub (Oct 1, 2021):
Got it, yes we can make this configurable. I'll defer it for now, I don't want to add a feature not used by anyone :)
It's because we want to consult multiple queues in order (based on priority) for any pending task but don't want to get blocked by an empty queue. For example, let's say our server is consuming tasks from Queue X, Y, and Z, and based on the priority configuration we should consult them in X, Y, Z order (i.e. priority level X > Y > Z).
If Queues X and Y are empty but Queue Z has some pending tasks, we want to process those tasks. That's why we want the non-blocking `RPOPLPUSH` instead of the blocking variant. It'd be nice if Redis supported multiple sources for `BRPOPLPUSH`, but that's not the case right now. Related issue: https://github.com/redis/redis/issues/1785
This one-second sleep only happens when all of the queues are empty. To use the previous example, that is when Queues X, Y, and Z are all empty. Since we are basically polling Redis for any pending tasks, if we don't sleep there, it'll be slamming Redis in an infinite loop, adding unnecessary load to Redis (it'll also be bad since other Lua scripts can't execute, since all Redis commands/scripts are executed in a single-threaded fashion).
github.com/hibiken/asynq@b3ef9e91a9/processor.go (L174-L174)
The forwarder moves tasks from the `scheduled` or `retry` state to `pending` in a batch fashion every 5s. So if a task A is enqueued with `ProcessAt(T1)`, it'll only get forwarded to the `pending` state after time T1. But that should happen anywhere between T1 and (T1 + 5s). So 5s is the maximum delay in terms of when the task becomes pending. Once a task is pending, it's a matter of how much backlog the queue has in the pending set. Ideally the pending set is small so that as soon as tasks become pending, they get picked up by a worker (state becomes active), but if you have a large backlog in the pending set, then, since tasks are processed in FIFO order, it'll take some time before the task gets picked up by a worker. (Incidentally, we have a feature request to surface this queue "latency" in our tooling.)
If you haven't already, please take a look at this wiki: https://github.com/hibiken/asynq/wiki/Life-of-a-Task and suggest any modification or make edits if you find some information is missing there.
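The priority-ordered, non-blocking polling described in this comment can be sketched in Go with in-memory queues standing in for Redis lists. Everything here is illustrative: the `queue` type, `tryPop`, `dequeue`, and `run` are hypothetical names, the strict X, Y, Z order follows the example above, and the real processor talks to Redis rather than to slices.

```go
package main

import "time"

// queue is an in-memory stand-in for a Redis list of pending task IDs.
type queue struct {
	name  string
	tasks []string
}

// tryPop plays the role of the non-blocking pop: it returns at once,
// with found == false when the queue is empty.
func (q *queue) tryPop() (task string, found bool) {
	if len(q.tasks) == 0 {
		return "", false
	}
	task = q.tasks[0] // FIFO: oldest pending task first
	q.tasks = q.tasks[1:]
	return task, true
}

// dequeue consults the queues in priority order and returns the first
// pending task it finds. An empty high-priority queue never blocks the
// lookup in a lower-priority one.
func dequeue(queues []*queue) (task, from string, ok bool) {
	for _, q := range queues {
		if t, found := q.tryPop(); found {
			return t, q.name, true
		}
	}
	return "", "", false
}

// run polls until stop reports true. It sleeps for one second only when
// every queue is empty, so an idle worker does not hammer the store.
func run(queues []*queue, handle func(task, from string), stop func() bool) {
	for !stop() {
		task, from, ok := dequeue(queues)
		if !ok {
			time.Sleep(time.Second)
			continue
		}
		handle(task, from)
	}
}
```

With X and Y empty and Z holding tasks, `dequeue` returns a task from Z straight away. A blocking pop on X would instead have stalled the worker until X received a task. The cost of this design is the one-second sleep in `run`, which is where the idle-queue latency discussed in this thread comes from.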
@hibiken commented on GitHub (Oct 12, 2021):
Closing this. Let me know if you have any more questions!
@csdenboer commented on GitHub (Oct 12, 2021):
First of all, thanks a lot for your work on asynq! I was searching for the reason why there's a lot of latency between a task being enqueued and being received by a worker. Apparently, this is necessary to support multiple (prioritized) queues. However, I think there are also many users that are just using a single queue (like me) and for those I believe the performance penalty is huge. Is there a way to use BRPOPLPUSH in case of only a single queue?
@hibiken commented on GitHub (Oct 12, 2021):
@csdenboer thanks for the comment and good observation!
I vaguely remember that we used to use BRPOPLPUSH if the user only specified one queue in the Config. I'll look into this and see if we can conditionally run BRPOPLPUSH in the single-queue case.
Just to get more context, when are you noticing high latency? Is it when enqueuing with either the `ProcessIn` or `ProcessAt` option? Or without them?
Off the top of my head, the delay should be no more than one second unless the queue has a huge backlog of tasks.
@csdenboer commented on GitHub (Oct 13, 2021):
Thanks! Yes, we are experiencing high latency when enqueuing without the ProcessIn or ProcessAt option. The delay seems to be at most 1 second indeed, but that is very significant for our use case… In my experience when using BRPOPLPUSH the latency should be somewhere in the order of milliseconds. It would be very nice if this can be resolved!
@xuyang2 commented on GitHub (Jul 13, 2022):
@hibiken It seems `archiveCmd` only sets the expiry time for `KEYS[5]` and `KEYS[6]`, not the expiry time for `KEYS[1]`.
Is there any other cleanup routine for `KEYS[1] -> asynq:{<qname>}:t:<task_id>`?
github.com/hibiken/asynq@c70ff6a335/internal/rdb/rdb.go (L824-L867)
@roccoblues commented on GitHub (Jul 26, 2022):
I found this thread because we actually need it. 😏 The default retention period is too long for the German data-privacy laws and we need to expire tasks sooner.
Would you be open for a PR that makes the two settings configurable?
@hibiken commented on GitHub (Jul 27, 2022):
@roccoblues Sure, I'm a little behind on PR reviews but I'll get to it when I have time.
Please take a look at CONTRIBUTION.md before creating a PR 👍 Thanks!