[GH-ISSUE #246] [FEATURE REQUEST] Tooling Improvements for v1 release #2116

Open
opened 2026-03-15 19:14:02 +03:00 by kerem · 9 comments
Owner

Originally created by @hibiken on GitHub (Feb 15, 2021).
Original GitHub issue: https://github.com/hibiken/asynq/issues/246

Originally assigned to: @hibiken on GitHub.

I'd like to see a couple more features added to the Web UI and CLI, as well as the `inspeq` package, to enhance the developer experience.

Features:

  • [x] Find task by ID
  • [ ] Support filter (e.g. filter by task type, payload, etc.)
  • [x] Show latency of a queue (i.e. duration the oldest task has been pending in the queue)
  • [ ] Show the breakdown of task types and their respective count in each queue

@ajatprabha commented on GitHub (Sep 16, 2021):

Hi @hibiken, great project you got here! 🚀

I'm also looking to move from gocraft/work to asynq, as the former has not been updated since 2018.

For alerts I'm interested in

  • Latency of a queue
  • Lock count of a queue (if such thing is present in asynq)

Any updates on this issue? If you have bandwidth issues, let me know, I'd be happy to help with contributions.


@hibiken commented on GitHub (Sep 16, 2021):

@ajatprabha Thank you for the comment.

I'm currently working on a few things and I may be able to squeeze in the latency work in the next release.

I'm not sure what you meant by "Lock count of a queue". Could you explain in more detail?


@ajatprabha commented on GitHub (Sep 16, 2021):

I have not actually gone into the asynq codebase, but gocraft/work maintains a [lock count](https://github.com/gocraft/work/blob/5959e69ad211c5ca37ffdf3ede02e35a5ae41d98/priority_sampler.go#L19) per jobType (also per queue, since there seems to be one queue per jobType). I believe this is not very relevant to asynq.

Additional context: here's the job fetch [script](https://github.com/gocraft/work/blob/5959e69ad211c5ca37ffdf3ede02e35a5ae41d98/redis.go#L107-L154) of gocraft/work.

But latency is something I think is worth exposing.


@huttotw commented on GitHub (Oct 17, 2021):

I have just started using Asynq in production and I am also looking for some out-of-the-box telemetry.

@hibiken what are your thoughts of exposing all of these things natively via the Prometheus format?

Prometheus also offers a client API that the asynqmon UI could use to consume all such stats in a nice way.

The big benefits to this method are:

  1. A single source of truth for stats (Prometheus)
  2. Users can use downstream tools (Grafana, Alertmanager, etc) to integrate their systems.
  3. Prometheus has great guidance and lots of standards on how metrics should be reported. Here is a [list of exporters for common projects](https://prometheus.io/docs/instrumenting/exporters/) that we could model after.

Major hurdle:

  1. asynqmon would optionally depend on Prometheus 😬 . I would love to learn more about your roadmap to know if this is a deal breaker or not.

If the project were to fully embrace Prometheus, in my mind asynqmon would be in charge of administrative tasks (CRUD operations) of tasks, queues, etc. while Prometheus would be in charge of statistic reporting. You could optionally pass a --prometheus-addr to asynqmon and asynqmon would start to show statistics and graphs automatically.

For what it is worth, [NSQ's admin tool](https://nsq.io/components/nsqadmin.html) operates under this model.


@hibiken commented on GitHub (Oct 18, 2021):

@huttotw This is great feedback, thank you!
I agree that we can leverage Prometheus to store time-series data. Currently the asynqmon tool can only show a snapshot of the current state, but it'd be very helpful to see the progression over time.

> If the project were to fully embrace Prometheus, in my mind asynqmon would be in charge of administrative tasks (CRUD operations) of tasks, queues, etc. while Prometheus would be in charge of statistic reporting. You could optionally pass a `--prometheus-addr` to `asynqmon` and `asynqmon` would start to show statistics and graphs automatically.

I'm onboard with the idea at a high level 👍

Let's take a step back and list all the metrics we want to collect:
I think metrics of interest can be categorized in two groups:

  1. Metrics per server process level (i.e. Processes running the Handler)
  2. Metrics at global level

Example metrics in category 1 include:

  • Number of tasks processed (label with queue, task-type, etc)
  • Number of tasks failed (label with queue, task-type, etc)
  • Memory/CPU usage of the process, etc

Example metrics in category 2 include:

  • Queue sizes
  • Breakdown of each queue by state (i.e. active, pending, scheduled, retry, archived)
  • Queue latency, etc

Please let me know any specific metrics that you are interested in collecting.

Metrics in category 1 can be exported to Prometheus by instrumentation, like the one described [here](https://github.com/hibiken/asynq/wiki/Monitoring-and-Alerting).
As for the metrics in category 2, I think we can write an exporter to export the data to Prometheus.

Let me know what metrics you are interested in. Also, it'd be helpful if you can provide a diagram to describe how each component works together :)


@ajatprabha commented on GitHub (Oct 18, 2021):

I like the idea of having a way to export time-series data for monitoring. However, it'd be super awesome if the API is designed in a way where the collection of metrics can be decoupled. What I mean is: let's have Prometheus as the default option, but allow users to swap it out if needed, say with InfluxDB. You may look at OpenTelemetry to see if their API design would fit here.


@huttotw commented on GitHub (Oct 18, 2021):

@hibiken 👍

I agree with your assessment that most metrics fit into 1 of 2 categories. For category 1, perhaps we could collect metrics by default (or provide a standard middleware), and provide an http.HandlerFunc that a user can simply expose at /metrics if they want to scrape using Prometheus.

We would definitely want the metric names to be standardized if we are planning to use them ourselves in asynqmon.

For category 2, I have a couple more questions (more so roadmap questions):

  1. What is the reason that the client writes directly to Redis? Would it make more sense in this case for Enqueue to write to the asynq server, then have the server write to Redis? My thought is that this would give us an intercept point to increment the queue size counters instead of polling Redis every 2 seconds (as in the Inspector example). I understand this would be a major change.
  2. For memory / CPU, it may be possible to move that into category 1 with a package like this: https://github.com/c9s/goprocinfo, and we may not need an exporter at all (going against my original comment 😄 ).

At the end of the day, as a user of this product, I think it would be great if all I had to do was:

```go
http.HandleFunc("/metrics", myAsynqServer.Metrics)
```

Once we make the details a bit more firm, I'm happy to work on a diagram and contribute to the solution.


@hibiken commented on GitHub (Oct 20, 2021):

Thinking about this a bit more, I think there are three separate services we want to monitor (a correction to my previous comment about two categories, although those roughly match two of the three services I'm thinking of).

  1. Worker jobs (i.e. processes running asynq.Handler)
  2. Client jobs (i.e. processes enqueueing tasks to Redis)
  3. Redis (for our purposes, queues and tasks)

I think for 1 and 2 we can let the user of the package instrument the code with the Prometheus client libraries and export the metrics data over HTTP for the Prometheus server to scrape and collect.

For 3, we need an exporter since we cannot instrument Redis itself; we can run a simple binary which inspects asynq queues and tasks and exposes those metrics to the Prometheus server.

The diagram below describes what I'm currently thinking about how this would work:

  • The fat arrows show task data being sent from client jobs to redis, and then from redis to workers
  • The black solid arrows show how metrics data are pulled by Prometheus server
  • The dotted lines show how the Asynqmon tool performs admin operations against queues and tasks, and how it reads time-series data stored in Prometheus.

![Asynq monitoring drawio](https://user-images.githubusercontent.com/10953044/138023458-4720f6e1-2b07-45b3-b03f-862b94b37003.png)

The keyword I took away from @huttotw's initial comment is out-of-the-box telemetry. I think Asynq can offer some of these components so that users can use them out of the box. Let me know if anyone has feedback or further thoughts on this!


> What is the reason that the client writes directly to Redis? Would it make more sense in this case for Enqueue to write to the asynq server, then have the server write to Redis? My thought is that this would give us an intercept point to increment the queue size counters instead of polling Redis every 2 seconds (as in the Inspector example). I understand this would be a major change.

I don't think we'll go down that route. Each asynq server is only responsible for handling tasks, and queues don't belong to the server itself (multiple servers can consume tasks from the same queue, for instance).

> Prometheus also offers a client API that the asynqmon UI could use to consume all such stats in a nice way.

@huttotw could you point me to the API you are referring to?


@huttotw commented on GitHub (Oct 20, 2021):

@hibiken nice diagram!

Okay, I think this makes a lot of sense. The barrier to writing a single exporter is much smaller than what I was suggesting 😄 .

Here is a [Go client](https://github.com/prometheus/client_golang) for using the Prometheus HTTP API, and here is the [HTTP documentation](https://prometheus.io/docs/prometheus/latest/querying/api/). I believe it is what the Prometheus UI uses directly.
