[GH-ISSUE #979] [BUG] asynq:servers and asynq:workers not cleaned after shutdown #1484

Closed
opened 2026-03-07 22:09:56 +03:00 by kerem · 6 comments

Originally created by @Skwol on GitHub (Dec 6, 2024).
Original GitHub issue: https://github.com/hibiken/asynq/issues/979

Originally assigned to: @hibiken, @kamikazechaser on GitHub.

Describe the bug
The redis asynq:servers and asynq:workers sets are not being cleaned up during server shutdown.

Environment (please complete the following information):

  • OS: macos (linux)
  • asynq package version v0.25.0
  • Redis 7

To Reproduce

  1. Run asynq locally with a redis in a docker container
  2. Stop the app and make sure that server.Shutdown() is called.
  3. Open redis and verify that the asynq:servers and asynq:workers entries were not removed.

Perform the same actions but add time.Sleep(2*time.Second) after server.Shutdown(), and you can see that the closed workers/servers are cleaned up.

Dockerfile:

FROM redis:7.0-alpine
CMD ["redis-server", "--appendonly", "yes"]

docker-compose.yaml:

version: "3.9"

services:
  redis:
    image: redis:7.0-alpine
    container_name: redis
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

volumes:
  redis_data:

main.go:

package main

import (
	"github.com/hibiken/asynq"
	"log"
	"os"
	"os/signal"
	"syscall"
)

const redisAddr = "localhost:6379"

func main() {
	server := asynq.NewServer(asynq.RedisClientOpt{Addr: redisAddr}, asynq.Config{
		Concurrency: 10,
	})

	mux := asynq.NewServeMux()

	go func() {
		if err := server.Run(mux); err != nil {
			log.Fatalf("Asynq Server failed: %v", err)
		}
	}()

	signalChan := make(chan os.Signal, 1)
	signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)

	<-signalChan
	log.Println("Shutting down server...")
	server.Stop()
	server.Shutdown()
	//time.Sleep(2 * time.Second) // give the server some time to shutdown gracefully
	log.Println("Server shut down gracefully")
}

Expected behavior
redis sets asynq:servers and asynq:workers should not contain old servers/workers

Additional context
In infrastructures where we might have a lot of workers, these sets become huge during active development. Not sure when this will affect performance or cause other trouble.

@Skwol commented on GitHub (Dec 6, 2024):

As I understand it, the issue lies somewhere in the locking of server.Shutdown(). In my code I call server.Run() and server.Shutdown(), but server.Run() already waits for signals and calls Shutdown() itself. When I remove server.Shutdown() from my client code, everything runs smoothly and the cleanup happens.

Ideally this case should be handled properly (with a lock), or shutdown should be removed from the public API.


@kamikazechaser commented on GitHub (Dec 7, 2024):

Server.Run internally registers SIGINT listeners. Could you use Server.Start instead, since you are registering your own listeners?

Also, could you try with v0.24.1 and see if it is reproducible?
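The Start-based pattern suggested here can be sketched as follows. This is a minimal illustration, not asynq's documented canonical example; it assumes asynq v0.25.x, elides handler registration, and uses the same illustrative Redis address as the report above:

```go
package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"

	"github.com/hibiken/asynq"
)

func main() {
	server := asynq.NewServer(
		asynq.RedisClientOpt{Addr: "localhost:6379"},
		asynq.Config{Concurrency: 10},
	)
	mux := asynq.NewServeMux()

	// Unlike Run, Start does not block and does not register its own
	// signal handlers, so it composes cleanly with the listener below.
	if err := server.Start(mux); err != nil {
		log.Fatalf("Asynq server failed to start: %v", err)
	}

	signalChan := make(chan os.Signal, 1)
	signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)
	<-signalChan

	log.Println("Shutting down server...")
	server.Stop()     // stop pulling new tasks
	server.Shutdown() // wait for in-flight tasks and clean up server state
	log.Println("Server shut down gracefully")
}
```

With this shape there is only one party handling signals and one party calling Shutdown, so the double-shutdown interaction from the original main.go cannot occur.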


@Skwol commented on GitHub (Dec 7, 2024):

Tried the same with v0.24.1 and the result is the same: it requires some delay/sleep before the cleanup happens.
You're 100% right, I need to use Start + Shutdown, or Run on its own. That works for me and I'll be fine, but it would be better to restrict or handle this case (a user calling Run + Shutdown) somehow.

Anyway, thank you for the response.


@kamikazechaser commented on GitHub (Dec 9, 2024):

I managed to reproduce it. It is in fact a bug. If the server state is never set with Stop, the entire shutdown procedure is skipped. On Unix it is never stopped unless SIGTSTP is received.

https://github.com/hibiken/asynq/blob/master/signals_unix.go#L25

The 2 seconds that you add allow Stop to actually set the correct state, which in turn allows the second explicit Shutdown to complete before the goroutine can return.

I'll push some code that you can try out.


@kamikazechaser commented on GitHub (Dec 9, 2024):

@Skwol Could you try:

go get github.com/hibiken/asynq@71c746d00af93fed64d0daafd8fb5e884f7cc243

@Skwol commented on GitHub (Dec 9, 2024):

I updated my code. The cleanup worked fine with this version. Thank you, hope to see it in the new release!
