[GH-ISSUE #186] Too often error "did not receive last pong from dealer" #120
Originally created by @neeohw on GitHub (May 7, 2025).
Original GitHub issue: https://github.com/devgianlu/go-librespot/issues/186
I've been wanting to move from librespot-java to go-librespot, but I have to restart the service almost twice a day due to the repeated error "did not receive last pong from dealer".
I just wanted to let you know that for me this library is not usable yet.
I do appreciate your efforts in rewriting it!
I am running it on an Orange Pi 3 LTS.
@2opremio commented on GitHub (May 26, 2025):
@devgianlu I am having exactly the same problem. I moved to go-librespot from spotifyd but I am now considering switching back because of this :(
It doesn't happen twice a day but at least once every few days.
@2opremio commented on GitHub (May 26, 2025):
Here are some logs (but they extend back for hours):
@2opremio commented on GitHub (May 26, 2025):
It would already be an improvement if the program crashed (to force a restart) when it gets stuck in this state, since restarting seems to fix it.
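(As an illustration of that suggestion: a minimal watchdog sketch, assuming a single command-channel event loop. All names are hypothetical, not go-librespot's actual structure.)

```go
package main

import "time"

// Hypothetical watchdog, not go-librespot code: periodically enqueue a
// no-op command into the single event loop and crash the process if it
// is not executed in time, so a supervisor (systemd, docker) restarts us.
func watchdog(cmds chan<- func(), interval, timeout time.Duration) {
	for range time.Tick(interval) {
		pong := make(chan struct{})
		select {
		case cmds <- func() { close(pong) }:
		case <-time.After(timeout):
			panic("watchdog: event loop not accepting commands")
		}
		select {
		case <-pong:
		case <-time.After(timeout):
			panic("watchdog: event loop accepted but never ran the ping")
		}
	}
}

func main() {
	cmds := make(chan func(), 16)
	go watchdog(cmds, 30*time.Second, 10*time.Second)
	for cmd := range cmds { // stand-in for the real manage loop
		cmd()
	}
}
```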
@2opremio commented on GitHub (May 26, 2025):
This is how the last vicious error cycle started:
@devgianlu commented on GitHub (May 31, 2025):
Thank you for reporting. This doesn't really happen for me, but it is probably related to network instability combined with other factors.
Can you provide a dump of the goroutine state? It's likely that something is stuck somewhere. You can do it by sending a SIGABRT to go-librespot:

kill -s ABRT $(pidof go-librespot)
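(For reference, a Go process can also dump its own goroutines without dying, using only the standard library; this is a generic sketch, not part of go-librespot.)

```go
package main

import (
	"os"
	"os/signal"
	"runtime/pprof"
	"syscall"
)

// Dump all goroutine stacks to stderr on SIGUSR1 instead of
// dying on SIGABRT; debug=2 prints full panic-style stacks.
func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR1)
	go func() {
		for range sigs {
			pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
		}
	}()
	select {} // stand-in for the real program's work
}
```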
@2opremio commented on GitHub (May 31, 2025):
Sure thing. I will do it the next time it happens. My WAN connection is pretty stable but I am running go-librespot on a Raspberry Pi 3 through WiFi, which probably isn't.
@devgianlu commented on GitHub (Jun 3, 2025):
Just happened to me, I believe.
@2opremio commented on GitHub (Jun 5, 2025):
I have thrown the dump into https://github.com/openziti/goroutine-analyzer and it seems like the player is stuck trying to acquire a mutex at `(*EventSender).Enqueue()`.
@2opremio commented on GitHub (Jun 5, 2025):
Unfortunately I don't have the time to dive into the code and try to solve it right now.
@devgianlu commented on GitHub (Jun 6, 2025):
Yeah that was a problem, but that code isn't open source so it's impossible that it is causing problems for you. Need another dump.
@2opremio commented on GitHub (Jun 6, 2025):
I bet that we are having a deadlock with the same mutex though ... I will post a goroutine dump next time it happens.
@devgianlu commented on GitHub (Jun 13, 2025):
New sample
@devgianlu commented on GitHub (Jun 14, 2025):
In the above everything is locked up because:

1. `snd_pcm_writei` is called while holding `alsaOutput.lock`
2. `alsaOutput.DelayMs` wants to hold the lock because `Player.manageLoop` received a `playerCmdPosition` message
3. `Player.PositionMs` is waiting for a response because `skipNext` was called
4. `Dealer.handleRequest` is waiting for a response on the skip next command

This seems like an unusual scenario, but the overall problem is clear: if anything hangs, the receive loops are directly affected. However, why is `snd_pcm_writei` stuck? Simply putting a timeout on 4 would hide the real problem.
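(The shape of that deadlock reduces to a small self-contained sketch; the names are hypothetical stand-ins, not the actual go-librespot code. Run it and the Go runtime itself reports the deadlock.)

```go
package main

import (
	"sync"
	"time"
)

// Hypothetical reduction of the deadlock above, not actual go-librespot code.
type alsaOut struct{ mu sync.Mutex }

// write stands in for a snd_pcm_writei call that never returns
// while holding the output lock.
func (o *alsaOut) write() {
	o.mu.Lock()
	select {} // stuck in the driver, lock never released
}

// delayMs stands in for alsaOutput.DelayMs, which needs the same lock.
func (o *alsaOut) delayMs() int {
	o.mu.Lock()
	defer o.mu.Unlock()
	return 0
}

func main() {
	o := &alsaOut{}
	cmds := make(chan func())
	go func() { // the single manage loop: every request funnels through here
		for cmd := range cmds {
			cmd()
		}
	}()
	go o.write()
	time.Sleep(100 * time.Millisecond) // let write acquire the lock

	pos := make(chan int)
	cmds <- func() { pos <- o.delayMs() } // a playerCmdPosition-style request
	<-pos                                 // never delivered; the runtime prints
	// "fatal error: all goroutines are asleep - deadlock!" here
}
```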
@2opremio commented on GitHub (Jul 4, 2025):
I just got another instance:
Logs:
Goroutine dump: https://pastebin.com/djA5HGvq
(It didn't fit as a comment)
@devgianlu commented on GitHub (Jul 12, 2025):
@2opremio I've pushed `f85a223ead` to hopefully fix your particular issue.
@2opremio commented on GitHub (Jul 12, 2025):
That’s great! I will test it right away.
Have you thought about a general solution for the locking problem?
@devgianlu commented on GitHub (Jul 13, 2025):
There is no general solution I am afraid. The locking is due to the design of having one "thread" process all inputs and requests. I could put a timeout on how long we wait for something to be completed, but then something else would be frozen up and much more difficult to detect.
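(A sketch of the trade-off being described, with hypothetical names: a timeout turns the hang into an error for one caller, but the loop behind it may still be frozen, and the failure becomes quiet instead of visible.)

```go
package player

import (
	"errors"
	"time"
)

// currentPositionMs stands in for whatever the real loop would compute;
// all names here are hypothetical, not the actual go-librespot API.
func currentPositionMs() int64 { return 0 }

// positionMsWithTimeout: a timeout lets the caller give up, but the
// event loop behind cmds may still be deadlocked underneath.
func positionMsWithTimeout(cmds chan<- func(), timeout time.Duration) (int64, error) {
	resp := make(chan int64, 1) // buffered so a late reply never blocks the loop
	select {
	case cmds <- func() { resp <- currentPositionMs() }:
	case <-time.After(timeout):
		return 0, errors.New("player loop is not accepting commands")
	}
	select {
	case p := <-resp:
		return p, nil
	case <-time.After(timeout):
		return 0, errors.New("player loop never answered")
	}
}
```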
@2opremio commented on GitHub (Jul 13, 2025):
Fair enough!
@tooxo commented on GitHub (Jul 28, 2025):
One potential problem here is that
ChunkedReader.downloadChunkretries failed requests indefinitely, which could block the event loop forever, since the player downloads the first chunk synchronously in the event loop while loading a new song.I'm currently trying to prove this, but reproducing the issue is kind of a pain :/
Error logging inside the downloadChunk method would have gone a long way, as all error messages here are discarded anyway.
Edit:
This also renders these "error handlers" useless, as they will never be reached.
https://github.com/devgianlu/go-librespot/blob/master/audio/chunked_reader.go#L80
https://github.com/devgianlu/go-librespot/blob/master/audio/chunked_reader.go#L175
I understand the intention behind infinite retries, but if the error is caused by a bad request or something comparable, this gets the whole program stuck.
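(For illustration, a bounded retry that logs and surfaces the last error instead of looping forever; a generic sketch, not the project's actual code, with a hypothetical retry limit.)

```go
package audio

import (
	"fmt"
	"time"
)

const maxChunkRetries = 3 // hypothetical cap, mirroring the idea of bounding retries

// downloadWithRetry tries fetch up to maxChunkRetries times with a small
// backoff, logging each failure and returning the last error instead of
// retrying forever and silently blocking the caller.
func downloadWithRetry(fetch func() ([]byte, error)) ([]byte, error) {
	var lastErr error
	for attempt := 1; attempt <= maxChunkRetries; attempt++ {
		data, err := fetch()
		if err == nil {
			return data, nil
		}
		lastErr = err
		fmt.Printf("chunk download failed (attempt %d/%d): %v\n", attempt, maxChunkRetries, err)
		time.Sleep(time.Duration(attempt) * 500 * time.Millisecond)
	}
	return nil, fmt.Errorf("chunk download failed after %d attempts: %w", maxChunkRetries, lastErr)
}
```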
@devgianlu commented on GitHub (Aug 1, 2025):
@tooxo I think I had this happen just today:
Full goroutine dump
The "main loop" is stuck here:
The fact that the retries go on indefinitely is definitely wrong. I have put a limit of 3 retries in `github.com/devgianlu/go-librespot@05c1cc8c7b`.
@tooxo commented on GitHub (Aug 2, 2025):
Also: I did not have the time to research this thoroughly, but from what I saw, the `http.Client` used in the `fetchChunk` method does not have any timeout set, which could also lead to very long execution times if there is packet loss etc.
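(The zero-value `http.Client` indeed never times out on its own. A generic sketch of the two standard ways to bound a request, with a hypothetical fetch helper, not the project's code:)

```go
package audio

import (
	"context"
	"io"
	"net/http"
	"time"
)

// A client with an overall per-request timeout; the zero-value
// http.Client (and http.DefaultClient) waits forever by default.
var chunkClient = &http.Client{Timeout: 30 * time.Second}

// fetchChunkBytes shows the per-request alternative: a context deadline,
// which also covers reading the response body since we read it here,
// before cancel runs.
func fetchChunkBytes(url string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := chunkClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}
```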
@JaragonCR commented on GitHub (Aug 13, 2025):
I'm trying to get iotsound (formerly balenasound) to use this. I seem to be able to reproduce this on demand, to the point that I can only play one song.
Swapping songs, disconnecting from the speaker, etc. would cause this and the player hangs.
@devgianlu commented on GitHub (Aug 13, 2025):
@JaragonCR Would you be able to send a SIGABRT to the daemon when it gets stuck to get the goroutine dump? You can do it from the terminal with kill -s ABRT $(pidof go-librespot), depending on the name of the binary.
@JaragonCR commented on GitHub (Aug 13, 2025):
@devgianlu the answer is yes, but since it is a container monitoring the process, this restarts it. Let me rebuild it with a shared directory to dump the core. Do you know off the top of your head how to change the core dump location? I will probably get to this later today or this week.
@devgianlu commented on GitHub (Aug 13, 2025):
The goroutine dump is written to stderr (similarly to the ones above in the thread).
@JaragonCR commented on GitHub (Aug 13, 2025):
Definitely does not happen in this environment, need to see why...
@devgianlu commented on GitHub (Aug 13, 2025):
It is dumped by the `daemon` process; most likely you'll see it in the docker logs before it restarts.
@JaragonCR commented on GitHub (Aug 13, 2025):
@devgianlu commented on GitHub (Aug 14, 2025):
@JaragonCR I am unable to reproduce the issue. Could you try with the alsa backend?
@JaragonCR commented on GitHub (Aug 14, 2025):
@devgianlu not with how iotsound is designed. I could try to add the alsa bridge to the container to emulate alsa, but it would take me a while.
@JaragonCR commented on GitHub (Aug 14, 2025):
@devgianlu cannot reproduce with ALSA, works perfectly, you are onto something.
Here's my Docker build, needs cleanup, don't judge, it works :P
and here's the config it started with
@JaragonCR commented on GitHub (Aug 20, 2025):
@devgianlu did you get a chance to go over the pulse implementation?
@devgianlu commented on GitHub (Aug 20, 2025):
Please open another issue so that we can discuss this further.
@devgianlu commented on GitHub (Aug 20, 2025):
I am going to close this issue because it has become a bit of a catch-all.
If you experience this problem again, please open a new issue providing a goroutine dump (see instructions above).
@neeohw commented on GitHub (Sep 12, 2025):
Just want to say that with all the latest work I haven't had this issue anymore. Thank you!