[GH-ISSUE #134] Crash when network is disconnected #83

Open
opened 2026-02-28 14:25:12 +03:00 by kerem · 2 comments
Owner

Originally created by @aykevl on GitHub (Nov 7, 2024).
Original GitHub issue: https://github.com/devgianlu/go-librespot/issues/134

I got this crash, I think when the network connection got disconnected:

ERRO[755131] did not receive last pong from dealer, 60s passed 
ERRO[755136] failed receiving dealer message               error="failed to get reader: WebSocket closed: sent close frame: status = StatusServiceRestart and reason = \"\""
ERRO[755161] did not receive last pong from dealer, 90s passed 
ERRO[755161] did not receive last pong ack from accesspoint, 173s passed 
ERRO[755161] failed receiving packet                       error="failed reading packet header: read tcp 10.65.42.112:51278->34.158.1.133:443: use of closed network connection"
WARN[755181] failed to connect to AP ap-gew4.spotify.com:80, retrying with a different AP  error="dial tcp: lookup ap-gew4.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:60895->10.65.42.1:53: i/o timeout"
ERRO[755191] did not receive last pong from dealer, 120s passed 
WARN[755201] failed to connect to AP ap-guc3.spotify.com:4070, retrying with a different AP  error="dial tcp: lookup ap-guc3.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:54788->10.65.42.1:53: i/o timeout"
ERRO[755221] did not receive last pong from dealer, 150s passed 
WARN[755221] failed to connect to AP ap-gew1.spotify.com:443, retrying with a different AP  error="dial tcp: lookup ap-gew1.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:56707->10.65.42.1:53: i/o timeout"
WARN[755241] failed to connect to AP ap-gae2.spotify.com:80, retrying with a different AP  error="dial tcp: lookup ap-gae2.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:46551->10.65.42.1:53: i/o timeout"
ERRO[755251] did not receive last pong from dealer, 180s passed 
WARN[755256] failed fetching new endpoint for accesspoint  error="failed fetching apresolve URL: Get \"https://apresolve.spotify.com/?type=accesspoint\": dial tcp: lookup apresolve.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:46939->10.65.42.1:53: i/o timeout"
WARN[755271] failed to connect to AP ap-gew4.spotify.com:4070, retrying with a different AP  error="dial tcp: lookup ap-gew4.spotify.com: i/o timeout"
ERRO[755281] did not receive last pong ack from accesspoint, 293s passed 
ERRO[755281] did not receive last pong from dealer, 210s passed 
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xc87ed8]

goroutine 15 [running]:
github.com/devgianlu/go-librespot/ap.(*Accesspoint).pongAckTicker(0x4000471880)
        /home/ayke/src/spotify/go-librespot/ap/ap.go:347 +0x1e8
created by github.com/devgianlu/go-librespot/ap.(*Accesspoint).startReceiving.func1 in goroutine 1
        /home/ayke/src/spotify/go-librespot/ap/ap.go:250 +0x10c

And this crash, probably the same bug, when the network definitely got disconnected (I think there was some maintenance on the router).

ERRO[937664] failed receiving packet                       error="failed reading packet header: read tcp 10.65.42.112:34414->104.199.65.9:443: read: connection reset by peer"                                                                                                            
INFO[937665] authenticated AP as 0bzabjufeuoh2wmg0n912ftgw                                                                                                                                                                                                                                
ERRO[938626] failed receiving packet                       error="failed reading packet header: read tcp 10.65.42.112:43240->104.199.241.202:80: read: connection reset by peer"                                                                                                          
INFO[938626] authenticated AP as 0bzabjufeuoh2wmg0n912ftgw                                                                                                                                                                                                                                
ERRO[952286] failed receiving packet                       error="failed reading packet header: EOF"                                                                                                                                                                                      
INFO[952287] authenticated AP as 0bzabjufeuoh2wmg0n912ftgw                                                                                                                                                                                                                                
ERRO[958771] did not receive last pong from dealer, 60s passed                                                                                                                                                                                                                            
ERRO[958776] failed receiving dealer message               error="failed to get reader: WebSocket closed: sent close frame: status = StatusServiceRestart and reason = \"\""                                                                                                              
ERRO[958801] did not receive last pong ack from accesspoint, 154s passed                                                                                                                                                                                                                  
ERRO[958801] did not receive last pong from dealer, 90s passed                                                                                                                                                                                                                            
ERRO[958801] failed receiving packet                       error="failed reading packet header: read tcp 10.65.42.112:39610->34.158.1.133:443: use of closed network connection"                                                                                                          
WARN[958821] failed to connect to AP ap-gew4.spotify.com:80, retrying with a different AP  error="dial tcp: lookup ap-gew4.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:53441->10.65.42.1:53: i/o timeout"                                                                         
ERRO[958831] did not receive last pong from dealer, 120s passed                                                                                                                                                                                                                           
WARN[958841] failed to connect to AP ap-guc3.spotify.com:4070, retrying with a different AP  error="dial tcp: lookup ap-guc3.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:50588->10.65.42.1:53: i/o timeout"                                                                       
ERRO[958861] did not receive last pong from dealer, 150s passed                                                                                                                                                                                                                           
WARN[958861] failed to connect to AP ap-gue1.spotify.com:443, retrying with a different AP  error="dial tcp: lookup ap-gue1.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:47574->10.65.42.1:53: i/o timeout"                                                                        
WARN[958881] failed to connect to AP ap-gae2.spotify.com:80, retrying with a different AP  error="dial tcp: lookup ap-gae2.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:40195->10.65.42.1:53: i/o timeout"                                                                         
ERRO[958891] did not receive last pong from dealer, 180s passed                                                                                                                                                                                                                           
WARN[958901] failed fetching new endpoint for accesspoint  error="failed fetching apresolve URL: Get \"https://apresolve.spotify.com/?type=accesspoint\": dial tcp: lookup apresolve.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:40473->10.65.42.1:53: i/o timeout"               
WARN[958911] failed to connect to AP ap-gew4.spotify.com:4070, retrying with a different AP  error="dial tcp: lookup ap-gew4.spotify.com: i/o timeout"                                                                                                                                    
ERRO[958921] did not receive last pong ack from accesspoint, 274s passed                                                                                                                                                                                                                  
ERRO[958921] did not receive last pong from dealer, 210s passed                                                                                                                                                                                                                           
panic: runtime error: invalid memory address or nil pointer dereference                                                                                                                                                                                                                   
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xc87ed8]

goroutine 78 [running]:
github.com/devgianlu/go-librespot/ap.(*Accesspoint).pongAckTicker(0x4000454000)
        /home/ayke/src/spotify/go-librespot/ap/ap.go:347 +0x1e8
created by github.com/devgianlu/go-librespot/ap.(*Accesspoint).startReceiving.func1 in goroutine 1
        /home/ayke/src/spotify/go-librespot/ap/ap.go:250 +0x10c

I suspect the issue is here:

github.com/devgianlu/go-librespot@f2de517abc/ap/ap.go (L82)

This is ap.init, which is called from ap.connect, which is called from ap.reconnect. Looks like it sets ap.conn to nil when it can't connect resulting in the nil pointer dereference.

(Also, I think ap.connMu should have been used inside pongAckTimer?)


I should also say it's quite impressive that the daemon had been running crash-free for over 10 days!

Originally created by @aykevl on GitHub (Nov 7, 2024). Original GitHub issue: https://github.com/devgianlu/go-librespot/issues/134 I got this crash, I think when the network connection got disconnected: ``` ERRO[755131] did not receive last pong from dealer, 60s passed ERRO[755136] failed receiving dealer message error="failed to get reader: WebSocket closed: sent close frame: status = StatusServiceRestart and reason = \"\"" ERRO[755161] did not receive last pong from dealer, 90s passed ERRO[755161] did not receive last pong ack from accesspoint, 173s passed ERRO[755161] failed receiving packet error="failed reading packet header: read tcp 10.65.42.112:51278->34.158.1.133:443: use of closed network connection" WARN[755181] failed to connect to AP ap-gew4.spotify.com:80, retrying with a different AP error="dial tcp: lookup ap-gew4.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:60895->10.65.42.1:53: i/o timeout" ERRO[755191] did not receive last pong from dealer, 120s passed WARN[755201] failed to connect to AP ap-guc3.spotify.com:4070, retrying with a different AP error="dial tcp: lookup ap-guc3.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:54788->10.65.42.1:53: i/o timeout" ERRO[755221] did not receive last pong from dealer, 150s passed WARN[755221] failed to connect to AP ap-gew1.spotify.com:443, retrying with a different AP error="dial tcp: lookup ap-gew1.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:56707->10.65.42.1:53: i/o timeout" WARN[755241] failed to connect to AP ap-gae2.spotify.com:80, retrying with a different AP error="dial tcp: lookup ap-gae2.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:46551->10.65.42.1:53: i/o timeout" ERRO[755251] did not receive last pong from dealer, 180s passed WARN[755256] failed fetching new endpoint for accesspoint error="failed fetching apresolve URL: Get \"https://apresolve.spotify.com/?type=accesspoint\": dial tcp: lookup apresolve.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:46939->10.65.42.1:53: i/o timeout" WARN[755271] failed to connect to AP ap-gew4.spotify.com:4070, retrying with a different AP error="dial tcp: lookup ap-gew4.spotify.com: i/o timeout" ERRO[755281] did not receive last pong ack from accesspoint, 293s passed ERRO[755281] did not receive last pong from dealer, 210s passed panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xc87ed8] goroutine 15 [running]: github.com/devgianlu/go-librespot/ap.(*Accesspoint).pongAckTicker(0x4000471880) /home/ayke/src/spotify/go-librespot/ap/ap.go:347 +0x1e8 created by github.com/devgianlu/go-librespot/ap.(*Accesspoint).startReceiving.func1 in goroutine 1 /home/ayke/src/spotify/go-librespot/ap/ap.go:250 +0x10c ``` And this crash, probably the same bug, when the network _definitely_ got disconnected (I think there was some maintenance on the router). ``` ERRO[937664] failed receiving packet error="failed reading packet header: read tcp 10.65.42.112:34414->104.199.65.9:443: read: connection reset by peer" INFO[937665] authenticated AP as 0bzabjufeuoh2wmg0n912ftgw ERRO[938626] failed receiving packet error="failed reading packet header: read tcp 10.65.42.112:43240->104.199.241.202:80: read: connection reset by peer" INFO[938626] authenticated AP as 0bzabjufeuoh2wmg0n912ftgw ERRO[952286] failed receiving packet error="failed reading packet header: EOF" INFO[952287] authenticated AP as 0bzabjufeuoh2wmg0n912ftgw ERRO[958771] did not receive last pong from dealer, 60s passed ERRO[958776] failed receiving dealer message error="failed to get reader: WebSocket closed: sent close frame: status = StatusServiceRestart and reason = \"\"" ERRO[958801] did not receive last pong ack from accesspoint, 154s passed ERRO[958801] did not receive last pong from dealer, 90s passed ERRO[958801] failed receiving packet error="failed reading packet header: read tcp 10.65.42.112:39610->34.158.1.133:443: use of closed network connection" WARN[958821] failed to connect to AP ap-gew4.spotify.com:80, retrying with a different AP error="dial tcp: lookup ap-gew4.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:53441->10.65.42.1:53: i/o timeout" ERRO[958831] did not receive last pong from dealer, 120s passed WARN[958841] failed to connect to AP ap-guc3.spotify.com:4070, retrying with a different AP error="dial tcp: lookup ap-guc3.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:50588->10.65.42.1:53: i/o timeout" ERRO[958861] did not receive last pong from dealer, 150s passed WARN[958861] failed to connect to AP ap-gue1.spotify.com:443, retrying with a different AP error="dial tcp: lookup ap-gue1.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:47574->10.65.42.1:53: i/o timeout" WARN[958881] failed to connect to AP ap-gae2.spotify.com:80, retrying with a different AP error="dial tcp: lookup ap-gae2.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:40195->10.65.42.1:53: i/o timeout" ERRO[958891] did not receive last pong from dealer, 180s passed WARN[958901] failed fetching new endpoint for accesspoint error="failed fetching apresolve URL: Get \"https://apresolve.spotify.com/?type=accesspoint\": dial tcp: lookup apresolve.spotify.com on 10.65.42.1:53: read udp 10.65.42.112:40473->10.65.42.1:53: i/o timeout" WARN[958911] failed to connect to AP ap-gew4.spotify.com:4070, retrying with a different AP error="dial tcp: lookup ap-gew4.spotify.com: i/o timeout" ERRO[958921] did not receive last pong ack from accesspoint, 274s passed ERRO[958921] did not receive last pong from dealer, 210s passed panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xc87ed8] goroutine 78 [running]: github.com/devgianlu/go-librespot/ap.(*Accesspoint).pongAckTicker(0x4000454000) /home/ayke/src/spotify/go-librespot/ap/ap.go:347 +0x1e8 created by github.com/devgianlu/go-librespot/ap.(*Accesspoint).startReceiving.func1 in goroutine 1 /home/ayke/src/spotify/go-librespot/ap/ap.go:250 +0x10c ``` I suspect the issue is here: https://github.com/devgianlu/go-librespot/blob/f2de517abcf49bb58eb0ccfed6319bd58df94d81/ap/ap.go#L82 This is `ap.init`, which is called from `ap.connect`, which is called from `ap.reconnect`. Looks like it sets `ap.conn` to nil when it can't connect resulting in the nil pointer dereference. (Also, I think `ap.connMu` should have been used inside `pongAckTimer`?) --- I should also say it's quite impressive that the daemon had been running crash-free for over 10 days!
Author
Owner

@devgianlu commented on GitHub (Nov 9, 2024):

Also, I think ap.connMu should have been used inside pongAckTimer?

I have mixed feelings about this: yes it should be held for reading, but at the same time if we end up waiting because it is held for writing and then call close on it we are probably closing a connection that has just been created. Citing my own comment:

github.com/devgianlu/go-librespot@095020537e/ap/ap.go (L53-L56)


Main problem should be fixed with 095020537e, same issue as 0b32df773c.

<!-- gh-comment-id:2466171614 --> @devgianlu commented on GitHub (Nov 9, 2024): > Also, I think ap.connMu should have been used inside pongAckTimer? I have mixed feelings about this: yes it should be held for reading, but at the same time if we end up waiting because it is held for writing and then call close on it we are probably closing a connection that has just been created. Citing my own comment: https://github.com/devgianlu/go-librespot/blob/095020537ece27f8967a14819edfa4ed10350145/ap/ap.go#L53-L56 <hr> Main problem should be fixed with 095020537ece27f8967a14819edfa4ed10350145, same issue as 0b32df773c9cbe12a2bc44e92580d0c657a170a2.
Author
Owner

@aykevl commented on GitHub (Nov 9, 2024):

I have mixed feelings about this: yes it should be held for reading, but at the same time if we end up waiting because it is held for writing and then call close on it we are probably closing a connection that has just been created.

It's still a data race, which is basically undefined behavior (though it might work well enough in practice since it's a pointer read, which will normally not be torn).

Maybe I'll take a stab at fixing this at some point.

<!-- gh-comment-id:2466196687 --> @aykevl commented on GitHub (Nov 9, 2024): > I have mixed feelings about this: yes it should be held for reading, but at the same time if we end up waiting because it is held for writing and then call close on it we are probably closing a connection that has just been created. It's still a data race, which is basically undefined behavior (though it might work well enough in practice since it's a pointer read, which will normally not be torn). Maybe I'll take a stab at fixing this at some point.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/go-librespot#83
No description provided.