[GH-ISSUE #134] Handle reconnection for Sessions #103

Closed
opened 2026-02-27 19:28:51 +03:00 by kerem · 8 comments
Owner

Originally created by @sashahilton00 on GitHub (Feb 8, 2018).
Original GitHub issue: https://github.com/librespot-org/librespot/issues/134

This issue serves as a placeholder for discussion around the rewrite of the session handling logic, as it is currently one of the less stable parts of librespot. Issue #103 is related to this.


@plietar commented on GitHub (Feb 10, 2018):

So here's my brain dump about the whole reconnection thing.

Currently one `Session` object is associated with one TCP connection to the Spotify servers. We can keep it this way and start returning appropriate errors to the caller, making `main.rs` (or any user of librespot) responsible for creating a new `Session` and switching to it. This is pretty tricky and puts the burden on every end application using librespot. Note, however, that the "switch session" part already partly exists when switching users with discovery.

The alternative is to move to a model where a `Session` object can alternate between a connected and a disconnected state. The API for `Session` becomes something like:

```rust
impl Session {
    fn new(config: SessionConfig, cache: Option<Cache>, handle: Handle) -> Session { ... }
    fn connect(&self, credentials: Credentials) -> impl Future<Credentials, ConnectionError> { ... }
}
```

The `connect` function can be called multiple times (for example when switching users with discovery enabled), and should be called again when a disconnection is noticed.

Trying to use an unconnected session (to get metadata, audio keys/data, etc.) should fail gracefully. All the APIs already use `Future`/`Stream`, so these just need to resolve into errors. The same happens if the connection is lost while a request is in flight.
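As a minimal illustration of "resolve into errors" (all names here are hypothetical stand-ins, not librespot's actual types, and the real code would return a future rather than a plain `Result`):

```rust
// Hypothetical error type for requests made against a disconnected session.
#[derive(Debug, PartialEq)]
enum SessionError {
    NotConnected,
}

// Simplified stand-in for a session that may or may not be connected.
struct Session {
    connected: bool,
}

impl Session {
    // Illustrative request: fails gracefully instead of panicking when
    // the session is not connected. With futures this would be
    // `future::err(SessionError::NotConnected)` instead of `Err(...)`.
    fn metadata(&self, _uri: &str) -> Result<String, SessionError> {
        if !self.connected {
            return Err(SessionError::NotConnected);
        }
        Ok("metadata".to_string())
    }
}
```

The point is only the shape: callers always get a `Result`/`Future` they can match on, never a panic.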

`Player` and `Spirc` need to behave correctly when those requests fail. For `Player` it just means returning to the `NotPlaying` state and signaling the caller that playback has completed with an error. For `Spirc`, when `Player` signals a connection problem it should probably stop playback. Reading from or sending to the mercury channel may start failing as well.

Now to get back up, I suggest the `Session` signals to `main` in some way that the connection was closed, and `main` is responsible for attempting to reconnect by calling `connect` again, potentially with some exponential back-off. It should do so using the `Credentials` that were returned by the last successful `connect` call. The various components get notified of this and need to carry on with the new connection.
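A rough sketch of what that reconnect loop in `main` could look like, reduced to synchronous pseudocode (`try_connect`, `backoff_delay`, and the retry cap are all illustrative assumptions, not librespot API):

```rust
use std::time::Duration;

// Exponential back-off delay for a given attempt, capped at 64 seconds.
// The base and cap are arbitrary choices for this sketch.
fn backoff_delay(attempt: u32) -> Duration {
    Duration::from_secs(1u64 << attempt.min(6))
}

// Stand-in for the reconnect loop in `main`: retry `try_connect` (imagine
// `Session::connect` with the last returned credentials) until it succeeds
// or we give up.
fn reconnect<F>(mut try_connect: F, max_attempts: u32) -> Result<(), &'static str>
where
    F: FnMut() -> Result<(), &'static str>,
{
    for attempt in 0..max_attempts {
        if try_connect().is_ok() {
            return Ok(());
        }
        // A real implementation would sleep for backoff_delay(attempt) here
        // (or schedule a timer on the event loop) before retrying.
        let _ = backoff_delay(attempt);
    }
    Err("giving up after max_attempts")
}
```

In the futures-based code this would be driven by the event loop rather than a blocking loop, but the shape of the retry policy is the same.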

This can mean various things depending on the component. `Spirc`, for example, should reset its state if the username is different, and reestablish the mercury channel.

The components need to expose an API that looks like:

```rust
impl Component {
    fn connection_established(&self, username: String) { ... }
    fn dispatch(&self, cmd: u8, mut data: Bytes) { ... }
    fn connection_closed(&self) { ... }
}
```
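One way to read that API is as a trait each component implements. The sketch below is purely illustrative: `ConnectionObserver`, `DummySpirc`, the use of `&mut self` instead of `&self` with interior mutability, and `&[u8]` instead of `Bytes` are all simplifying assumptions.

```rust
// Hypothetical trait mirroring the component API sketched above.
trait ConnectionObserver {
    fn connection_established(&mut self, username: String);
    fn dispatch(&mut self, cmd: u8, data: &[u8]);
    fn connection_closed(&mut self);
}

// Illustrative Spirc-like component that resets its state whenever the
// username changes across reconnections.
struct DummySpirc {
    username: Option<String>,
    state_resets: u32,
}

impl ConnectionObserver for DummySpirc {
    fn connection_established(&mut self, username: String) {
        // Reset state only if a different user (re)connected.
        if self.username.as_deref() != Some(username.as_str()) {
            self.state_resets += 1;
        }
        self.username = Some(username);
    }

    fn dispatch(&mut self, _cmd: u8, _data: &[u8]) {
        // Handle an incoming packet addressed to this component.
    }

    fn connection_closed(&mut self) {
        // Pending mercury requests would start failing from here on.
    }
}
```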

There are some risks of race conditions that need to be carefully thought about.

I think I have some initial implementation of this somewhere, I'll try and find it again.


@sashahilton00 commented on GitHub (Feb 17, 2018):

@plietar did you find that initial implementation anywhere?


@sashahilton00 commented on GitHub (Feb 26, 2018):

Regardless, I think we should aim to make this as simple to integrate as possible: handling reconnection logic from whatever project uses librespot sounds like plenty of headache for anyone who tries to use it, whilst also being pretty low-level functionality that I would personally expect to be handled in any library I used. Anyway, these are just my two cents, since I'm not sufficiently versed in Rust to rewrite how sessions are handled. It would be good to hear others' thoughts on how this should be handled.


@lrbalt commented on GitHub (Apr 30, 2018):

I am trying to wrap my head around this issue.

AFAICS, the disconnect error starts at `Session::DispatchTask::poll`. The error causes the `invalid` flag to be set on `Session` (patch #206), and the error is returned from the future and picked up by the `map_err` on `Session::connect`, which only logs an error. Both `Spirc::poll` and `Player::poll` check for a valid session and shut down if they notice that the session is invalid, freeing all resources like `mercury`.

Since `Player` is managed by `Spirc`, I think we can improve this by only checking the `Session` in `Spirc::poll` and letting `spirc` inform `Player` to stop/shut down.

`main` currently does not notice the lost connection at all, which is probably wrong, since it does not clean up `spirc`, `spirc_task`, and `player_event_channel`. But they are replaced when a new connection is made, so no real harm is done. This can be improved by letting `spirc_task` return an error and handling it in `main`.

So this works for the binary case.

For the library case, @plietar suggests letting `main` reconnect using the last credentials and letting `Spirc`, `Player`, etc. pick up the new session and reset their state accordingly. I have two problems trying to implement this:

  • `Session` is currently a future, which AFAIK requires a `'static` lifetime. Making `Session::new` separate causes the `Session` state to be non-static. I couldn't get `Session::new()` followed by `Session::connect(&mut self)` to work because `&self` is not `'static`.
  • Why is it difficult for the library to let `main` recreate `Session`/`Spirc`/`Player` from the credentials that `main` manages anyway? We could let `Spirc` return with a `ConnectionError`? I'm trying to figure out the use case.

If you look at `examples/play.rs`, you can do a match on `core.run(player.load(track, true, 0))` and handle the `ConnectionError`?
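For illustration, matching on such a result might look like the following (the `PlayError` enum and `handle` function are hypothetical; librespot's real error types and `play.rs` differ):

```rust
// Hypothetical error type for a playback future driven by core.run(...).
#[derive(Debug, PartialEq)]
enum PlayError {
    ConnectionError,
    Other(&'static str),
}

// Decide what the caller should do based on how playback ended.
fn handle(result: Result<(), PlayError>) -> &'static str {
    match result {
        Ok(()) => "done",
        // A lost connection is recoverable: trigger a reconnect attempt.
        Err(PlayError::ConnectionError) => "reconnect",
        // Anything else is treated as fatal in this sketch.
        Err(PlayError::Other(_)) => "abort",
    }
}
```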


@maciex commented on GitHub (Nov 21, 2018):

Is there any way to mitigate this problem?
Is it possible to somehow detect that it disconnected and restart it manually (using some shell script)?


@oboote commented on GitHub (Nov 26, 2018):

> Is there any way to mitigate this problem?
> Is it possible to somehow detect that it disconnected and restart it manually (using some shell script)?

This doesn't address the cause but aids with the symptoms:
https://gist.github.com/JeremieRapin/bc35a2632c072a082d48f5503270df16

It's a script to be run as a cronjob every minute, which checks the status and restarts the service if an error is spotted.


@cdlenfert commented on GitHub (Jul 29, 2019):

Is the cronjob patch currently the best solution for this issue? Sometimes I'm 25 songs in, sometimes 4, when I lose connection. This is running on a server connected directly to my modem/router via Ethernet. I can't imagine it's actually losing internet connection this consistently.


@willstott101 commented on GitHub (Oct 19, 2019):

> This can mean various things depending on the component. `Spirc` for example should reset its state if the username is different, and reestablish the mercury channel.

Sounds like this is closely related to, and would probably help solve, #266.
