mirror of
https://github.com/sigma67/ytmusicapi.git
synced 2026-04-25 15:26:01 +03:00
[GH-ISSUE #52] YTMusic responses are unreliable for get_library_songs and get_playlist #39
Originally created by @czifumasa on GitHub (Aug 1, 2020).
Original GitHub issue: https://github.com/sigma67/ytmusicapi/issues/52
In my project I am using ytmusicapi to fetch the full content of the user's library and save it to a CSV file. I can then use these CSV files to compare changes in my library, find songs removed from YouTube, etc.
Unfortunately, it is currently very unreliable.
For example: my library currently holds 2040 songs. To get the library songs I call the API with a high limit:
Every time I send that request, the number of returned songs is different; it varies between 1800 and 2035 songs.
I know that the problem is in YTM itself, because I observed the same problem on the web client and it hasn't been fixed on their side for months. YTM should return library songs in chunks containing 25 items, but very often it's less than 25.
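A rough way to spot the short chunks described above, assuming each continuation page is already available as a plain Python list. The helper name find_short_pages and the EXPECTED_PAGE_SIZE constant are illustrative only and are not part of ytmusicapi:

```python
from typing import List, Sequence

# Page size reported in this issue; an assumption, not an API guarantee.
EXPECTED_PAGE_SIZE = 25


def find_short_pages(pages: Sequence[Sequence[object]],
                     expected: int = EXPECTED_PAGE_SIZE) -> List[int]:
    """Return the indexes of pages that hold fewer items than expected.

    The final page is skipped because it is legitimately allowed to be short.
    """
    return [i for i, page in enumerate(pages[:-1]) if len(page) < expected]


if __name__ == "__main__":
    # Simulated responses: the second page lost three songs.
    pages = [list(range(25)), list(range(22)), list(range(25)), list(range(10))]
    print(find_short_pages(pages))
```

Any index reported here marks a page where songs were probably dropped by the server rather than by the client.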
In the end, on average, at least 10% of my library is missing, making my scripts kinda useless. The same problem occurs for the get_playlist method.
@sigma67 commented on GitHub (Aug 3, 2020):
I've noticed this issue as well since tests were failing randomly. I attempted to fix it in 90bc753. It might not be a fix if the API result skips songs randomly; in that case those songs would be missing from the response. In your experience, do invalid responses return songs in the correct order without skips? Or are songs randomly missing in between?
To be honest I don't really like the option of implementing retry logic for a server-side issue, as it might become obsolete in the near future. I suggest we wait another month to see if the issue gets resolved by YouTube. If this is not the case, I'll go ahead and merge #53.
@czifumasa commented on GitHub (Aug 3, 2020):
Yes, from my observation, the API skips songs randomly and the fix from 90bc7538e4 is not enough.
That's completely fine for me. I know that the "retry" solution is not very elegant, but unfortunately the problem has existed since I moved from GPM, so it's been at least a few months already. I kinda lost my patience and decided to work around it with my PR. Still, I agree with you that the proper fix should be on YouTube's server, so let's give them one more month.
@akraus53 commented on GitHub (Aug 19, 2020):
I think this is happening to getHistory() as well!
@xplorr commented on GitHub (Aug 19, 2020):
I use the ytmusic.get_library_upload_songs(50000) call and did not notice any problems so far. I have about 23000 songs in my library and they are all returned except 2. Have to figure out why 2 are missing.
@sigma67 commented on GitHub (Aug 25, 2020):
It's been almost a month with no updates from YouTube's side. I suggest we merge this PR, however I want to request two changes if possible.
The reasoning for 2) is that the changes from this PR doubled the average execution time for me (based on test_get_library_songs - previously 3-4s, now 7-8s). I suggest we introduce an optional parameter validate_responses=False for get_library_songs. If False, the current faulty behavior should occur by calling get_continuations. If True, get_validated_continuations should be used.
The default should be False imo, since the objective of the API is to replicate the web client as closely as possible, which also exhibits this odd behavior. Therefore, it would be an optional feature of ytmusicapi, which validates responses for the user to ensure the response is correct. What do you think?
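The opt-in flag proposed above could look roughly like the sketch below. It is a self-contained simulation, not the code from the PR: get_songs, FlakyServer and the fetch callable are hypothetical stand-ins, and only the parameter names validate_responses and max_retries come from this thread.

```python
from typing import Callable, List


def get_songs(fetch: Callable[[], List[str]], expected: int,
              validate_responses: bool = False,
              max_retries: int = 3) -> List[str]:
    """Fetch one page; re-request short pages only when validation is enabled."""
    songs = fetch()
    if not validate_responses:
        # Default: accept whatever the server returned, like the web client does.
        return songs
    retries = 0
    while len(songs) < expected and retries < max_retries:
        songs = fetch()
        retries += 1
    return songs


class FlakyServer:
    """Simulated server (hypothetical): a short page first, then a full one."""

    def __init__(self) -> None:
        self.calls = 0

    def fetch(self) -> List[str]:
        self.calls += 1
        return ["song"] * (22 if self.calls == 1 else 25)


if __name__ == "__main__":
    fast = FlakyServer()
    print(len(get_songs(fast.fetch, 25)), fast.calls)
    slow = FlakyServer()
    print(len(get_songs(slow.fetch, 25, validate_responses=True)), slow.calls)
```

In this simulation the validated call needs a second request to obtain the full page, which is the same kind of cost as the longer test run time mentioned above, and the reason for keeping the default at False.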
@sigma67 commented on GitHub (Aug 25, 2020):
In the original issue, you also noted that get_playlist has the same issue, but didn't end up including it in your PR. I just did some tests and it seems to behave consistently (i.e. no varying track counts). Am I correct in assuming that only get_library_songs is affected by this issue for now?
@czifumasa commented on GitHub (Aug 26, 2020):
Yes, indeed, it seems that get_playlist has been fixed. Today I ran some tests for both methods, and I wasn't able to reproduce the problem for get_playlist anymore. At first I didn't include it in my PR because I wasn't sure if you would approve the general concept, so I created a fix for only one method. Luckily it's no longer needed.
Unfortunately, for get_library_songs the problem still exists. I reproduced it in every test I ran.
Regarding your proposed changes, I agree, retry behaviour should be optional. I'll update my PR next weekend when I have a bit more time.
@czifumasa commented on GitHub (Aug 30, 2020):
I updated my PR (#53) with the requested changes, please take a look.
@sigma67 commented on GitHub (Aug 31, 2020):
Thanks for updating the PR! I did some rather extensive testing with the changes and ran get_library_songs(300, validate_responses=True) a few times. I noticed that retries only rarely managed to produce the full 25 results. If they did, it was always after the first retry. Unless you have significantly different results, I propose reducing the max_retries to 1 to improve performance.
(edit: I did some more tests and found 1 or 2 continuations where it worked after 2 or 3 tries (after >15 function calls with 11 continuations each). I believe the performance penalty isn't worth the additional 2 retries).
Check this log:
Here are the debug changes in utils.py l.104:
@sigma67 commented on GitHub (Aug 31, 2020):
I also found some isolated instances where the key contents is missing completely from the continuation response, causing an error.
We should catch that in both get_parsed_continuation_items and get_continuations. If you want you can add these changes as well, or I can do it.
@sigma67 commented on GitHub (Sep 1, 2020):
After some more tests I decided to leave the retries at 3, as the number of retries to success seems to vary a lot depending on time of day and account used.
It also seems that the API "warms up" to your requests. For example, if you repeat the same call (get_library_songs(300)) multiple times, subsequent calls have significantly fewer missing items and take fewer retries. This effect subsides after a while, so I suspect that YouTube's API uses some form of caching here.
Will merge this PR shortly with the bugfix mentioned in the previous comment.
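For illustration, a bounded retry that also tolerates a missing contents key might be sketched as follows. This is not the merged implementation: the flat {"contents": [...]} response shape is a simplifying assumption (real continuation responses are nested), and parse_items and fetch_with_retries are hypothetical names. Keeping the longest response seen so far is a design choice of this sketch, not something stated in the thread.

```python
from typing import Any, Callable, Dict, List


def parse_items(response: Dict[str, Any]) -> List[Any]:
    """Return the items of a response, or [] when 'contents' is absent or malformed."""
    contents = response.get("contents")
    return list(contents) if isinstance(contents, list) else []


def fetch_with_retries(request: Callable[[], Dict[str, Any]],
                       expected: int = 25,
                       max_retries: int = 3) -> List[Any]:
    """Request a page, retrying up to max_retries times while it is short.

    The longest result seen so far is kept, so a worse retry never
    replaces a better earlier response.
    """
    best = parse_items(request())
    for _ in range(max_retries):
        if len(best) >= expected:
            break
        candidate = parse_items(request())
        if len(candidate) > len(best):
            best = candidate
    return best


if __name__ == "__main__":
    # Simulated responses: missing key, then a short page, then a full page.
    responses = iter([{}, {"contents": ["a"] * 20}, {"contents": ["a"] * 25}])
    print(len(fetch_with_retries(lambda: next(responses))))
```

Because the loop is bounded by max_retries, a page that stays short on every attempt costs at most three extra requests in this sketch.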
@czifumasa commented on GitHub (Sep 1, 2020):
Regarding the error with the content key, it never occurred for the accounts I tested. I am glad you found it and fixed it.
And regarding the max_retries param, I observed exactly the same behaviour that you described: the first call to get_library_songs is usually the worst and requires many retries to get correct results. Subsequent calls are much faster, but some continuations still require 1 or 2 retries.
I set max_retries to 3 because in my tests it returned the most consistent results. Decreasing it to 1 or 2 meant that the response sometimes still had missing songs. Increasing it to values higher than 3 never helped. If YTM still sends a response with fewer than 25 songs after more than 3 retries, it probably means that the missing songs are permanently unavailable for some reason.
@sigma67 commented on GitHub (Sep 29, 2022):
Hi, I'm curious. Are you still using this functionality? I feel like the API has gotten a lot more reliable and the code to achieve this is pretty messy. If it's not being used I'd rather remove it.
@xplorr commented on GitHub (Oct 4, 2022):
Still use this in my project
@sigma67 commented on GitHub (Oct 4, 2022):
Alright, good to know.