mirror of
https://github.com/AliAkhtari78/SpotifyScraper.git
synced 2026-04-25 19:45:49 +03:00
[GH-ISSUE #72] [BUG] #86
Originally created by @alan7383 on GitHub (Jun 26, 2025).
Original GitHub issue: https://github.com/AliAkhtari78/SpotifyScraper/issues/72
Describe the bug
The get_playlist_info() function only retrieves the first 100 tracks from any playlist that contains more than 100 tracks. It seems the scraper does not handle the dynamic loading (infinite scroll) that Spotify's web player uses to display long playlists.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expected get_playlist_info() to return a list containing all tracks from the specified playlist. For the example URL, this should be over 100 tracks.
Actual behavior
The function successfully executes without errors but returns a list containing exactly 100 tracks. The len(playlist_info['tracks']) is always 100 for any playlist longer than that.
Error messages
Environment:
Additional context
This issue is likely caused by the fact that the Spotify web player dynamically loads tracks as the user scrolls down the page. The current implementation of the scraper seems to only parse the tracks that are present in the initial HTML DOM load, which is limited to the first 100 items. To get the full playlist, the scraper would need to simulate scrolling.
Possible solution
The fix would likely require modifying the scraping logic within the get_playlist_info method to handle dynamic content. When using the Selenium backend, a possible implementation could be:
1. Load the playlist page.
2. Enter a loop that:
   a. Scrapes the currently visible tracks.
   b. Programmatically scrolls the page down (e.g., driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")).
   c. Waits briefly for new content to load.
   d. Checks whether new track elements have appeared in the DOM.
3. Exit the loop when scrolling no longer loads new tracks.
4. Consolidate and return the full list of scraped tracks.
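The steps above could be sketched roughly as follows. This is only an illustration, not the library's code: collect_all_rows, row_selector, pause, idle_limit and max_rounds are hypothetical names, and the CSS selector that matches a track row on Spotify's page is an assumption that would have to be checked against the live DOM. Only execute_script and find_elements are used on the driver, with the locator strategy given as the plain string "css selector" (the value of Selenium's By.CSS_SELECTOR). Rows are deduplicated by their visible text, which would also collapse a track that genuinely appears twice in a playlist.

```python
import time


def collect_all_rows(driver, row_selector, pause=1.0, idle_limit=2, max_rounds=500):
    """Scroll until no new rows appear, returning row texts in first-seen order.

    Hypothetical helper: `driver` is any object exposing Selenium's
    `find_elements(by, value)` and `execute_script(script)` methods, and
    `row_selector` is an assumed CSS selector for one track row.
    """
    seen = {}  # dict keeps insertion order, so it doubles as an ordered set
    idle_rounds = 0

    for _ in range(max_rounds):
        before = len(seen)

        # a. Scrape whatever rows are currently in the DOM.
        for element in driver.find_elements("css selector", row_selector):
            seen.setdefault(element.text, None)

        # d. Did this pass find anything new?
        if len(seen) == before:
            idle_rounds += 1
            if idle_rounds >= idle_limit:
                break  # scrolling no longer loads new tracks
        else:
            idle_rounds = 0

        # b. Scroll to the bottom of the page.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # c. Give the page a moment to load more rows.
        time.sleep(pause)

    return list(seen)
```

If the track list lives in its own scrollable container rather than the window, the script passed to execute_script would need to scroll that container instead; that detail is not confirmed here.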
@Carpintonto commented on GitHub (Aug 20, 2025):
This is not a trivial coding problem, but the workaround I used is pretty trivial.
Split the list into smaller lists, or just take a subsection and make it a new list.
https://community.spotify.com/t5/Other-Podcasts-Partners-etc/How-to-break-up-a-playlist-into-smaller-ones/td-p/1234487
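If the playlist is split by hand as described, the partial results can be stitched back together afterwards. A minimal sketch, assuming each result is a dict with a 'tracks' list as in the report above; merge_playlist_tracks and fetch_playlist are hypothetical names, with fetch_playlist standing in for whatever callable returns the playlist info for a URL:

```python
def merge_playlist_tracks(playlist_urls, fetch_playlist):
    """Fetch several sub-playlists and concatenate their tracks in order.

    Hypothetical helper: `fetch_playlist(url)` is assumed to return a dict
    with a 'tracks' list, as get_playlist_info() is described in this issue.
    """
    merged = []
    for url in playlist_urls:
        info = fetch_playlist(url)
        # A missing 'tracks' key is treated as an empty sub-playlist.
        merged.extend(info.get("tracks", []))
    return merged
```

With a client object at hand, the call would look something like merge_playlist_tracks(urls, client.get_playlist_info), where each URL points at a sub-playlist of at most 100 tracks.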
@kocijan commented on GitHub (Jan 14, 2026):
It's documented that this doesn't work:
A better warning might be useful, though. I support the idea above of scrolling and loading.
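One way such a warning could look, purely as a sketch: warn_if_truncated and PAGE_LIMIT are hypothetical names, the limit of 100 is taken from the report above, and the check is a heuristic, since a playlist with exactly 100 tracks would trigger it too.

```python
import warnings

PAGE_LIMIT = 100  # first-page size reported in this issue, not a documented constant


def warn_if_truncated(playlist_info, limit=PAGE_LIMIT):
    """Emit a UserWarning when the track list length equals the page limit.

    Hypothetical helper: returns True when the warning was raised.
    """
    tracks = playlist_info.get("tracks", [])
    truncated = len(tracks) == limit
    if truncated:
        warnings.warn(
            f"Playlist returned exactly {limit} tracks; longer playlists "
            "are likely truncated to the first page.",
            UserWarning,
            stacklevel=2,
        )
    return truncated
```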