[GH-ISSUE #72] [BUG] #86

Open
opened 2026-03-13 23:01:21 +03:00 by kerem · 2 comments
Owner

Originally created by @alan7383 on GitHub (Jun 26, 2025).
Original GitHub issue: https://github.com/AliAkhtari78/SpotifyScraper/issues/72

Describe the bug
The get_playlist_info() function only retrieves the first 100 tracks from any playlist that contains more than 100 tracks. It seems the scraper does not handle the dynamic loading (infinite scroll) that Spotify's web player uses to display long playlists.

To Reproduce
Steps to reproduce the behavior:

# Code that causes the issue
from spotify_scraper import SpotifyClient
import pprint

# A public playlist with over 100 songs, perfect for testing the limit.
# "This Is Bad Bunny"
playlist_url = "https://open.spotify.com/playlist/37i9dQZF1DX2apWzyECwyZ" 

# Initialize the client. Using selenium as it's often needed for playlists.
client = SpotifyClient(browser_type="selenium")

try:
    print(f"Fetching playlist: {playlist_url}")
    playlist_info = client.get_playlist_info(playlist_url)

    # Check the number of tracks returned by the scraper
    track_count = len(playlist_info.get('tracks', []))
    
    print(f"Playlist Name: {playlist_info.get('name')}")
    print("Expected tracks: > 100 (actually 10,000+)")
    print(f"Tracks returned by scraper: {track_count}")

    # You can also print the last track to see where it stops
    if track_count > 0:
        pprint.pprint(playlist_info['tracks'][-1])

except Exception as e:
    print(f"An error occurred: {e}")

finally:
    client.close()

Expected behavior
I expected get_playlist_info() to return a list containing all tracks from the specified playlist. For the example URL, this should be over 100 tracks.

Actual behavior
The function successfully executes without errors but returns a list containing exactly 100 tracks. The len(playlist_info['tracks']) is always 100 for any playlist longer than that.

Error messages

No error messages are generated. The function fails silently by returning incomplete data.

Environment:

  • OS: Windows 11 26100.4202
  • Python version: 3.11
  • SpotifyScraper version: 2.1.5
  • Installation method: pip

Additional context
The likely cause is that the Spotify web player loads tracks dynamically as the user scrolls down the page. The scraper appears to parse only the tracks present in the initial DOM, which holds just the first 100 items. To get the full playlist, the scraper would need to simulate scrolling.

Possible solution
The fix would likely require modifying the scraping logic within the get_playlist_info method to handle dynamic content. When using the Selenium backend, a possible implementation could be:

  1. Load the playlist page.

  2. Enter a loop that:
    a. Scrapes the currently visible tracks.
    b. Programmatically scrolls the page down (e.g., driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")).
    c. Waits for a brief moment for new content to load.
    d. Checks if new track elements have appeared in the DOM.

  3. Exit the loop when scrolling no longer loads new tracks.

  4. Consolidate and return the full list of scraped tracks.
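
The steps above could be sketched roughly as follows. This is only an illustration, not the library's implementation: `collect_all_tracks`, `read_visible_tracks`, and the `"uri"` key are hypothetical names, and the only thing assumed about `driver` is that it has Selenium's `execute_script` method. Tracks are de-duplicated by key in case the page removes earlier rows from the DOM while scrolling. The real page may also scroll an inner container rather than the window, in which case the script string would need adjusting.

```python
import time
from typing import Any, Callable, Dict, List

Track = Dict[str, Any]

SCROLL_SCRIPT = "window.scrollTo(0, document.body.scrollHeight);"


def collect_all_tracks(
    driver: Any,
    read_visible_tracks: Callable[[Any], List[Track]],
    pause_s: float = 1.0,
    max_idle_rounds: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Track]:
    """Scroll until no new tracks appear, then return every track seen.

    read_visible_tracks is a hypothetical callback that parses whatever
    track rows are currently in the DOM; each track is assumed to carry
    a unique "uri" key.
    """
    seen: Dict[str, Track] = {}  # insertion-ordered, keyed by track URI
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        before = len(seen)
        # Step 2a: scrape the tracks that are currently present.
        for track in read_visible_tracks(driver):
            seen.setdefault(track["uri"], track)
        # Step 2d / 3: count rounds in which nothing new showed up.
        idle_rounds = idle_rounds + 1 if len(seen) == before else 0
        # Step 2b: scroll down; step 2c: give new content time to load.
        driver.execute_script(SCROLL_SCRIPT)
        sleep(pause_s)
    # Step 4: consolidated list, in first-seen order.
    return list(seen.values())
```

Requiring several idle rounds before stopping is a guess at how to tolerate slow loads. A fixed pause could probably be replaced with an explicit wait on the number of track elements.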

Author
Owner

@Carpintonto commented on GitHub (Aug 20, 2025):

This is not a trivial coding problem, but the workaround I used is pretty trivial.
Split the list into smaller lists, or just take a subsection and make it a new list.

https://community.spotify.com/t5/Other-Podcasts-Partners-etc/How-to-break-up-a-playlist-into-smaller-ones/td-p/1234487
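
A minimal sketch of how this workaround might look in code, assuming the long playlist has already been split by hand into several smaller playlists of at most 100 tracks each. `merge_playlists` is a hypothetical helper. `fetch` stands in for a call such as `client.get_playlist_info` from the report above, and the `'tracks'` key is taken from that same snippet.

```python
from typing import Any, Callable, Dict, Iterable, List

PlaylistInfo = Dict[str, Any]


def merge_playlists(
    urls: Iterable[str],
    fetch: Callable[[str], PlaylistInfo],
) -> List[Dict[str, Any]]:
    """Fetch each sub-playlist and concatenate the track lists in order."""
    merged: List[Dict[str, Any]] = []
    for url in urls:
        info = fetch(url)
        # Each part is assumed to be short enough to come back complete.
        merged.extend(info.get("tracks", []))
    return merged
```

Usage would be along the lines of `merge_playlists(part_urls, client.get_playlist_info)`, where `part_urls` lists the smaller playlists in their original order.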

Author
Owner

@kocijan commented on GitHub (Jan 14, 2026):

It's documented (https://github.com/AliAkhtari78/SpotifyScraper/blob/b28196281c957ea7f11dcdae00113287dc89b080/src/spotify_scraper/client.py#L494-L497) that this doesn't work:

Note:
- For very large playlists (>100 tracks), only the first 100 tracks
may be returned depending on the extraction method used.
- Private playlists require authentication to access.

A better warning might be useful, though. I support the above idea of scrolling and loading.
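
One way such a warning could look from the caller's side, as a rough sketch only: `warn_if_possibly_truncated` is a hypothetical helper, not part of SpotifyScraper, and the limit of 100 comes from the behavior reported in this thread. It cannot tell a truncated result apart from a playlist that really has exactly 100 tracks, so it only flags the result as suspicious.

```python
import warnings
from typing import Any, Dict

SUSPECTED_LIMIT = 100  # track count at which results were reported to stop


def warn_if_possibly_truncated(
    playlist_info: Dict[str, Any],
    limit: int = SUSPECTED_LIMIT,
) -> bool:
    """Emit a RuntimeWarning when the track list length equals the limit.

    Returns True if a warning was raised, False otherwise.
    """
    tracks = playlist_info.get("tracks", [])
    if len(tracks) != limit:
        return False
    warnings.warn(
        f"Exactly {limit} tracks were returned; the playlist may be longer "
        "and the result may be incomplete.",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

It could be called right after `client.get_playlist_info(playlist_url)` so that an incomplete result no longer passes silently.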