[GH-ISSUE #1085] Spotipy stops extracting API data without giving an error code and renders credentials unusable #645

Closed
opened 2026-02-28 00:00:29 +03:00 by kerem · 5 comments
Owner

Originally created by @omendo-galeo on GitHub (Apr 9, 2024).
Original GitHub issue: https://github.com/spotipy-dev/spotipy/issues/1085

I am trying to extract information about artists, albums and songs. After one iteration (approximately 1000 songs, 1000 albums and 1200 artists), the API stops responding. It does not return a 429 error code or anything else; the call just waits, so there is no exception for me to handle.

I have tried passing a custom requests session to keep it from staying in the retry state indefinitely, but it has no effect.

This code is only to extract tracks info:

import csv
import os
import json
import random
import spotipy
import time

from datetime import datetime
from dateutil import parser as date_parser
from dateutil.parser import ParserError
from tqdm import tqdm


def open_chunk_id(n_json):
    csv_file_path = f'output_chunks/chunk_{n_json}.csv'
    with open(csv_file_path, newline='') as csvfile:
        track_ids_reader = csv.reader(csvfile)
        next(track_ids_reader)  # Skip the header row
        track_ids = [row[0] for row in track_ids_reader]
    return track_ids


def fetch_track_info_from_csv(track_ids, access_token):
    track_data = []
    sp = spotipy.Spotify(auth=access_token)
    scrapped_time = datetime.now()
    for track_id in tqdm(track_ids, desc='Fetching track information', position=0):
        try:
            time.sleep(random.uniform(0, 0.5))
            track_info = sp.track(str(track_id))
            track_data.append({
                'Track ID': track_id,
                'Track Name': track_info['name'],
                'Artist(s)': [artist['name'] for artist in track_info['artists']],
                'Album': track_info['album']['name'],
                'Release Date': track_info['album']['release_date'],  
                'Popularity': track_info['popularity'],
                'Duration (ms)': track_info['duration_ms'],
                'Explicit': track_info['explicit'],
                'Track Number': track_info['track_number'],
                'URI': track_info['uri'].replace('spotify:track:', ''),
                'Album ID': track_info['album']['id'],
                'Artist ID(s)': [artist['id'] for artist in track_info['artists']],
                'Scrapped Time': scrapped_time.strftime('%Y-%m-%d %H:%M:%S')
            }) 
        except Exception as e:
            print(f"Failed to fetch information for track ID {track_id}: {str(e)}")
    
    return track_data


def save_track_data_to_json(track_data, output_json_file):
    # Restructure data to match JSON format
    json_data = []
    for track_info in track_data:
        try:
            parsed_date = date_parser.parse(track_info['Release Date'])
            if parsed_date.day == 1:
                # Case 1: Only year provided
                release_date = parsed_date.strftime('%Y-01-01')
            else:
                # Case 3: All info provided
                release_date = parsed_date.strftime('%Y-%m-%d')
        except (ParserError, ValueError):
            # parsed_date is unbound when parsing fails, so derive the year from the raw string instead.
            print(f"Failed to parse release date for track ID {track_info['Track ID']}. Setting release date to only year.")
            raw_date = str(track_info['Release Date'])
            release_date = f"{raw_date[:4]}-01-01" if raw_date[:4].isdigit() else None
        except Exception as e:
            print(f"Failed to parse release date for track ID {track_info['Track ID']}: {str(e)}")
            # Default to None if parsing fails
            release_date = None

        json_data.append({
            'Track ID': track_info['Track ID'],
            'Track Name': track_info['Track Name'],
            'Artist(s)': track_info['Artist(s)'],
            'Album': track_info['Album'],
            'Release Date': release_date,
            'Popularity': track_info['Popularity'],
            'Duration (ms)': track_info['Duration (ms)'],
            'Explicit': track_info['Explicit'],
            'Artist ID(s)': track_info['Artist ID(s)'],
            'Track Number': track_info['Track Number'],
            'Album ID': track_info['Album ID'],
            'URI': track_info['URI'],
            'Scrapped Time': track_info['Scrapped Time']
        })

    # Save data to JSON file
    with open(output_json_file, 'w', encoding='utf-8') as json_file:
        json.dump(json_data, json_file, ensure_ascii=False, indent=4)

    print(f"Track information saved to {output_json_file}")

I need to know how to keep pulling data from the API without the credentials becoming unusable; by my calculations, I am not exceeding the rate limit that Spotify documents. Failing that, I need the API to return an error code so that I can handle the exception and switch between multiple credentials.

After one complete, successful run, I relaunch the script and at some point the API stops responding without returning any error. When I relaunch the script again with the same credentials, it does not even start: they have been unusable for at least 24 hours.

[Screenshot, 2024-04-09 11:15:58: https://github.com/spotipy-dev/spotipy/assets/166501690/63d2bd27-03a7-429c-b04f-074a46563ffe]

Environment:

  • macOS Sonoma 14.2.1
  • Python 3.11.2
  • spotipy (latest version)
  • PyCharm


kerem 2026-02-28 00:00:29 +03:00
  • closed this issue
  • added the bug label

@tedwenn commented on GitHub (Apr 21, 2024):

I'm having the same issue with sp.album_tracks(). It just stops without returning any error. When I try stepping into the function with the debugger, it never actually gets there. It just waits.


@dieser-niko commented on GitHub (Apr 22, 2024):

If the script just freezes out of the blue, then it is probably because of Spotify's rate limit.
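If the rate limit is the cause, one way to send fewer requests is to fetch tracks in batches rather than one call per ID. The following is a minimal sketch, assuming `sp` is a `spotipy.Spotify` client and relying on `sp.tracks()`, which accepts up to 50 IDs per call. The helper names `chunked` and `fetch_tracks_batched` are hypothetical and not part of spotipy.

```python
from typing import Iterator, List, Sequence

MAX_IDS_PER_CALL = 50  # limit of Spotify's "Get Several Tracks" endpoint


def chunked(items: Sequence[str], size: int = MAX_IDS_PER_CALL) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_tracks_batched(sp, track_ids: Sequence[str]) -> list:
    """Fetch track objects with one request per batch instead of one per ID.

    `sp` is assumed to be a spotipy.Spotify client. sp.tracks() returns a
    dict whose 'tracks' list may contain None for IDs that were not found,
    so those entries are skipped.
    """
    results = []
    for batch in chunked(track_ids):
        response = sp.tracks(batch)
        results.extend(t for t in response['tracks'] if t is not None)
    return results
```

For the roughly 1000 track IDs mentioned in the original post, this would mean 20 requests instead of 1000. Whether that is enough to stay under the limit is not something this thread confirms.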


@tedwenn commented on GitHub (Apr 24, 2024):

Yeah, it's a rate limit issue. When I try just calling the API directly using requests, I get a 429. Something in how spotipy wraps the API stalls instead of returning the 429.
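A likely explanation, assuming spotipy's default retry settings are in effect: the underlying urllib3 retry logic honours the `Retry-After` header of a 429 response and sleeps for that long before retrying, so a long penalty looks like a hang. The sketch below is not spotipy's code. It only illustrates the decision a caller could make when reading that header from a direct `requests` call; `retry_after_seconds`, `describe_429` and the `max_wait` threshold are hypothetical.

```python
from typing import Mapping, Optional


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Retry-After delay in seconds, or None if absent or not numeric."""
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get('retry-after')
    if value is None:
        return None
    try:
        return max(0, int(value.strip()))
    except ValueError:
        return None  # the HTTP-date form of the header is not handled in this sketch


def describe_429(headers: Mapping[str, str], max_wait: int = 60) -> str:
    """Decide what to do with a 429 response based on its Retry-After header."""
    delay = retry_after_seconds(headers)
    if delay is None:
        return "no usable Retry-After header; back off manually"
    if delay <= max_wait:
        return f"wait {delay}s and retry"
    return f"give up: server asked for {delay}s (about {delay / 3600:.1f} h)"
```

With an illustrative header value of `'86400'` (not a value reported in this thread), `describe_429` gives up and reports a wait of about 24.0 h, whereas a retry loop that blindly sleeps would appear frozen for that whole time.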


@dieser-niko commented on GitHub (Apr 24, 2024):

If you want to stop the freezing, you can do something similar to this: https://github.com/spotipy-dev/spotipy/issues/766#issuecomment-1005102900

Edit: I've gotta admit, I don't know how the script will behave with that change. My guess is that it's going to raise some kind of rate limit error.
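A rough sketch of how such an error could be handled, under the assumption that with retries disabled spotipy raises `spotipy.SpotifyException` exposing `http_status` and `headers` attributes (worth checking against the installed version). To stay runnable without network access, the helper accepts any callable and detects the 429 by duck typing; `RateLimited` and `call_or_raise_rate_limited` are hypothetical names.

```python
class RateLimited(Exception):
    """Raised when a call failed with HTTP 429; carries the suggested delay."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after!r} seconds")
        self.retry_after = retry_after


def call_or_raise_rate_limited(func, *args, **kwargs):
    """Run func and translate a 429-style exception into RateLimited.

    Any exception exposing http_status == 429 is treated as a rate limit;
    spotipy.SpotifyException is assumed to look like that. All other
    errors propagate unchanged.
    """
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        if getattr(exc, 'http_status', None) != 429:
            raise
        headers = getattr(exc, 'headers', None) or {}
        retry_after = next(
            (v for k, v in headers.items() if k.lower() == 'retry-after'), None
        )
        raise RateLimited(retry_after) from exc


# Hypothetical usage with spotipy (needs valid credentials, not run here):
#   sp = spotipy.Spotify(auth=access_token, retries=0, status_retries=0,
#                        requests_timeout=10)
#   track = call_or_raise_rate_limited(sp.track, track_id)
```

In the loop from the original post, the broad `except Exception` would swallow this error, so `RateLimited` would need its own `except` clause ahead of it to stop the run.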


@dieser-niko commented on GitHub (Jul 10, 2024):

I'm going to close this issue, as there's no activity.

Also just a heads up, we are going to release some changes that include a warning when a request/rate limit is reached. This might help you to optimize your application.
