mirror of
https://github.com/AJaySi/ALwrity.git
synced 2026-04-25 00:45:54 +03:00
Gemini AI Audio Transcription python Module
ي edited this page 2025-01-28 19:24:03 +05:30
Overview
The gemini_audio_text.py module transcribes audio files using a Google Gemini model (the code below requests gemini-1.5-flash). It loads environment variables, configures the Google API client, and sends the audio file to the model for transcription.
Functions
1. load_environment()
Description: Loads environment variables from a .env file.
    def load_environment():
        load_dotenv()
        logger.info("Environment variables loaded successfully.")
2. configure_google_api()
Description: Configures the Google Gemini API for audio transcription.
Raises: ValueError if the GEMINI_API_KEY environment variable is not set.
    def configure_google_api():
        api_key = os.getenv("GEMINI_API_KEY")
        if not api_key:
            error_message = "Google API key not found. Please set the GEMINI_API_KEY environment variable."
            logger.error(error_message)
            raise ValueError(error_message)
        genai.configure(api_key=api_key)
        logger.info("Google Gemini API configured successfully.")
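The fail-fast check that configure_google_api() performs can be isolated into a small helper. The sketch below is illustrative only: require_env and the DEMO_API_KEY variable are hypothetical names that do not appear in the module, and it uses only the standard library so that it runs without a real API key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise ValueError."""
    value = os.getenv(name)
    if not value:
        # Covers both "unset" and "set to an empty string".
        raise ValueError(f"{name} is not set. Add it to your .env file or shell environment.")
    return value


# Demonstration with a placeholder variable; never hard-code a real key.
os.environ["DEMO_API_KEY"] = "placeholder-key"
print(require_env("DEMO_API_KEY"))  # prints "placeholder-key"

# Remove the variable again to show the failure path.
os.environ.pop("DEMO_API_KEY")
try:
    require_env("DEMO_API_KEY")
except ValueError as exc:
    print(f"Configuration error: {exc}")
```

Raising at configuration time, as the module does, surfaces a missing key before any audio is uploaded.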
3. transcribe_audio(audio_file_path)
Description: Transcribes an audio file using a Google Gemini model (gemini-1.5-flash in the code below).
Args: audio_file_path (str): The path to the audio file to be transcribed.
Returns: str: The transcribed text from the audio. Returns None if transcription fails.
Raises: FileNotFoundError if the audio file is not found.
    def transcribe_audio(audio_file_path):
        try:
            load_environment()
            configure_google_api()
            logger.info(f"Attempting to transcribe audio file: {audio_file_path}")
            if not os.path.exists(audio_file_path):
                error_message = f"FileNotFoundError: The audio file at {audio_file_path} does not exist."
                logger.error(error_message)
                raise FileNotFoundError(error_message)
            model = genai.GenerativeModel(model_name="gemini-1.5-flash")
            try:
                audio_file = genai.upload_file(audio_file_path)
                logger.info(f"Audio file uploaded successfully: {audio_file=}")
            except FileNotFoundError:
                error_message = f"FileNotFoundError: The audio file at {audio_file_path} does not exist."
                logger.error(error_message)
                raise FileNotFoundError(error_message)
            except Exception as e:
                logger.error(f"Error uploading audio file: {e}")
                return None
            try:
                response = model.generate_content([
                    "Transcribe the following audio:",
                    audio_file
                ])
                if response and hasattr(response, 'text'):
                    transcript = response.text
                    logger.info(f"Transcription successful:\n{transcript}")
                    return transcript
                else:
                    logger.warning("Transcription failed: Invalid or empty response from API.")
                    return None
            except Exception as e:
                logger.error(f"Error during transcription: {e}")
                return None
        except FileNotFoundError:
            # Re-raise so callers receive the documented FileNotFoundError
            # instead of having it swallowed by the generic handler below.
            raise
        except Exception as e:
            logger.error(f"An unexpected error occurred: {e}")
            return None
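Because transcribe_audio is documented to return None on failure and to raise FileNotFoundError for a missing file, a caller has to handle both outcomes. The following is a minimal sketch of one way to do that across several files. transcribe_many and fake_transcribe are hypothetical names; fake_transcribe is a stand-in so the example runs without network access, and in real use one would pass transcribe_audio instead.

```python
from typing import Callable, Optional


def transcribe_many(
    paths: list[str],
    transcribe: Callable[[str], Optional[str]],
) -> tuple[dict[str, str], list[str]]:
    """Run `transcribe` over several files, separating successes from failures."""
    transcripts: dict[str, str] = {}
    failed: list[str] = []
    for path in paths:
        try:
            text = transcribe(path)
        except FileNotFoundError:
            # Missing file: record it and continue with the remaining paths.
            failed.append(path)
            continue
        if text is None:
            # None signals an upload or API failure that was already logged.
            failed.append(path)
        else:
            transcripts[path] = text
    return transcripts, failed


def fake_transcribe(path: str) -> Optional[str]:
    """Stand-in for transcribe_audio (assumption: same contract, no API call)."""
    if path == "missing.wav":
        raise FileNotFoundError(path)
    if path == "corrupt.wav":
        return None
    return f"transcript of {path}"


done, failed = transcribe_many(["a.wav", "missing.wav", "corrupt.wav"], fake_transcribe)
print(done)    # {'a.wav': 'transcript of a.wav'}
print(failed)  # ['missing.wav', 'corrupt.wav']
```

Passing the transcriber in as an argument keeps the batching logic testable without an API key.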
Usage
- Ensure you have a .env file that defines the following environment variable:
    GEMINI_API_KEY: Your Google API key.
- Call the transcribe_audio function with the path to your audio file:
    transcript = transcribe_audio("path/to/your/audio/file.wav")
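A common next step after the call above is to store the transcript, while guarding against the None that transcribe_audio returns on failure. The sketch below shows one possible approach and is not part of the module: save_transcript is a hypothetical helper, and the transcript text in the demonstration is made up so that the example runs without calling the API.

```python
import tempfile
from pathlib import Path
from typing import Optional


def save_transcript(transcript: Optional[str], audio_path: str) -> Optional[Path]:
    """Write the transcript next to the audio file as .txt.

    Returns the output path, or None when there is nothing to save.
    """
    if not transcript:
        return None
    out_path = Path(audio_path).with_suffix(".txt")
    out_path.write_text(transcript, encoding="utf-8")
    return out_path


# Demonstration in a temporary directory with a made-up transcript.
with tempfile.TemporaryDirectory() as tmp:
    audio = str(Path(tmp) / "interview.wav")
    saved = save_transcript("Hello and welcome.", audio)
    print(saved.name)                    # interview.txt
    print(save_transcript(None, audio))  # None
```

In real use, the first argument would be the value returned by transcribe_audio.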
Dependencies
- os
- sys
- google.generativeai
- dotenv
- loguru
Logging
The module uses the loguru library for logging to the console with colorized and formatted messages.
