mirror of
https://github.com/Googolplexed0/zotify.git
synced 2026-04-25 06:15:55 +03:00
[GH-ISSUE #53] [Bug Report] Incorrect Podcast File Suffix and Not Detecting Existing Files #44
Originally created by @Ragnaran on GitHub (Aug 18, 2025).
Original GitHub issue: https://github.com/Googolplexed0/zotify/issues/53
Originally assigned to: @Googolplexed0 on GitHub.
Downloading podcasts results in failures. The ffmpeg stream identification routine is appending `\n STREAM` to the resulting codec name, causing the EXT_MAP lookup to fail and the file not to be found.
This PR solves it: https://github.com/Googolplexed0/zotify/pull/52
Note that existing files are not being checked correctly, so the podcast is downloaded again every time, even if it already exists. This is a slightly different issue that I haven't had a chance to investigate yet.
(I appreciate the effort to actually identify the codec! I'd bet a steak dinner that all podcasts are encoded in vorbis, but who knows if that will ever change?)
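For readers following along, here is a minimal sketch of the kind of sanitizing that avoids the failed lookup described above. It is not the code from PR #52. The `EXT_MAP` contents, the helper names `normalize_codec` and `suffix_for`, and the `mp3` default are all assumptions made for illustration.

```python
# Hypothetical mapping; the real EXT_MAP in zotify may have different entries.
EXT_MAP = {"vorbis": "ogg", "mp3": "mp3", "aac": "m4a"}


def normalize_codec(raw: str) -> str:
    """Keep only the first whitespace-delimited token of the codec string."""
    tokens = raw.split()
    return tokens[0].lower() if tokens else ""


def suffix_for(raw_codec: str, fallback: str = "mp3") -> str:
    """Look up a file suffix for a codec name, ignoring trailing debris."""
    return EXT_MAP.get(normalize_codec(raw_codec), fallback)
```

With this sketch, `suffix_for("vorbis\n STREAM")` resolves through the `vorbis` entry instead of missing the map because of the trailing text.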
@Ragnaran commented on GitHub (Aug 18, 2025):
This is about as far as I got with a more effective refactor, but it's not quite working yet. I'll take another stab tomorrow:
@Googolplexed0 commented on GitHub (Aug 19, 2025):
Thanks, I try to make this as robust as possible. Surprisingly, there are many that are hosted externally and encoded in mp3. That is why `.mp3` is the fallback for suffixes.
@Googolplexed0 commented on GitHub (Aug 19, 2025):
Both issues should now be fixed. Thanks for the bug find and fix!
@Ragnaran commented on GitHub (Aug 20, 2025):
Sadly, duplicates are still getting ignored. The logic `episode_path_exists = Path(episode_path).is_file()` is using the `path + ".tmp"` file name plus the file size. A successfully downloaded file gets renamed to its final extension, so it won't have a `.tmp` extension, and the duplication check will never succeed. That's why I tried iterating over files that might have an extension in the EXT_MAP.
I'll see if I can fix it.
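The failure mode described above can be reproduced in isolation. This is only a sketch, assuming the check tests the `.tmp` name and that a finished download is renamed to its final suffix; `naive_exists` and `demo` are hypothetical names, not functions from zotify.

```python
import tempfile
from pathlib import Path


def naive_exists(base: str) -> bool:
    # Assumed shape of the check: only the ".tmp" name is tested.
    return Path(base + ".tmp").is_file()


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as d:
        base = str(Path(d) / "episode")
        tmp = Path(base + ".tmp")
        tmp.write_bytes(b"audio")
        during_download = naive_exists(base)
        # A finished download is renamed to its final suffix.
        tmp.rename(base + ".ogg")
        after_download = naive_exists(base)
        return during_download, after_download
```

Once the rename has happened, nothing is left at the `.tmp` path, so a check of this shape cannot see the completed file.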
@Googolplexed0 commented on GitHub (Aug 20, 2025):
My implemented fix (`b8fd011`) replaced the `.is_file()` with `.glob()`. If you are still seeing `.is_file()`, you may need to update to >= v0.9.23.
@Ragnaran commented on GitHub (Aug 20, 2025):
I nailed it down. I believe the `stream.input_stream.size` might be reported in 1024-byte increments, while on-disk file sizes might not reach that. The `Path(episode_path)` methods might also have been mixed up, resulting in checks against `Path(PurePath(episode_path))` invocations; I couldn't be 100% sure. I've added a check to allow the on-disk file size to be up to 1024 bytes smaller, but only for files that have a valid extension derived from the EXT_MAP; temp downloads are ignored.
The problem with `.glob()` (in my opinion) is that it wouldn't account for multiple matching files, or files that got copied/renamed.
Anyhoo, I whipped up PR https://github.com/Googolplexed0/zotify/pull/59 and tested it successfully.
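The approach described in this comment might look roughly like the following. This is a sketch under stated assumptions, not the code from PR #59: the `EXT_MAP` contents, the `SIZE_TOLERANCE` constant, and the `find_existing_episode` and `demo` names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical mapping; the real EXT_MAP in zotify may have different entries.
EXT_MAP = {"vorbis": "ogg", "mp3": "mp3", "aac": "m4a"}
SIZE_TOLERANCE = 1024  # bytes the on-disk file may fall short of the reported size


def find_existing_episode(base: Path, expected_size: int) -> Optional[Path]:
    """Return the first file with a known suffix whose size is close enough."""
    for ext in dict.fromkeys(EXT_MAP.values()):  # de-duplicated, order kept
        candidate = base.parent / f"{base.name}.{ext}"
        if not candidate.is_file():
            continue
        shortfall = expected_size - candidate.stat().st_size
        if 0 <= shortfall <= SIZE_TOLERANCE:
            return candidate
    return None  # temp files and unknown suffixes are never considered


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as d:
        base = Path(d) / "episode"
        Path(f"{base}.ogg").write_bytes(b"\0" * 5000)
        Path(f"{base}.tmp").write_bytes(b"\0" * 6000)
        close = find_existing_episode(base, 6000)  # 1000 bytes short: accepted
        far = find_existing_episode(base, 7000)    # 2000 bytes short: rejected
        return (close.name if close else None, far)
```

Only a shortfall is tolerated here; a file larger than the reported size is treated as a mismatch, which is one possible reading of "up to 1024 bytes smaller".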
@Googolplexed0 commented on GitHub (Aug 21, 2025):
This isn't an issue with `.glob()`? There is no way to implement duplicate detection based on a filename pattern that would account for files that have been renamed. This would require checking against something other than filenames by definition. Also not sure what you mean by multiple matching files. The only part that is wildcarded in the `.glob()` is the file suffix, which accounts for the error cases where the file suffix exists outside of the EXT_MAP.
I do like this idea overall though. Will implement something similar.
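For comparison, a suffix-wildcard lookup of the kind described in this comment could be sketched as follows. This is not the code from commit `b8fd011`; `find_any_suffix` and `demo` are hypothetical names, and escaping the file name with `glob.escape` is an added assumption so that characters such as `[` in episode titles are matched literally.

```python
import glob
import tempfile
from pathlib import Path


def find_any_suffix(episode_path: str) -> list:
    """List files that share the episode's name, whatever their suffix."""
    base = Path(episode_path)
    # Only the suffix is wildcarded; the name itself is matched literally.
    pattern = glob.escape(base.name) + ".*"
    return sorted(p.name for p in base.parent.glob(pattern) if p.is_file())


def demo() -> list:
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        (root / "Show [1] - Pilot.ogg").write_bytes(b"x")
        (root / "Show [1] - Pilot.weird").write_bytes(b"x")
        (root / "Other.ogg").write_bytes(b"x")
        return find_any_suffix(str(root / "Show [1] - Pilot"))
```

The lookup finds a file whose suffix is outside any known mapping, and it can also return more than one match, so the caller still has to decide which file counts as the existing download.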
@Ragnaran commented on GitHub (Aug 21, 2025):
Agreed. I figured "find the first possible file with a valid extension, check if the size is (close to) accurate, and if none exist, run the download." Anyhoo, thanks!