mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-24 21:06:02 +03:00
[GH-ISSUE #2808] urlDecode() silently produces NUL bytes on malformed input #1304
Labels
No labels
bug
bug
dataloss
duplicate
enhancement
feature request
help wanted
invalid
need info
performance
pull-request
question
question
testing
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/s3fs-fuse#1304
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @CarstenGrohmann on GitHub (Feb 23, 2026).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2808
urlDecode()converts invalid%XXsequences (e.g.%GG) into NUL bytes instead of rejecting or preserving them. A truncated%at end of string is silently dropped. Malformed input can originate from buggy S3 clients, tampered object metadata, or an untrusted S3 backend.Root Cause:
The hex-decoding ternary in
src/string_util.cpp:206,211falls through to0x00for non-hex characters. A truncated%at end of string causes a silentbreak, dropping the incomplete sequence.github.com/s3fs-fuse/s3fs-fuse@58b3dd6b16/src/string_util.cpp (L195-L216)Reproduction:
Proposed Fix:
Two options — feedback welcome:
curl_easy_unescape()— libcurl is already a required dependency and passes invalid sequences through literally. Needs a CURL* handle from the existing.urlDecode()in place — add proper error signaling (bool return) and pass invalid%XXthrough literally. No handle needed, but keeps custom code to maintain.Which approach would you prefer?
@gaul commented on GitHub (Feb 25, 2026):
This is a general issue of early s3fs code not having an error path. Signalling an error via a flag is probably the best we can do given the C++14 dependency.
I asked this is another PR but is this issue LLM-generated? I am asking due to the style and low value of the issue. While it is not wrong, these nits are not a top priority given the limited reviewer capacity. So if you are machine-generating these please use your best human judgement.
@CarstenGrohmann commented on GitHub (Feb 25, 2026):
Sorry for flooding you with nits. This issue and the remaining open ones all came from an AI-assisted audit with a broad prompt. They are in the same category — no harm in a trusted environment, but worth fixing for less trusted setups like third-party S3 backends.
How about I space out my issues and PRs, improve their focus, and keep it to roughly once a month?