[GH-ISSUE #2805] Set XML_PARSE_NONET on xmlReadMemory calls #1303

Open
opened 2026-03-04 01:52:58 +03:00 by kerem · 1 comment
Owner

Originally created by @CarstenGrohmann on GitHub (Feb 22, 2026).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2805

All xmlReadMemory() calls in the codebase pass options = 0:

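https://github.com/s3fs-fuse/s3fs-fuse/blob/b41e43df8a548b1d88f408b374959c39efcd4ca0/src/s3fs_xml.cpp#L467
https://github.com/s3fs-fuse/s3fs-fuse/blob/b41e43df8a548b1d88f408b374959c39efcd4ca0/src/s3fs.cpp#L3673
https://github.com/s3fs-fuse/s3fs-fuse/blob/b41e43df8a548b1d88f408b374959c39efcd4ca0/src/mpu_util.cpp#L115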

S3 responses never contain DTDs or entity references. Setting XML_PARSE_NONET (available since libxml2 2.6) would explicitly prevent the parser from making network requests during entity resolution. On libxml2 >= 2.13, XML_PARSE_NO_XXE can additionally disable all external entity and DTD loading.

Currently, the code relies on libxml2 >= 2.9 defaulting to safe behavior but does not state that intent explicitly. Adding the flag at each call site, a one-line change per call, would fix this.

Without the flags, an attacker who can manipulate S3 responses (for example via MITM or a rogue S3-compatible endpoint) could potentially:

  • read local files via XXE (libxml2 < 2.9)
  • trigger SSRF, e.g. to EC2 IMDS for credential theft (libxml2 < 2.9)
  • cause OOM via entity expansion / billion laughs (libxml2 < 2.11)
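The version thresholds in the list above can be written down as a small lookup. This is only an illustrative sketch: `risks_with_default_options` is a hypothetical helper, the thresholds are the ones stated in this issue rather than independently verified, and the integer encoding follows libxml2's `LIBXML_VERSION` macro (major * 10000 + minor * 100 + micro).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the risks named in this issue that would
// apply to a given libxml2 version when parsing with options = 0.
// libxml_version uses the LIBXML_VERSION encoding, e.g. 20914 for 2.9.14.
std::vector<std::string> risks_with_default_options(int libxml_version)
{
    std::vector<std::string> risks;
    if (libxml_version < 20900) {   // older than 2.9
        risks.push_back("local file read via XXE");
        risks.push_back("SSRF via external entity fetch");
    }
    if (libxml_version < 21100) {   // older than 2.11
        risks.push_back("OOM via entity expansion");
    }
    return risks;
}
```

For example, `risks_with_default_options(LIBXML_VERSION)` would report which of the listed risks apply to the libxml2 headers a build was compiled against.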

I'd be happy to submit a PR.

Author
Owner

@ggtakec commented on GitHub (Feb 22, 2026):

@CarstenGrohmann Thank you.
It seems that responses from AWS S3 itself do not reference external entities, but this certainly needs consideration for S3-compatible storage and routing.
If you are able to fix this, could you please submit a PR?
Thanks in advance for your great help.
