[PR #2136] [MERGED] Fixed a bug in handling file names containing CR(0x0D) #2351

Closed
opened 2026-03-04 02:05:05 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/s3fs-fuse/s3fs-fuse/pull/2136
Author: @ggtakec
Created: 3/21/2023
Status: Merged
Merged: 3/26/2023
Merged by: @ggtakec

Base: master ← Head: fix_name_with_cr


📝 Commits (1)

  • 59c3b26 Fixed a bug in handling file names containing CR(0x1D)

📊 Changes

9 files changed (+247 additions, -5 deletions)


📝 .gitignore (+1 -0)
📝 src/s3fs.cpp (+8 -1)
📝 src/s3fs_xml.cpp (+10 -3)
📝 src/string_util.cpp (+83 -0)
📝 src/string_util.h (+6 -0)
📝 src/test_string_util.cpp (+50 -0)
📝 test/Makefile.am (+3 -1)
➕ test/cr_filename.c (+76 -0)
📝 test/integration-test-main.sh (+10 -0)

📄 Description

Relevant Issue (if applicable)

#2067

Details

libxml2 converts the CR character ('\r' = 0x0D) to LF ('\n' = 0x0A), as the XML specification requires.

As a result, when an object name (file name) contains CR, ListBucket processing sees a name in which the CR has been converted to LF.
Because of this, the user could not get the file list.
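
The conversion described above can be modelled with a small sketch. It is not libxml2 code. It only mimics the end-of-line rule from XML 1.0 section 2.11, and the function name normalize_line_ends is made up for illustration.

```cpp
#include <string>

// Sketch of XML 1.0 end-of-line handling (section 2.11): "\r\n" and any
// lone "\r" in the raw input become a single "\n" before parsing.
// Hypothetical helper, not part of libxml2 or s3fs.
std::string normalize_line_ends(const std::string& raw)
{
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r') {
            out += '\n';
            // Swallow the LF of a CR LF pair so it yields only one LF.
            if (i + 1 < raw.size() && raw[i + 1] == '\n') {
                ++i;
            }
        } else {
            out += raw[i];
        }
    }
    return out;
}
```

Feeding it a key element such as "<Key>a\rb</Key>" yields "<Key>a\nb</Key>", which is the altered name that the listing code then tries to use.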

The fix escapes each CR as %0D in the ListBucket result before libxml2 parses the XML, and restores it after parsing.
The escaping uses the same technique as URL encoding.
Within XML content, HTML encoding (&#13;) would normally be appropriate, but in this case escaped and unescaped strings are indistinguishable, so it could not be used.
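
A minimal sketch of the escape-and-restore idea follows. The names escape_cr and unescape_cr are hypothetical and are not taken from the PR. Escaping '%' itself as %25 is an assumption added here so that a name that already contains the literal text "%0D" survives the round trip; the description does not say how the PR handles that case.

```cpp
#include <string>

// Hypothetical helper: replace CR with "%0D" before XML parsing.
// '%' is also escaped (as "%25", an assumption of this sketch) so the
// mapping can be reversed without ambiguity.
std::string escape_cr(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        if (c == '\r') {
            out += "%0D";
        } else if (c == '%') {
            out += "%25";
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: reverse escape_cr() on a string taken from the
// parsed XML. Any other text, including other '%' sequences, is copied as-is.
std::string unescape_cr(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in.compare(i, 3, "%0D") == 0) {
            out += '\r';
            i += 2;  // skip the two hex digits
        } else if (in.compare(i, 3, "%25") == 0) {
            out += '%';
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}
```

In this sketch, escape_cr would be applied to the raw response body before it is handed to the XML parser, and unescape_cr to each object name extracted afterwards.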

This fix adds a unit test (in test_string_util.cpp) for the string manipulation function (test_cr_encoding).
It also adds cr_filename.c for testing file names that contain CR. (It is written in C because creating a file name containing CR from the shell script was difficult.)
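
For illustration, creating such a file from compiled code can be sketched as below. This is not the content of cr_filename.c. The helper names and the file name are hypothetical, and the sketch assumes a POSIX file system that accepts the byte 0x0D in file names.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: create a small file whose name contains a raw CR
// byte inside `dir`. Returns the full path, or an empty string on failure.
std::string create_cr_file(const std::string& dir)
{
    // "\r" is the single byte 0x0D embedded directly in the name.
    const std::string path = dir + "/cr_\r_name.txt";
    std::ofstream file(path.c_str(), std::ios::out | std::ios::trunc);
    if (!file.is_open()) {
        return std::string();
    }
    file << "test\n";
    return path;  // the stream is closed when `file` goes out of scope
}

// Hypothetical helper: report whether `path` can be opened for reading.
bool file_exists(const std::string& path)
{
    std::ifstream file(path.c_str());
    return file.is_open();
}
```

A test could call create_cr_file on a directory under the mount point, list that directory, and then delete the file with std::remove.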


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
