[PR #1647] [MERGED] Don't do a multipart upload with first part smaller than 5MB #2089

opened 2026-03-04 02:03:38 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/s3fs-fuse/s3fs-fuse/pull/1647
Author: @CarstenGrohmann
Created: 5/4/2021
Status: Merged
Merged: 5/27/2021
Merged by: @gaul

Base: master ← Head: noupload_on_space_shortage


📝 Commits (1)

  • d003832 Ensuring multipart size even when storage is low

📊 Changes

1 file changed (+4 additions, -0 deletions)


📝 src/fdcache_entity.cpp (+4 -0)

📄 Description

When temporary storage fills up, the old implementation starts a multipart upload with the data cached so far, even if the minimum part size has not been reached. Depending on the S3 implementation, this can cause errors.

There is no real solution for a shortage of temporary storage. This change implements two mitigations. Whether they help depends on the speed of the incoming data versus the speed of writing data to S3.

The new implementation handles two different scenarios:

  1. Minimum part size not reached: emit a warning and return -ENOSPC.

  2. Available temporary storage is between the minimum part size and the configured multipart size: permanently reduce the multipart size to the current size.

    This scenario may cause the multipart size to be repeatedly, incrementally, and permanently reduced down to the minimum size.

    There is no guarantee that this frees enough space fast enough. As shown in "Example Scenario 2", the multipart size is reduced but the copy fails nevertheless.

    Maybe this can be solved by an additional ftruncate.

    This condition could be merged with the first one, emitting a warning and returning -ENOSPC for all requests smaller than the multipart size.

What do you think about this change?

Starting s3fs

# ./s3fs mybucket /mnt -o url=http://mys3service,use_path_request_style,multipart_size=256,curldbg,dbglevel=debug -d -d -f
2021-05-04T09:12:18.519Z [INF] s3fs version 1.89(8c58ba8) : s3fs -o url=http://mys3service,use_path_request_style,multipart_size=256,curldbg,dbglevel=debug -d -d -f mybucket /mnt
2021-05-04T09:12:18.520Z [CRT] s3fs_logger.cpp:LowSetLogLevel(219): change debug level from [CRT] to [DBG]
2021-05-04T09:12:18.520Z [INF]     s3fs.cpp:set_mountpoint_attribute(4067): PROC(uid=0, gid=0) - MountPoint(uid=0, gid=0, mode=40700)
2021-05-04T09:12:19.522Z [DBG] curl.cpp:InitMimeType(408): Try to load mime types from /etc/mime.types file.
2021-05-04T09:12:19.522Z [DBG] curl.cpp:InitMimeType(413): The old mime types are cleared to load new mime types.
2021-05-04T09:12:19.523Z [INF] curl.cpp:InitMimeType(436): Loaded mime information from /etc/mime.types
2021-05-04T09:12:19.524Z [INF] fdcache_stat.cpp:CheckCacheFileStatTopDir(79): The path to cache top dir is empty, thus not need to check permission.
[...]

Example Scenario 1

# df -k /tmp/
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda3        3997376 3769464      1816 100% /tmp

# cp -p /local/bigfile /mnt/
cp: error writing "/mnt/bigfile": No space left on device
cp: failed to extend "/mnt/bigfile": No space left on device

New debug output w/ warning:

unique: 36, opcode: WRITE (16), nodeid: 2, insize: 65616, pid: 1601
write[5] 65536 bytes to 1835008 flags: 0x8001
2021-05-04T09:14:13.674Z [DBG] s3fs.cpp:s3fs_write(2323): [path=/bigfile][size=65536][offset=1835008][fd=5]
2021-05-04T09:14:13.674Z [DBG] fdcache.cpp:ExistOpen(529): [path=/bigfile][fd=5][ignore_existfd=false]
2021-05-04T09:14:13.674Z [DBG] fdcache.cpp:Open(446): [path=/bigfile][size=-1][time=-1]
2021-05-04T09:14:13.674Z [DBG] fdcache_entity.cpp:Dup(248): [path=/bigfile][fd=5][refcnt=2]
2021-05-04T09:14:13.674Z [DBG] fdcache_entity.cpp:Open(317): [path=/bigfile][fd=5][size=-1][time=-1]
2021-05-04T09:14:13.674Z [DBG] fdcache_entity.cpp:Dup(248): [path=/bigfile][fd=5][refcnt=3]
2021-05-04T09:14:13.674Z [DBG] fdcache_entity.cpp:Close(202): [path=/bigfile][fd=5][refcnt=2]
2021-05-04T09:14:13.674Z [DBG] fdcache_entity.cpp:Write(1443): [path=/bigfile][fd=5][offset=1835008][size=65536]
2021-05-04T09:14:13.674Z [WAN] fdcache_entity.cpp:Write(1493): Not enough local storage to cache write request till multipart upload can start: [path=/bigfile][fd=5][offset=1835008][size=65536]
2021-05-04T09:14:13.674Z [WAN] s3fs.cpp:s3fs_write(2335): failed to write file(/bigfile). result=-28
2021-05-04T09:14:13.674Z [DBG] fdcache.cpp:Close(596): [ent->file=/bigfile][ent->fd=5]
2021-05-04T09:14:13.674Z [DBG] fdcache_entity.cpp:Close(202): [path=/bigfile][fd=5][refcnt=1]
   unique: 36, error: -28 (No space left on device), outsize: 16

Example Scenario 2

# df -k /tmp/
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda3        3997376 3600504    170776  96% /tmp

# cp -p /local/bigfile /mnt/
cp: overwrite "/mnt/bigfile"? y
cp: error writing "/mnt/bigfile": No space left on device
cp: failed to extend "/mnt/bigfile": No space left on device

The multipart size is reduced from 256 MiB (268435456 bytes) to about 167 MiB (174850048 bytes):

unique: 2713, opcode: WRITE (16), nodeid: 2, insize: 65616, pid: 1729
write[5] 65536 bytes to 174850048 flags: 0x8001
2021-05-04T09:16:44.633Z [DBG] s3fs.cpp:s3fs_write(2323): [path=/bigfile][size=65536][offset=174850048][fd=5]
2021-05-04T09:16:44.633Z [DBG] fdcache.cpp:ExistOpen(529): [path=/bigfile][fd=5][ignore_existfd=false]
2021-05-04T09:16:44.633Z [DBG] fdcache.cpp:Open(446): [path=/bigfile][size=-1][time=-1]
2021-05-04T09:16:44.633Z [DBG] fdcache_entity.cpp:Dup(248): [path=/bigfile][fd=5][refcnt=2]
2021-05-04T09:16:44.633Z [DBG] fdcache_entity.cpp:Open(317): [path=/bigfile][fd=5][size=-1][time=-1]
2021-05-04T09:16:44.633Z [DBG] fdcache_entity.cpp:Dup(248): [path=/bigfile][fd=5][refcnt=3]
2021-05-04T09:16:44.633Z [DBG] fdcache_entity.cpp:Close(202): [path=/bigfile][fd=5][refcnt=2]
2021-05-04T09:16:44.633Z [DBG] fdcache_entity.cpp:Write(1443): [path=/bigfile][fd=5][offset=174850048][size=65536]
2021-05-04T09:16:44.633Z [WAN] fdcache_entity.cpp:Write(1498): Not enough local storage to fully cache write request, reduce multipart size permanently from 268435456 to 174850048 to start upload: [path=/bigfile][fd=5][offset=174850048][size=65536]
2021-05-04T09:16:44.633Z [INF]       curl.cpp:PreMultipartPostRequest(3468): [tpath=/bigfile]
2021-05-04T09:16:44.633Z [DBG] curl_handlerpool.cpp:GetHandler(81): Get handler from pool: rest = 30

and the write still fails a few seconds later:

unique: 8659, opcode: WRITE (16), nodeid: 2, insize: 61520, pid: 1729
write[5] 61440 bytes to 564465664 flags: 0x8001
2021-05-04T09:16:50.203Z [DBG] s3fs.cpp:s3fs_write(2323): [path=/bigfile][size=61440][offset=564465664][fd=5]
2021-05-04T09:16:50.203Z [DBG] fdcache.cpp:ExistOpen(529): [path=/bigfile][fd=5][ignore_existfd=false]
2021-05-04T09:16:50.203Z [DBG] fdcache.cpp:Open(446): [path=/bigfile][size=-1][time=-1]
2021-05-04T09:16:50.203Z [DBG] fdcache_entity.cpp:Dup(248): [path=/bigfile][fd=5][refcnt=2]
2021-05-04T09:16:50.203Z [DBG] fdcache_entity.cpp:Open(317): [path=/bigfile][fd=5][size=-1][time=-1]
2021-05-04T09:16:50.203Z [DBG] fdcache_entity.cpp:Dup(248): [path=/bigfile][fd=5][refcnt=3]
2021-05-04T09:16:50.203Z [DBG] fdcache_entity.cpp:Close(202): [path=/bigfile][fd=5][refcnt=2]
2021-05-04T09:16:50.203Z [DBG] fdcache_entity.cpp:Write(1443): [path=/bigfile][fd=5][offset=564465664][size=61440]
2021-05-04T09:16:50.203Z [ERR] fdcache_entity.cpp:Write(1519): pwrite failed. errno(28)
2021-05-04T09:16:50.203Z [WAN] s3fs.cpp:s3fs_write(2335): failed to write file(/bigfile). result=-28
2021-05-04T09:16:50.203Z [DBG] fdcache.cpp:Close(596): [ent->file=/bigfile][ent->fd=5]
2021-05-04T09:16:50.203Z [DBG] fdcache_entity.cpp:Close(202): [path=/bigfile][fd=5][refcnt=1]
   unique: 8659, error: -28 (No space left on device), outsize: 16

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
