[GH-ISSUE #1541] readdir parallel GET and HEAD #810

Open
opened 2026-03-04 01:48:58 +03:00 by kerem · 2 comments

Originally created by @gaul on GitHub (Jan 31, 2021).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1541

`s3fs_readdir` calls `list_bucket`, which issues `ListBucket` requests serially, one per 1,000 objects. After the listing completes, it calls `readdir_multi_head`, which issues `HeadObject` requests in parallel, one per object. Instead, s3fs could overlap the `ListBucketRequest` and `PreHeadRequest` calls to improve performance for large directories.

Also, I wonder whether `filler` could be called while requests are still completing, so that the application can make some progress. This might require a different locking mechanism.
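
The overlap proposed above could look roughly like the following. This is only a minimal sketch under stated assumptions, not s3fs code: `ListPage`, `HeadResult`, `ListFn`, `HeadFn`, and `overlapped_readdir` are hypothetical names, and the listing and HEAD calls are passed in as plain callables so the control flow can be shown without any HTTP layer.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; these are not s3fs's actual interfaces.
struct ListPage {
    std::vector<std::string> keys;  // keys returned by one listing response
    std::string next_marker;        // empty when the listing is complete
};

struct HeadResult {
    std::string key;
    long size;
};

using ListFn = std::function<ListPage(const std::string& marker)>;
using HeadFn = std::function<HeadResult(const std::string& key)>;

// Starts the HEAD requests for a page as soon as that page arrives, instead of
// waiting for the whole listing to finish first.
// Assumption: one task per object keeps the sketch short; a real
// implementation would bound the number of requests in flight.
std::vector<HeadResult> overlapped_readdir(const ListFn& list_page,
                                           const HeadFn& head_object)
{
    std::vector<std::future<HeadResult>> pending;
    std::string marker;
    do {
        ListPage page = list_page(marker);  // the listing itself stays serial
        for (const std::string& key : page.keys) {
            // These HEADs run while the next page is being listed.
            pending.push_back(
                std::async(std::launch::async, head_object, key));
        }
        marker = page.next_marker;
    } while (!marker.empty());

    std::vector<HeadResult> results;
    results.reserve(pending.size());
    for (auto& f : pending) {
        results.push_back(f.get());  // collected in listing order
    }
    return results;
}
```

In this shape, the final loop is where a per-entry callback such as `filler` might be invoked as each result becomes available, although whether that is safe depends on the locking question raised above.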


@ggtakec commented on GitHub (Feb 12, 2021):

@gaul I posted a PR for this issue.
However, it does not seem to change performance much.
There may be other improvements beyond this change, so if the PR is merged, I will try to address them.


@ggtakec commented on GitHub (Feb 13, 2021):

@gaul I also added a PR for #1569.
This PR sets EPERM instead of EIO for HTTP 400 errors, and also registers EPERM objects in the stats cache.
This avoids resending a HEAD request that fails every time, which should improve performance.
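
The caching idea described here can be illustrated with a small sketch. It is not the code from the PR: `NegativeStatCache` and its `HeadFn` callback are hypothetical, the real stats cache in s3fs is structured differently, and expiry of cached entries is left out for brevity.

```cpp
#include <cerrno>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical cache for illustration; s3fs's actual stats cache differs.
class NegativeStatCache {
public:
    // Assumption: head_object returns 0 on success or a negative errno
    // value such as -EPERM on failure.
    using HeadFn = std::function<int(const std::string& key)>;

    explicit NegativeStatCache(HeadFn head_object)
        : head_(std::move(head_object)) {}

    int stat(const std::string& key)
    {
        if (denied_.count(key) != 0) {
            return -EPERM;  // answered from the cache, no request is sent
        }
        int rc = head_(key);
        if (rc == -EPERM) {
            denied_.insert(key);  // remember the failure for later lookups
        }
        return rc;
    }

private:
    HeadFn head_;
    std::unordered_set<std::string> denied_;  // keys that returned EPERM
};
```

With a cache like this, a second `stat` on a key that already failed with EPERM returns immediately instead of issuing another HEAD request.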
