[PR #2056] [CLOSED] Use cache to improve readdir performance #2319

Closed
opened 2026-03-04 02:04:55 +03:00 by kerem · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/s3fs-fuse/s3fs-fuse/pull/2056
Author: @huntersman
Created: 11/15/2022
Status: Closed

Base: master ← Head: cache


📝 Commits (1)

  • a0ecab1 use cache to improve readdir performance

📊 Changes

2 files changed (+122 additions, -19 deletions)

View changed files

📝 configure.ac (+1 -1)
📝 src/s3fs.cpp (+121 -18)

📄 Description

Relevant Issue (if applicable)

#2051

Details

When a directory contains many files, s3fs_readdir takes a long time because list_bucket and readdir_multi_head are time-consuming. One possible solution is to maintain a map from each path to the files under that path, together with their cached stat information. When we readdir a given path, we can then serve the data from the map instead of sending requests to S3.

However, there are two defects. First, files must only be modified through s3fs: any modification made directly on the S3 server will not be reflected in the map, so in effect objects are read-only on S3. Second, an in-memory map is not the best choice, because the data disappears when the server reboots; perhaps a database could be used to persist it.
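The structure described above can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual patch in src/s3fs.cpp; the class and method names (ReaddirCache, Insert, Lookup, InvalidateDir) are hypothetical.

```cpp
// Sketch of a readdir cache: a process-wide map from directory path to the
// stat information of the entries under it. Illustrative only; names and
// structure are assumptions, not the real s3fs-fuse implementation.
#include <map>
#include <mutex>
#include <string>
#include <sys/stat.h>

class ReaddirCache {
public:
    // Record one entry (file or subdirectory) under a directory path,
    // e.g. as results come back from list_bucket / readdir_multi_head.
    void Insert(const std::string& dir, const std::string& name,
                const struct stat& st) {
        std::lock_guard<std::mutex> lock(mtx);
        dirs[dir][name] = st;
    }

    // Serve a stat lookup from the cache. Returns false on a miss, in which
    // case the caller would fall back to querying S3.
    bool Lookup(const std::string& dir, const std::string& name,
                struct stat& out) const {
        std::lock_guard<std::mutex> lock(mtx);
        auto d = dirs.find(dir);
        if (d == dirs.end()) return false;
        auto e = d->second.find(name);
        if (e == d->second.end()) return false;
        out = e->second;
        return true;
    }

    // Any write made through s3fs must invalidate the parent directory.
    // Changes made directly on the S3 server are never observed, which is
    // the "files are read-only on S3" caveat described above.
    void InvalidateDir(const std::string& dir) {
        std::lock_guard<std::mutex> lock(mtx);
        dirs.erase(dir);
    }

private:
    mutable std::mutex mtx;
    std::map<std::string, std::map<std::string, struct stat>> dirs;
};
```

Because the map lives only in process memory, it also illustrates the second defect: a restart empties the cache, which is why the description suggests a database for persistence.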


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
