mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 05:16:00 +03:00
[GH-ISSUE #2600] Refactoring about Thread / Curl Handle #1245
Labels
No labels
bug
bug
dataloss
duplicate
enhancement
feature request
help wanted
invalid
need info
performance
pull-request
question
question
testing
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/s3fs-fuse#1245
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @ggtakec on GitHub (Nov 10, 2024).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2600
Originally assigned to: @ggtakec on GitHub.
Background
Currently, s3fs-fuse has complicated and cumbersome code related to threads and curl handles due to years of modifications and feature additions.
This is not the only cause, but a data race occurs in
Curl+OpenSSL(#1997 #2211 #2209 #2220), and in order to fix this, the composition of threads and curl handles needs to be changed.I believe that we can reduce the number of objects that are deployed in unnecessary dynamic memory, and reduce the complexity of managing them.
Current problems / issues
Thread management is complicated
When processing a request, there are three logic steps(included using and not using threads):
These things could be unified.
Currently it's getting more complicated and affecting maintenance.
Also, s3fs options related to this should be unified too.
Curl handles and OpenSSL Data Rase(#1997 #2211 #2209 #2220)
I have identified the cause of this Data Race.
It is a problem with the Curl Share handle settings/using and is related to
CURL_LOCK_DATA_SSL_SESSION.This
CURL_LOCK_DATA_SSL_SESSIONprohibits the Curl Handle from being used in another thread, as shown below:(It is not supported to share SSL sessions between multiple concurrent threads)[https://curl.se/libcurl/c/CURLSHOPT_SHARE.html]
s3fs pools Curl handles, but the threads that use these Curl handles are not pinned.
Therefore, the conditions for
CURL_LOCK_DATA_SSL_SESSIONare not met, soCURL_LOCK_DATA_SSL_SESSIONcannot be enabled as it should be.In the current code, it is enabled, which causes a data race.
Composition modifications to resolve
To solve this problem, we should change all request processing to run in a sub thread.
This will consolidate request processing and improve maintainability.
Also, each request can be processed synchronously or asynchronously.
We should integrate thread management into the
ThreadPoolManclass.This will allow all request processing to be done in parallel using a thread pool.
And it will also integrate the parallel concurrency options of s3fs.
A thread worker should hold one S3fsCurl, and the S3fsCurl should hold one Curl Handle.
This ensures that only one thread uses the Curl Handle.
Because the thread using the Curl Handle is pinned, you can enable a Curl Share Handle and
CURL_LOCK_DATA_SSL_SESSIONwithout a Data Race.About Pull Requesta
To address this issue, I will submit several PRs in succession.
These are essentially one modification, but I'm splitting them up to make the PRs easier to understand.
Note that these PRs will need to be merged consecutively, since it's actually a single modification code.
I will link all PRs to the reply to this issue.
@gaul
Please help me to move forward with the PR.
@ggtakec commented on GitHub (Nov 10, 2024):
@gaul
If you want to check the final code, please check the fix in #2608.
(However, I think it's better to check them in order...)
@ggtakec commented on GitHub (Nov 28, 2024):
All PRs have been merged so I will close this issue.
@gaul Thanks.