mirror of
https://github.com/shadps4-emu/shadPS4.git
synced 2026-04-26 00:05:58 +03:00
[PR #118] [MERGED] core: Rewrite thread local storage implementation #1307
📋 Pull Request Information
Original PR: https://github.com/shadps4-emu/shadPS4/pull/118
Author: @raphaelthegreat
Created: 4/30/2024
Status: ✅ Merged
Merged: 5/1/2024
Merged by: @raphaelthegreat
Base: main ← Head: main
📝 Commits (1)
495c002 core: Rewrite thread local storage implementation
📊 Changes
11 files changed (+175 additions, -188 deletions)
📝 .gitmodules (+3 -0)
📝 CMakeLists.txt (+34 -32)
📝 externals/CMakeLists.txt (+4 -1)
➕ externals/xbyak (+1 -0)
📝 src/common/logging/backend.cpp (+11 -21)
📝 src/core/linker.cpp (+19 -9)
📝 src/core/tls.cpp (+80 -112)
📝 src/core/tls.h (+8 -4)
📝 src/main.cpp (+0 -1)
📝 src/video_core/texture_cache/texture_cache.cpp (+9 -6)
📝 src/video_core/texture_cache/texture_cache.h (+6 -2)
📄 Description
It's not uncommon for PS4 guest applications to launch and use many threads, which makes handling thread local storage properly a necessity. On x86-64, thread local accesses are performed by loading a pointer through the FS segment register.
This is a problem, as Windows doesn't allow you to change the value of this register to what the guest expects. (Not quite true, see first reply.) On master this is handled with a simple exception handler that patches the value of the destination register with a pointer to a thread_local buffer. This works fine but will be a problem later on. The performance impact is pretty large, since every access goes through the handler. In addition, the new texture cache that does fault tracking also needs a custom exception handler, so the two end up conflicting. Also, guest apps can use negative offsets when accessing the buffer, so the current implementation would trigger UB in these cases.
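To illustrate the negative-offset point: in the x86-64 ELF TLS layout (variant II) a thread's static TLS data sits directly below the thread pointer, so guest code reaches it with negative offsets from FS. The sketch below shows one way to lay out a buffer so those accesses stay in bounds. It is only an illustration under that assumption; `Tcb`, `TlsBlock` and `MakeTlsBlock` are hypothetical names and are not taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>
#include <vector>

// Hypothetical thread control block. The first field points to itself so a
// load from fs:[0] yields the thread pointer.
struct Tcb {
    Tcb* self;
};

// Hypothetical owner of one thread's TLS memory, laid out as [ TLS image | Tcb ].
struct TlsBlock {
    std::unique_ptr<std::uint8_t[]> storage;
    std::size_t image_size = 0; // padded size of the region below the Tcb
    Tcb* tcb = nullptr;         // the value the guest expects behind FS
};

TlsBlock MakeTlsBlock(const std::vector<std::uint8_t>& image) {
    constexpr std::size_t align = alignof(std::max_align_t);
    TlsBlock block;
    // Pad the image so the Tcb that follows it is suitably aligned.
    block.image_size = (image.size() + align - 1) / align * align;
    // make_unique value-initializes the array, so padding bytes are zero.
    block.storage = std::make_unique<std::uint8_t[]>(block.image_size + sizeof(Tcb));
    if (!image.empty()) {
        std::memcpy(block.storage.get(), image.data(), image.size());
    }
    std::uint8_t* tcb_addr = block.storage.get() + block.image_size;
    block.tcb = new (tcb_addr) Tcb{};
    block.tcb->self = block.tcb;
    assert(reinterpret_cast<std::uint8_t*>(block.tcb) - block.image_size == block.storage.get());
    return block;
}
```

With this layout, byte `k` of the image lives at `thread_pointer - image_size + k`, which is a valid address inside `storage` rather than an out-of-bounds read in front of the buffer.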
This PR attempts to fix all of the above by using assembly trampolines instead of the exception handler. For storing the TLS image pointer, a new TLS slot is allocated from the parent process, and the logic from Wine's TlsGetValue is used to retrieve the value. This means we also don't have to rely on undefined/unused space in the TEB structure to store our data. Each mov instruction that reads from the FS segment is patched with a jump to a trampoline that loads the actual pointer.
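The patching step can be sketched roughly as below. This is a simplified illustration, not the code from this PR: it assumes the guest load is encoded as `mov rax, fs:[0]` (bytes `64 48 8B 04 25 00 00 00 00`), uses a single hypothetical trampoline address, and assumes a little-endian host. `PatchTlsLoads` and `kTlsLoad` are made-up names. A real implementation also needs a way to resume after the patched bytes (for example one trampoline per site) and has to emit the trampoline code itself, which is presumably what the new xbyak submodule is for.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Assumed encoding of `mov rax, fs:[0]`:
// FS prefix, REX.W, MOV r64 <- r/m64, ModRM, SIB, disp32 = 0.
constexpr std::array<std::uint8_t, 9> kTlsLoad = {0x64, 0x48, 0x8B, 0x04, 0x25,
                                                  0x00, 0x00, 0x00, 0x00};

// Rewrites every occurrence of kTlsLoad in `code` into `jmp rel32` followed by
// NOP padding. `code_base` is the address the buffer is mapped at and
// `trampoline` is the jump target. Returns the offsets that were patched.
std::vector<std::size_t> PatchTlsLoads(std::vector<std::uint8_t>& code,
                                       std::uint64_t code_base,
                                       std::uint64_t trampoline) {
    std::vector<std::size_t> patched;
    std::size_t i = 0;
    while (i + kTlsLoad.size() <= code.size()) {
        if (std::memcmp(code.data() + i, kTlsLoad.data(), kTlsLoad.size()) != 0) {
            ++i;
            continue;
        }
        // rel32 is measured from the end of the 5-byte jmp instruction.
        const std::int64_t rel = static_cast<std::int64_t>(trampoline) -
                                 static_cast<std::int64_t>(code_base + i + 5);
        assert(rel >= std::numeric_limits<std::int32_t>::min() &&
               rel <= std::numeric_limits<std::int32_t>::max());
        const auto rel32 = static_cast<std::int32_t>(rel);
        code[i] = 0xE9; // jmp rel32
        std::memcpy(code.data() + i + 1, &rel32, sizeof(rel32));
        // Fill the remaining 4 bytes of the original instruction with NOPs.
        std::memset(code.data() + i + 5, 0x90, kTlsLoad.size() - 5);
        patched.push_back(i);
        i += kTlsLoad.size();
    }
    return patched;
}
```

The 9-byte load leaves enough room for a 5-byte relative jump, which is why the instruction can be replaced in place. Scanning raw bytes like this can also match data that merely looks like the instruction, so treat it as a sketch of the idea only.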
While at it, also fixed a problem with fault tracking that caused crashes in the pngdec demo. The tracking was being performed at the texture cache page size, when it should be done on 4 KB boundaries like the host/guest. Also bumped the cache page size to vastly reduce the number of page table accesses.
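As a rough illustration of tracking on 4 KB boundaries, the helper below expands an arbitrary byte range to the pages that cover it. `kTrackingPageSize`, `PageRange` and `TrackedRange` are hypothetical names used for this sketch, not identifiers from the texture cache.

```cpp
#include <cassert>
#include <cstdint>

// 4 KB tracking granularity, matching the host/guest page size named above.
constexpr std::uint64_t kTrackingPageSize = 4096;

// Half-open, page-aligned address range [begin, end).
struct PageRange {
    std::uint64_t begin;
    std::uint64_t end;
};

// Expands [addr, addr + size) to the smallest page-aligned range covering it.
PageRange TrackedRange(std::uint64_t addr, std::uint64_t size) {
    static_assert((kTrackingPageSize & (kTrackingPageSize - 1)) == 0,
                  "page size must be a power of two");
    assert(size > 0);
    const std::uint64_t mask = kTrackingPageSize - 1;
    return {addr & ~mask, (addr + size + mask) & ~mask};
}
```

For example, a 0x20-byte range starting at 0x1FF0 straddles a page boundary, so it maps to the two pages in [0x1000, 0x3000).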
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.