[PR #1973] [MERGED] buffer_cache: Improve buffer cache locking contention #2412

Closed
opened 2026-02-27 21:16:25 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/shadps4-emu/shadPS4/pull/1973
Author: @raphaelthegreat
Created: 12/29/2024
Status: Merged
Merged: 1/2/2025
Merged by: @raphaelthegreat

Base: main ← Head: locking


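📝 Commits (6)

  • ca5bfd8 Improve buffer cache locking contention
  • bbe29fa buffer_cache: Revert some changes
  • 9840c03 clang fmt 1
  • 41fc77f clang fmt 2
  • 04246cb clang fmt 3
  • 70d7162 buffer_cache: Fix build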

📊 Changes

7 files changed (+104 additions, -240 deletions)


📝 src/video_core/buffer_cache/buffer_cache.cpp (+8 -15)
📝 src/video_core/buffer_cache/buffer_cache.h (+2 -2)
📝 src/video_core/buffer_cache/memory_tracker_base.h (+19 -42)
📝 src/video_core/buffer_cache/word_manager.h (+63 -178)
📝 src/video_core/multi_level_page_table.h (+9 -0)
📝 src/video_core/page_manager.cpp (+1 -1)
📝 src/video_core/page_manager.h (+2 -2)

📄 Description

A significant amount of performance is lost because the buffer cache lock is held during memory synchronization and invalidation. The lock is not actually required for those operations, however; it is only used to properly synchronize the underlying memory tracker. This PR is an alternative to https://github.com/shadps4-emu/shadPS4/pull/1952 (and many thanks to @hspir404 for discovering this).

Overview of changes:

  • IsRegionRegistered should now be thread-safe. Usage of operator[] is replaced by find(), as the former creates a new L1 region in the map, which is neither thread-safe nor desired; find() does not do that. Buffer access is also guarded by a shared lock, so multiple threads can read from the slot vector, while the insertion of a new buffer by the GPU thread is properly synchronized. Creation of buffers is expensive and calls into the driver, so it uses a regular mutex.

  • The memory tracker has been significantly simplified by removing dead code and features we don't use (such as variable region sizes). Those values are instead hardcoded as constexpr variables, which might also result in slightly better codegen. Each region now includes a separate lock that is acquired in state-modifying functions, while region queries can be performed without the lock. In my experiments comparing a mutex and a spinlock, the latter was noticeably faster, so it's used here (possibly because it doesn't incur an expensive kernel sleep).

  • The memory tracker now only creates new regions on buffer cache upload. We don't need to create regions when marking regions as CPU or GPU dirty. This might reduce memory usage a bit and make invalidations slightly faster.

  • The page manager lock has also been switched to a spinlock. From testing, this seems to give a further 1 fps boost in CPU-bottlenecked programs.
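
To illustrate the first point, here is a minimal sketch of why find() is safe under a shared lock while operator[] is not. It is not the code from this PR: RegionTable, IsRegistered, Register and the use of std::unordered_map are hypothetical stand-ins for the real page table and slot vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical stand-in for the buffer cache lookup structures.
class RegionTable {
public:
    // Read path: find() never inserts, so the map is not mutated and a
    // shared (reader) lock is enough. operator[] would default-construct
    // a missing entry, which is a write and is not even callable here
    // because the method is const.
    bool IsRegistered(std::uint64_t page) const {
        std::shared_lock lock{mutex_};
        const auto it = regions_.find(page);
        return it != regions_.end() && it->second != 0;
    }

    // Write path: inserting a new buffer id takes the lock exclusively,
    // so it is ordered against all concurrent readers.
    void Register(std::uint64_t page, std::uint32_t buffer_id) {
        assert(buffer_id != 0); // 0 is treated as "no buffer" in this sketch
        std::unique_lock lock{mutex_};
        regions_[page] = buffer_id;
    }

    // Number of entries, used to show that lookups do not grow the map.
    std::size_t Size() const {
        std::shared_lock lock{mutex_};
        return regions_.size();
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::uint64_t, std::uint32_t> regions_;
};
```

Querying a page that was never registered leaves Size() unchanged, which is the property the switch to find() is after.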

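The per-region locking scheme can be sketched roughly as follows. This is only an illustration under stated assumptions: SpinLock, Region, kPagesPerRegion and the method names are hypothetical, and storing the dirty bits in std::atomic words is my assumption about how a lock-free query can stay free of data races, not something the description above confirms.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Minimal test-and-set spinlock. It busy-waits in user space instead of
// putting the thread to sleep in the kernel.
class SpinLock {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // spin until the current holder calls unlock()
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical tracked region with a fixed, compile-time page count.
class Region {
public:
    static constexpr std::size_t kPagesPerRegion = 256; // illustrative value
    static constexpr std::size_t kWords = kPagesPerRegion / 64;

    // State-modifying functions take the region's own lock, so writers to
    // different regions never contend with each other.
    void MarkDirty(std::size_t page) {
        assert(page < kPagesPerRegion);
        std::lock_guard guard{lock_};
        auto& word = dirty_[page / 64];
        const std::uint64_t bit = std::uint64_t{1} << (page % 64);
        word.store(word.load(std::memory_order_relaxed) | bit,
                   std::memory_order_release);
    }

    void ClearDirty(std::size_t page) {
        assert(page < kPagesPerRegion);
        std::lock_guard guard{lock_};
        auto& word = dirty_[page / 64];
        const std::uint64_t bit = std::uint64_t{1} << (page % 64);
        word.store(word.load(std::memory_order_relaxed) & ~bit,
                   std::memory_order_release);
    }

    // Queries read a single atomic word and do not take the lock.
    bool IsDirty(std::size_t page) const {
        assert(page < kPagesPerRegion);
        const std::uint64_t word = dirty_[page / 64].load(std::memory_order_acquire);
        return ((word >> (page % 64)) & 1) != 0;
    }

private:
    SpinLock lock_;
    std::array<std::atomic<std::uint64_t>, kWords> dirty_{};
};
```

Because SpinLock exposes lock() and unlock(), it works with std::lock_guard like a regular mutex, so switching between the two for benchmarking only requires changing the member type.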

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
