-
Notifications
You must be signed in to change notification settings - Fork 498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
apitrace: coherent mapping not implemented #232
Comments
The same problem exists for GL client arrays. There's a thread starting at http://lists.freedesktop.org/archives/mesa-dev/2014-February/053328.html confirming the problem. |
Yep. The https://github.com/apitrace/apitrace/tree/virtual-memory-regions has some experimental code to improve support for the old GL client arrays by keeping track of the content in user memory. We'll need something along these lines. (Though GL client arrays is slightly harder, as we don't even know how big are the user regions.) For ARB_buffer_storage and AMD_pinned_memory we'll need to maintain a list of active user memory regions, and check for changes whenever there is a draw call. But first step would be to get some test cases for this on https://github.com/apitrace/apitrace-tests . If anybody is willing to help, this would be great way to do so. |
Thanks for adding the warning! Because of this issue, trace fails for Bioshock Infinite & DiRT Showndown (Steam games). Of the Open Source games, for example Supertuxkart v0.9.x traces replay wrong with GL 4.3 supporting 3D drivers (e.g. Mesa with GL & GLSL version and compute extension overrides): https://en.wikipedia.org/wiki/SuperTuxKart Are there other freely available use-cases where this issue breaks tracing? |
https://www.kernel.org/doc/Documentation/vm/soft-dirty.txt |
VulkanTools "vktrace" uses that, and in my testing it works fine with DOTA2. |
I'd like to implement tracing of the coherent mapping.
Some time ago vktrace dropped pagemap support and moved to mprotect (LunarG/VulkanTools@a8b5df4#diff-d658adcb226a1d34c995bd96a3c31291), maybe to get more unified code between platforms, maybe because of ugly loop which checked if page is dirty (I'm not exactly sure why it was needed). Here I assume that it isn't possible to protect the pointer returned by glMapBufferRange since it could be device memory. Is it correct? How I see the implementation: As @marekolsak said we would need a second malloc'd memory that will be returned to the app. It would be similar to vktrace and new apitrace's branch apitrace/memtrace-exception-handler (for dx11) We will protect the memory returned to the app for writes, then:
However it leaves the issue of reads where the most problematic case will be: In such case we won't be able to catch that read because after the first write all protections are removed (removing write protection implies removing of read protection). That would be a rare case and more common will be: Adding read protection and distinguishing between read and write faults will solve the common case for Windows and Linux on x86-64 (distinguishing read and write faults is not straightforward on Linux). And the full solution probably will be reading all buffers when app synchronizes gpu and cpu. I think at the moment it would be enough to implement the common case which will also have good enough performance. I'd like to get some feedback before trying to implement this. Thanks. |
@werman thanks for volunteering. Your plan matches pretty much what i had in mind too. In short, have a shadow buffer, and trap writes, The only detail I'd add is:
Reads are indeed tricker to handle with shadows. We could trap reads from the shadow too and trigger a flush from real memory to shadow, but there are always situations that might not be granular enough. If we really want to handle reads effectively, we'll need to track when buffers are bound for writing into GL contexts, so that we flush the reads to the shadow when the corresponding fence is signaled. (Thankfully there are way fewer bind points where a buffer can be bound for write on GL.) Another alternative for reads is to use the shadow just for memcompares, and pass the gpu accessible memory to the app. That is, compromise performance for correctness. If we could restrict this to the few eventual buffers that are used for reads, it would be a good compromise.
I agree. |
Good idea, thanks.
noted
So if we have mapped coherent buffers for read:
The question is where to check that the fence is signaled?
If we pass the gpu accessible memory to the app how we will track writes since the buffer could be readable and writable? I'm not sure I understand this correctly, could you explain it further? At the moment I have the next priority:
About the tests, I see that there is a simple test in the tests repo. I think I would need to make tests which check more cases and also check correctness of reads and writes. |
The rules of So, a way of dealing with this is:
Everytime that draw/etc that uses that buffer happens, we'd need to memcmp with the shadow, and emit a fake memcpy for touched regions. I agree it's not easy. And once can't distinguish the GPU writes from CPU writes. But it's a possibility.
Yes. Currently the tests are on a separate repos. I've been thinking it would be easier to merge with the main repo. But I'm not sure when I'll get around to do it. |
"If MAP_COHERENT_BIT is set and the server does a write, the app must Yes, I see, for some reason my thoughts moved in some weird other direction.
Thank you for the details.
I like the first method more =) Ok, I think I've got enough information to start implementing it. |
Both ARB_buffer_storage and AMD_pinned_memory allow to use coherent memory in shaders. As apitrace doesn't catch writes in this memory, the results are likely wrong.
AMD_pinned_memory could easily be disabled, but as ARB_buffer_storage is required for GL4.4, it's needed to be emulated.
tbh, I don't see any os Independent way to implement this but to have a second buffer synced before/after almost every call.
But there are useful os dependent ways: eg we could alloc a both write and read protected memory and only sync the accessed pages.
The text was updated successfully, but these errors were encountered: