With lots of vehicles, PerformanceAccumulator has a large performance impact itself #7247
Link to savegame: https://fuzzle.org/~petern/ottd/wentbourne.sav |
Unfortunately, fixing this will probably require a rather large restructuring of the vehicle tick code, to separate the vehicle types into their own arrays so that all road vehicles, trains, etc. can be processed as single groups. It might be possible to somehow compile two versions of the vehicle tick functions, one with and one without measurements, and change the dynamic dispatch depending on a setting, but that's likely also tricky to get right. |
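A minimal sketch of the "two versions" idea from the comment above, assuming C++17. `Vehicle`, `Accumulator`, `TickAll` and `SelectTickFn` are hypothetical names made up for illustration, not OpenTTD's actual code. It only shows that a `bool` template parameter lets the compiler emit one loop with timing and one without, with the choice made once per tick from a setting:

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a vehicle; not OpenTTD's actual Vehicle class.
struct Vehicle {
    int ticks = 0;
    void Tick() { ++ticks; }
};

// Hypothetical accumulator that sums elapsed time over many short measurements.
struct Accumulator {
    std::chrono::nanoseconds total{0};
    std::uint64_t samples = 0;
};

// One template, two instantiations: with Measure == false the timing branch
// is discarded at compile time, so that version performs no clock reads.
template <bool Measure>
void TickAll(std::vector<Vehicle> &vehicles, Accumulator &acc)
{
    for (Vehicle &v : vehicles) {
        if constexpr (Measure) {
            const auto start = std::chrono::steady_clock::now();
            v.Tick();
            const auto elapsed = std::chrono::steady_clock::now() - start;
            acc.total += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
            ++acc.samples;
        } else {
            v.Tick();
        }
    }
}

// The "dispatch depending on a setting" part: pick the instantiation once,
// outside the per-vehicle loop.
using TickFn = void (*)(std::vector<Vehicle> &, Accumulator &);

TickFn SelectTickFn(bool measurements_enabled)
{
    return measurements_enabled ? &TickAll<true> : &TickAll<false>;
}
```

Calling `SelectTickFn(setting)(vehicles, acc)` once per tick keeps the branch on the setting out of the hot loop. In the real code the tricky part would be that the tick handlers are virtual and vehicle types are interleaved, which this sketch does not address.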
Calculating benchmarks from the initial comment in this issue... (Note: tps = ticks per second (sim rate), different from fps = frames per second (graphics rate). This is despite OpenTTD listing sim rate and graphics rate in the same units)
Therefore, the Vehicle PerformanceAccumulator takes ~3 microseconds (on a mid-range CPU) per vehicle per tick. Not a lot, but it adds up over thousands of vehicles to ~42 mspt at 14K vehicles. |
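The estimate above is a straight multiplication; as a small worked check (the function name `OverheadMsPerTick` is made up for this example):

```cpp
// Estimated measurement overhead per tick, in milliseconds.
// per_vehicle_us: measurement cost per vehicle per tick, in microseconds.
constexpr double OverheadMsPerTick(double per_vehicle_us, long vehicles)
{
    return per_vehicle_us * static_cast<double>(vehicles) / 1000.0;
}

// 3 microseconds per vehicle across 14,000 vehicles is 42 ms per tick.
static_assert(OverheadMsPerTick(3.0, 14000) == 42.0, "3 us x 14K vehicles = 42 ms");
```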
There are far more vehicles than that: each non-front train vehicle is also counted individually, despite an early exit in its tick handler. In this save most trains are 7 tiles long, so 4833 * 14 = 67662 rail vehicles, and that's not counting the longer ones. |
Re-calculating based on PeterN's comment:
The script code is as follows (put this in Start() in a gamescript to run):
|
The performance cost of PerformanceAccumulator varies significantly by platform. On my Linux machine, the difference between 'Simulation rate (control)' and 'Simulation rate (PerformanceAccumulator disabled)' as described above on trunk is only about 7.7 ms/t vs 8.0 ms/t. FOR_ALL_VEHICLES_OF_TYPE iterates over the entire Vehicle array and dereferences each pointer to check the type, which is still somewhat expensive to do multiple times, once for each vehicle type. With the exception of effect vehicles, the vehicle array changes relatively infrequently. As the tick function is being called in typed groups, the tick function for the particular vehicle type can be called directly instead of using a virtual method call (e.g. |
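A rough sketch of what ticking in typed groups could look like, with one list per vehicle type. All names here (`TypedVehicleLists`, `TickGroup`, `TickAllByType`) are assumptions for illustration and the classes are simplified stand-ins, not OpenTTD's:

```cpp
#include <vector>

// Hypothetical vehicle hierarchy; simplified stand-ins for illustration.
struct Vehicle {
    virtual ~Vehicle() = default;
    virtual void Tick() = 0;
    int ticks = 0;
};

struct Train final : Vehicle {
    void Tick() override { ++ticks; }
};

struct RoadVehicle final : Vehicle {
    void Tick() override { ++ticks; }
};

// One list per type, updated when vehicles are created or removed, so
// ticking never has to scan the whole pool and test each entry's type.
struct TypedVehicleLists {
    std::vector<Train *> trains;
    std::vector<RoadVehicle *> road_vehicles;
};

// The static type is known inside the loop and the classes are final,
// so the compiler is free to call Tick() directly without a virtual lookup.
template <typename T>
void TickGroup(const std::vector<T *> &group)
{
    for (T *v : group) v->Tick();
}

void TickAllByType(const TypedVehicleLists &lists)
{
    TickGroup(lists.trains);
    TickGroup(lists.road_vehicles);
}
```

With this layout a measurement could also wrap each `TickGroup` call as a whole, so the number of timings per tick depends on the number of vehicle types rather than the number of vehicles.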
Interesting that it doesn't affect performance significantly for you. I am running on Linux as well; however, it's within a VM under Windows. And yes, separate lists would be better, but for me even with the extra iterating the improvement is so significant that it's worth doing. |
maybe depends on |
Hmm, I guess I will have to test native on both Linux and Windows. |
I noticed a significant performance drop in 1.9.1 in comparison to 1.8 on Arch Linux x86_64. I'm not skilled enough to find and disable PerformanceAccumulator, but it did affect FPS, halving it, with double the CPU load (which dropped immediately to nearly zero when paused). Lowering cargo distribution accuracy and lengthening the graph recalculation time didn't help. I noticed it when playing a previously saved game with lots of vehicles. |
Roughly how many vehicles (including wagons) do you have in your game? |
225 trains x ~8 wagons + 105 buses x ~ 2 wagons + 27 ships + 64 planes = 1800 + 210 + 64 + 27 = 2101 |
The slowness on macOS is a separate issue, I believe. If the measurement of vehicle ticks processing was an issue in your case, the measurement of Game Loop total would be much higher than it is, but it's not. The issue in this ticket is specifically for very large games that have thousands of individual vehicles running. The frame rate measured in your specific case (14.97 fps) means there are 66.8 ms between the beginning of each iteration of the game loop, but the sum of the times (0.36 + 3.14 + 0.01) does not add up to that amount at all. Hence something outside of a measurement block must be the cause. (The PerformanceAccumulator time spent on vehicle ticks is not part of the vehicle tick times, but is part of the total game loop time.) |
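The accounting in that comment can be written out as a small check. The helper names are made up for this example; the numbers are the ones quoted above:

```cpp
// Milliseconds between the starts of two game-loop iterations at a given rate.
constexpr double FrameIntervalMs(double fps)
{
    return 1000.0 / fps;
}

// Part of the frame interval not covered by any of the listed measurements.
constexpr double UnaccountedMs(double fps, double measured_sum_ms)
{
    return FrameIntervalMs(fps) - measured_sum_ms;
}
```

At 14.97 fps the interval is about 66.8 ms, while the measured blocks sum to 0.36 + 3.14 + 0.01 = 3.51 ms, so `UnaccountedMs(14.97, 3.51)` comes out at roughly 63 ms spent outside any measurement block.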
@wousser Can I ask you to assist with some details on your situation over in #7644, which covers the bug you're seeing? Mainly just exact OS version, and preferably also which hardware you're running on, screen resolution OpenTTD is running at, and whether Fast Forward has any effect on the frame rate. |
Is this still relevant or should it be closed by #10055? |
#10055 is in itself already a nice improvement when there are lots of non-front-engine trains (wagons). However the problem described here is not quite fixed. Creating a |
After reviewing a profile of ~3 minutes of Wentbourne, I have 150,000 samples in the game thread. The breakdown per vehicle tick is as follows:
This totals 3852, or roughly 2.6% of all samples. So there is approximately a 2.6% overall performance improvement left on the table. This is of course a game with lots of vehicles, which is not nearly every game. But then again: such games really benefit from every performance improvement we can achieve. |
Version of OpenTTD
master-gef7e47a53a
Expected result
Performance meter monitors performance of vehicle ticks with minimal adverse effect.
Actual result
Due to the large number of individual timings, the performance meter itself consumes significant CPU time when timing lots of vehicles.
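To illustrate why the number of individual timings matters, here is a small sketch that counts clock reads for per-vehicle timing versus one timing around a whole batch. `CountingTimer`, `TimeEach` and `TimeBatch` are hypothetical helpers, not part of OpenTTD, and the cost of a single clock read is platform dependent:

```cpp
#include <chrono>
#include <cstddef>

using Clock = std::chrono::steady_clock;

// Counts clock reads so the bookkeeping cost of each strategy is visible.
struct CountingTimer {
    std::size_t clock_reads = 0;
    Clock::duration total{0};

    Clock::time_point Now()
    {
        ++clock_reads;
        return Clock::now();
    }
};

// Per-vehicle measurement: two clock reads for every vehicle, every tick.
template <typename Work>
void TimeEach(CountingTimer &timer, std::size_t vehicles, Work work)
{
    for (std::size_t i = 0; i < vehicles; ++i) {
        const auto start = timer.Now();
        work(i);
        timer.total += timer.Now() - start;
    }
}

// Batch measurement: two clock reads per tick, regardless of vehicle count.
template <typename Work>
void TimeBatch(CountingTimer &timer, std::size_t vehicles, Work work)
{
    const auto start = timer.Now();
    for (std::size_t i = 0; i < vehicles; ++i) work(i);
    timer.total += timer.Now() - start;
}
```

For 1000 vehicles, `TimeEach` performs 2000 clock reads and `TimeBatch` performs 2. The batch form only works if all vehicles of one type are processed together, which is the restructuring discussed in the comments.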
Steps to reproduce
On my particular system, the simulation rate in master is around 8 fps.
After disabling the PerformanceAccumulator, and no other changes, the simulation rate increases by 50% to around 12 fps.