Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change: [Linkgraph] Pause the game when linkgraph jobs lag (#6470) #7081

Merged

Conversation

JGRennison
Copy link
Contributor

Check if the job is still running one date fract tick before it is due
to join and if so pause the game until its done.
This avoids the main thread being blocked on a thread join, which appears
to the user as if the game is unresponsive, as the UI does not repaint
and cannot be interacted with.
Show if pause is due to link graph job in status bar, update network
messages.
This does not apply for network clients.

@PeterN
Copy link
Member

PeterN commented Jan 22, 2019

When does this pause happen? How can I reproduce it?

@nielsmh
Copy link
Contributor

nielsmh commented Jan 22, 2019

Link-graph updates run on a snapshot taken at in-game time intervals, and have a deadline also given in in-game time. If the real-time processing takes longer than simulating the in-game time, the game stalls, or with this patch pauses.

The easiest way to force a pause would probably be adding some sleeps into the link graph threads, I'm not sure what it would take to make a network sufficiently complex for link graph updates to be slow, but game simulation still run fast. Perhaps limiting the game to a single CPU core.

Copy link
Contributor

@nielsmh nielsmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested this via inserting a CSleep(30000); in LinkGraphSchedule::Run to artificially delay jobs.

Seems to work as intended, although I've only tested it in singleplayer.

It doesn't play nice with the "Link graph delay" line in the frame rate window, it always shows 0.03 ms or there about even when the job is delayed by 10+ seconds.

@michicc
Copy link
Member

michicc commented Feb 20, 2019

Is the wrong "Link graph delay" line a problem to be fixed or something obsoleted by this PR?

@nielsmh
Copy link
Contributor

nielsmh commented Feb 21, 2019

I think you may as well remove the delay measurement entirely, since the main purpose of it was as an indicator for why the game might be freezing intermittently. Since this patch gives another indicator for this, the delay measurement is no longer necessary.

@michicc
Copy link
Member

michicc commented Feb 21, 2019

Well, I just tested in in multi-player and I'm afraid to say it doesn't work right there.

Doing the same simulation (CSleep) as @nielsmh I get a freeze and then directly consecutive pause and unpause messages.

@JGRennison
Copy link
Contributor Author

In multiplayer DoCommandP calls are executed on the next tick, and so the check needs to be done 2 ticks before the join tick instead of 1.
I hadn't noticed this in my initial implementation of the feature due to unrelated changes to tick handling such that it did work in multiplayer.

Alternatively the feature could be made single-player only, which also seems reasonable.

On the FPS window, hiding/removing the line when the feature is in operation seems reasonable.
It may still be useful in the case where the feature is not in effect (e.g. multiplayer client).

@Eddi-z
Copy link
Contributor

Eddi-z commented Feb 24, 2019

A few thoughts on this:

  • i don't think "pause" is the right usage here, maybe a new pause-like feature where the game simulation halts if tasks take too long, keeping the GUI reactive, but doesn't show a "game is paused" line.
  • maybe even go as far as making the whole game loop a separate thread, to further uncouple GUI ticks from game ticks
  • as for the measurement tool, maybe it should somehow measure each thread separately, but only add it up to total time if it actually stalls on join?

@JGRennison JGRennison force-pushed the link-graph-thread-join-pause-before-block branch from 2c8597f to d2ed1ed Compare March 10, 2019 19:20
src/saveload/afterload.cpp Outdated Show resolved Hide resolved
@JGRennison JGRennison force-pushed the link-graph-thread-join-pause-before-block branch from d2ed1ed to df958b2 Compare March 10, 2019 19:29
@JGRennison
Copy link
Contributor Author

JGRennison commented Mar 10, 2019

This should now work correctly in multiplayer.

The line in the FPS window should now only be present on network clients, where it is still useful.

i don't think "pause" is the right usage here, maybe a new pause-like feature where the game simulation halts if tasks take too long, keeping the GUI reactive, but doesn't show a "game is paused" line.

Can you clarify what such a pause-like feature would be, and how it differs from pause?
If the simulation halts due to tasks taking too long, this should be clearly indicated to the user.

maybe even go as far as making the whole game loop a separate thread, to further uncouple GUI ticks from game ticks

This is not realistic given the current codebase and architecture.

src/linkgraph/linkgraphschedule.cpp Outdated Show resolved Hide resolved
@JGRennison JGRennison force-pushed the link-graph-thread-join-pause-before-block branch from df958b2 to a5697dd Compare March 11, 2019 00:50
@nielsmh nielsmh dismissed michicc’s stale review March 11, 2019 07:45

Bad comment fixed

@michicc
Copy link
Member

michicc commented Mar 22, 2019

I've just tested this again, and pausing works fine in network games. Unpausing OTOH is sometimes causing a (short) hang on the client side. I'm not sure if this is really fixable though, after all the client might simply be slower than the server (I've run server and client on the same PC for this test though).

@stale
Copy link

stale bot commented Apr 21, 2019

This pull request has been automatically marked as stale because it has not had any activity in the last month.
Please feel free to give a status update now, ping for review, or re-open when it's ready.
It will be closed if no further activity occurs within 7 days.
Thank you for your contributions.

@stale stale bot added the stale Stale issues label Apr 21, 2019
@stale stale bot closed this Apr 28, 2019
@LordAro LordAro reopened this Dec 31, 2019
@TrueBrain TrueBrain added candidate: needs considering This Pull Request needs more opinions size: large This Pull Request is large in size; review will take a while and removed stale Stale issues labels Dec 14, 2020
)

Check if the job is still running two date fract ticks before it is due
to join, and if so pause the game until its done.
When loading a game, check if the game would block immediately due to
a job which is scheduled to be joined within two date fract ticks,
and if so pause the game until its done.
This avoids the main thread being blocked on a thread join, which appears
to the user as if the game is unresponsive, as the UI does not repaint
and cannot be interacted with.
Show if pause is due to link graph job in status bar, update network
messages.
This does not apply for network clients.
… on network clients

Network servers and single player clients do not block on thread joins
due to instead pausing shortly before the join is due.
@TrueBrain TrueBrain force-pushed the link-graph-thread-join-pause-before-block branch from a5697dd to e5c4486 Compare December 22, 2020 13:51
Copy link
Member

@TrueBrain TrueBrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this PR is 2 years old, 2 developers were positive towards it and both reviewed it at least once, I am pretty certain this is doing what it should.

I rebased this PR, fixed some coding-style issues (comments too long, no dot at end of line, really minor stuff), and tested it. It does what it states it does.

This fix is, however, only a partial fix. If the linkgraph job is still running when exiting the game, it still hangs as if it is unresponsive. I am guessing that is really minor, but possibly future PRs could cancel the job in those scenarios or something.

Either way, tnx for the PR! Time to merge this after 2 years ... (sorry it took so long)

@TrueBrain TrueBrain added candidate: yes This Pull Request is a candidate for being merged and removed candidate: needs considering This Pull Request needs more opinions labels Dec 22, 2020
@JGRennison
Copy link
Contributor Author

This fix is, however, only a partial fix. If the linkgraph job is still running when exiting the game, it still hangs as if it is unresponsive. I am guessing that is really minor, but possibly future PRs could cancel the job in those scenarios or something.

I've got this implemented in my branch JGRennison/OpenTTD-patches@375dbdb JGRennison/OpenTTD-patches@3f2c561 JGRennison/OpenTTD-patches@6bf4b67
I can look into making a PR for that later on.

In general I've got lots of linkgraph commits.

Either way, tnx for the PR! Time to merge this after 2 years ... (sorry it took so long)

Not to worry :)

@TrueBrain TrueBrain merged commit 7f0fefd into OpenTTD:master Dec 22, 2020
@JGRennison JGRennison deleted the link-graph-thread-join-pause-before-block branch January 9, 2024 19:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
candidate: yes This Pull Request is a candidate for being merged size: large This Pull Request is large in size; review will take a while
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants