Regular server desynchronisations from clients #8093

Closed
duckfullstop opened this issue Apr 21, 2020 · 14 comments

@duckfullstop
Contributor

Version of OpenTTD

1.10.1 - compiled from source, running in docker with redditopenttd/openttd

Expected result

The server to stay synchronised with clients.

Actual result

Players regularly desynchronise, usually within a couple of minutes of connecting (myself included).

Steps to reproduce

The desync=3 log files are linked below. I don't have the starting save because I stupidly didn't save it, but this one occurred within 15 minutes of launching the server.

Debug files: https://files.duck.moe/index.php/s/YBrimMsWZHBxATS

This has been occurring for a good few weeks now, especially since the 1.10 release.

@LordAro
Member

LordAro commented Apr 26, 2020

Alright, I've got somewhere with reproducing this.

  1. Ran the command replay forward & forced a dmp_cmds save at 000ad053 (the same date that Hopsi joined, though 9 ticks before due to some weirdness in the desync replaying stuff)
  2. Started the game again with that new save (basically acting as if I'd joined the game at that point)
  3. dbg: [net] sync check: 000ad0ef; 00; mismatch expected {33188c8c, 1d7b69f5}, got {007a3060, 2361519c} and differences in the savegame just before (ad0e0)
  4. using the zpipe & printhunk utilities, found the diff in the ad0e0 savegames (my generated one attached) - this pointed to a few bytes in the MAP2 chunk.

So, something in the map array that's using .m2, presumably something that's changed since 1.9.x.
Probably.
dmp_cmds_086e2f4d_000ad0e0.zip
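
The byte-level comparison in step 4 can be illustrated with a minimal sketch. This is not the zpipe or printhunk utility: it assumes the same chunk has already been decompressed and extracted from both savegames into `bytes` objects, and the names `diff_chunk` and `changed_bits` as well as the sample payloads are hypothetical.

```python
def diff_chunk(a: bytes, b: bytes) -> list[tuple[int, int, int]]:
    """Return (offset, byte_a, byte_b) wherever two equally sized payloads differ."""
    if len(a) != len(b):
        raise ValueError("chunk payloads differ in length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def changed_bits(x: int, y: int) -> list[int]:
    """Return the bit positions (0 = least significant) that differ in two bytes."""
    mask = x ^ y
    return [bit for bit in range(8) if mask & (1 << bit)]


# Made-up payloads standing in for the same chunk from two savegames.
server_chunk = bytes([0x00, 0x10, 0x00, 0x07])
client_chunk = bytes([0x00, 0x10, 0x01, 0x05])

for offset, x, y in diff_chunk(server_chunk, client_chunk):
    print(f"offset {offset}: {x:#04x} vs {y:#04x}, bits {changed_bits(x, y)}")
```

Listing the changed bit positions alongside each offset makes it easier to match a difference to a specific field in the map array.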

@JGRennison
Contributor

Hopsi Transport Train 1 is in two completely different positions in the two savegames.

I had a look at one of the differences in M2 and it looks like the PBS bits for rail (class 1).
The first difference in M6 looks like the PBS bit for stations (class 5).

The dump utility seems to give up at the start of the VEHS chunk for some reason.

@JGRennison
Contributor

The trains are also different lengths by 6 wagons. 6 wagons are left in the depot in the original save.
That seems a bit suggestive of a problem with the _new_vehicle_id/CcBuildWagon mechanism...

@nielsmh
Contributor

nielsmh commented Apr 26, 2020

Might be related to the build+refit feature for vehicles.

@frosch123 noted on IRC:

> the GroupStatistics stuff in the build+refit looks fishy for sure
> the OrderBackup in there may even cause a desync

@frosch123
Member

I have no time left today to test this, but pretty sure this is capable of causing desync:
frosch123@871ca6d

@JGRennison
Contributor

Interestingly, getting a build and refit cost estimate re-arranges free wagons in the depot.
NormalizeTrainVehInDepot() is called to attach assorted free wagons to the temporary engine, and when the temporary engine is deleted the wagons are released in a single chain, which is a potentially different configuration.
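
Why this matters for synchronisation can be shown with a toy model. The sketch below is not OpenTTD code: `Depot`, `estimate_build_cost`, the wagon names, and the cost are hypothetical, and it assumes only the behaviour described above, namely that the estimate attaches all free wagons to a temporary engine and releases them as a single chain.

```python
from dataclasses import dataclass, field


@dataclass
class Depot:
    # Each inner list is one chain of free (engine-less) wagons.
    free_chains: list[list[str]] = field(default_factory=list)


def estimate_build_cost(depot: Depot) -> int:
    """Toy test run: build a temporary engine, then sell it again."""
    # Building the engine attaches every free wagon to it ...
    temp_train = ["temp_engine"]
    for chain in depot.free_chains:
        temp_train.extend(chain)
    # ... and selling the engine releases the wagons as one chain.
    wagons = temp_train[1:]
    depot.free_chains = [wagons] if wagons else []
    return 100  # made-up cost


server = Depot([["w1", "w2"], ["w3"]])
client = Depot([["w1", "w2"], ["w3"]])

estimate_build_cost(client)  # only this client asks for an estimate

print(server.free_chains)
print(client.free_chains)
print(server == client)
```

An estimate is a purely local action, so only the client that requested it ends up with the merged chain; any later command that depends on the chain layout then behaves differently on that client.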

@JGRennison
Contributor

> I have no time left today to test this, but pretty sure this is capable of causing desync:
> frosch123@871ca6d

The group statistics are also updated when the temporary engine is sold; this change makes the statistics go negative. This aspect seems OK as it is.

@James103
Contributor

James103 commented May 1, 2020

> Interestingly, getting a build and refit cost estimate re-arranges free wagons in the depot.

I can reproduce this in 1.10.1 stable and in 2020-04-26 master nightly.

@glx22
Contributor

glx22 commented May 1, 2020

Estimating the cost is not even necessary, it seems. I can "move" the wagons just by selecting an engine in the list while a cargo filter is set.

frosch123 added a commit to frosch123/OpenTTD that referenced this issue May 1, 2020
…t run, and use DC_AUTOREPLACE for actions that shall be reverted.
frosch123 added a commit to frosch123/OpenTTD that referenced this issue May 1, 2020
… and thus caused desyncs.

Use DC_AUTOREPLACE for actions that shall be reversible.
frosch123 added a commit to frosch123/OpenTTD that referenced this issue May 1, 2020
…ctions.

Actual auto-replace does not set a 'user', so this does not affect auto-replace.
But it has the same vibe as auto-replace test runs.
@ldpl
Contributor

ldpl commented May 2, 2020

Do I understand correctly that, to get that vehicle desync, a player needs to estimate a build or refit cost in some special circumstances? If that's the case, it may not be the main cause of most desyncs, as many players seem to disconnect just by creating a company or shortly after:

19:44:33  <andythenorth> protocol error, connection closed
19:44:49  <andythenorth> do we have a desync log?
19:45:06  <_dp_> *** Player has left the game (wrong company in DoCommand)
19:45:09  <_dp_> o_O
19:45:12  <_dp_> was that you?
19:45:20  <andythenorth> yup
19:45:35  <andythenorth> triggers on 'New Company'

and some even seem to desync as spectators:

<RedditServer1> 22:58:55> *** petrpletsvetr has joined
<RedditServer1> 22:58:58> 'petrpletsvetr' reported an error and is closing its connection (desync error)
<RedditServer1> 22:58:59> *** petrpletsvetr has left the game (desync error)

@ldpl
Contributor

ldpl commented May 3, 2020

Oh, wow, I just got desynced myself while spectating, and it put the game into an invalid state (same as in #6598). The client appears to still be in the game, but it's missing the client list and is not actually connected, as seen in another game instance.
Although both server and client are modified in this case, so it can always be blamed on that, I guess.
Screenshot from 2020-05-03 21-49-20

frosch123 added a commit to frosch123/OpenTTD that referenced this issue May 3, 2020
… and thus caused desyncs.

Use DC_AUTOREPLACE for actions that shall be reversible, in this case:
- Do not rearrange free wagons in test-run.
- Do not discard OrderBackups.
The latter was not triggered by actual auto-replace, since it does not set a 'user'.
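
The idea in this commit message, skipping irreversible side effects when an action is flagged as one that will be reverted, can be sketched generically. Nothing below is OpenTTD's API: `CmdFlag`, `build_engine`, and `leaves_state_untouched` are hypothetical, and `REVERSIBLE` merely plays the role the message assigns to DC_AUTOREPLACE.

```python
import copy
from enum import Flag, auto


class CmdFlag(Flag):
    NONE = 0
    EXEC = auto()        # really apply the command
    REVERSIBLE = auto()  # hypothetical stand-in: the action will be undone


def build_engine(depot: dict, flags: CmdFlag) -> int:
    """Toy command: returns a made-up cost, mutating state only where allowed."""
    if CmdFlag.REVERSIBLE not in flags:
        # Side effect that cannot be undone: merge all free wagon chains.
        merged = [w for chain in depot["free_chains"] for w in chain]
        depot["free_chains"] = [merged] if merged else []
    if CmdFlag.EXEC in flags:
        depot["engines"] += 1
    return 100


def leaves_state_untouched(command, state: dict, flags: CmdFlag) -> bool:
    """Run a command and report whether the state is the same as before."""
    before = copy.deepcopy(state)
    command(state, flags)
    return state == before


depot = {"engines": 0, "free_chains": [["w1"], ["w2"]]}
print(leaves_state_untouched(build_engine, depot, CmdFlag.REVERSIBLE))
print(leaves_state_untouched(build_engine, depot, CmdFlag.NONE))
```

A check like `leaves_state_untouched` is one way to catch this class of bug in a test harness: any command run without the execute flag should leave the game state identical.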
@duckfullstop
Contributor Author

@frosch123 🎉 - reckon we’ll be seeing a point release for this any time soon?

@LordAro
Member

LordAro commented May 3, 2020

I think that's quite likely, though we'll want to make sure that the problem has actually been fixed (or that there aren't any other problems like @ldpl is suggesting)

@ldpl
Contributor

ldpl commented May 6, 2020

FWIW, here's how it looks from the player's POV:

[Server #0] Player: but... what happened today...
[Server #0] Player: i have gone once...
[Server #0] Player: to return and i wanted to look: how much time is left to restart :-)
[Server #0] Player: and then:  i couldnt connect anymore
[Server #0] Player: i tried some 15 times or so
[Server #0] dP: what did it say when you couldn't connect?
[Server #0] Player: synchronisation error
[Server #0] dP: were you trying to connect to a company or as spectator?
[Server #0] Player: it said that after some 5 seconds or so
[Server #0] Player: both
[Server #0] Player: i tried to connect to the company first
[Server #0] Player: then as a spectator
[Server #0] Player: i was both times as "Player" then
[Server #0] Player: i mean, not both but 15 times
[Server #0] dP: so it just connected to the game normally and after a few seconds kicked you out with desync message?
[Server #0] Player: yes, seeemed so
[Server #0] dP: damn :(
[Server #0] dP: oh, and what os do you use?
[Server #0] Player: win7

So he was fine until he left the game himself, after which he couldn't get back in.

LordAro pushed a commit to LordAro/OpenTTD that referenced this issue May 10, 2020
… and thus caused desyncs.

Use DC_AUTOREPLACE for actions that shall be reversible, in this case:
- Do not rearrange free wagons in test-run.
- Do not discard OrderBackups.
The latter was not triggered by actual auto-replace, since it does not set a 'user'.
LordAro pushed a commit that referenced this issue Jun 1, 2020
…us caused desyncs.

Use DC_AUTOREPLACE for actions that shall be reversible, in this case:
- Do not rearrange free wagons in test-run.
- Do not discard OrderBackups.
The latter was not triggered by actual auto-replace, since it does not set a 'user'.

8 participants