persistence dies across server restarts when trying to reconnect via tcp #973

Closed
nightstalker6669 opened this issue Mar 27, 2015 · 13 comments


@nightstalker6669

OK, so I had a chat with the folks in IRC and they said that this might be a bug and that I should report it here, so here it goes. I have 4 computers (3 servers in 2 separate racks, and 1 normal computer) that are sending/receiving information via TCPnet. While the Minecraft server is running they all work fine, but when the server restarts they freeze up and can't reconnect to TCPnet unless I manually turn the computers off and on (I can't Ctrl-Alt-C them either). I also receive this error in the console:

http://pastebin.com/rqTQjDAC

In case you want to try to recreate it with my scripts, I'll explain my layout. I have 2 servers that act as keepalive servers. They talk to each other through simple ping messages over TCPnet, just to make sure that they can reach each other. The 3rd server pings the keepalive servers periodically to make sure that it is still connected to TCPnet, and the normal computer does the same thing. If any of the computers can't talk to the keepalive servers, or the keepalive servers can't talk to each other, that means the connection was lost, and they attempt to reconnect.

Here are all the scripts I used to make this happen:
https://github.com/nightstalker6669/OC-AE2/tree/master/TCPnet%20testing
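For readers who don't want to open the repository, the keepalive layout described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Lua scripts linked above and not TCPnet's API: `KeepaliveMonitor`, `ping`, and `reconnect` are hypothetical names, and the transport is injected as plain callables so the logic can be run on its own.

```python
from typing import Callable, Iterable


class KeepaliveMonitor:
    """Pings a set of keepalive peers and reconnects when any ping fails.

    `ping` and `reconnect` are hypothetical callables standing in for
    whatever the real scripts do over the network.
    """

    def __init__(
        self,
        peers: Iterable[str],
        ping: Callable[[str], bool],
        reconnect: Callable[[], None],
    ) -> None:
        self.peers = list(peers)
        self.ping = ping            # returns True if the peer answered
        self.reconnect = reconnect  # re-establishes the network link
        self.reconnects = 0         # how many times a reconnect was triggered

    def check(self) -> bool:
        """Run one keepalive round; return True if every peer answered."""
        for peer in self.peers:
            try:
                ok = self.ping(peer)
            except OSError:
                # A broken connection is treated like a missed ping.
                ok = False
            if not ok:
                # Any unreachable peer means the link is considered lost.
                self.reconnect()
                self.reconnects += 1
                return False
        return True
```

In this sketch `check()` would be called on a timer by each computer; the first missed ping triggers a single reconnect attempt for that round.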

@LizzyTrickster
Contributor

That's probably because TCP sockets cannot be persisted. I guess TCPNet doesn't take that into account and thus breaks when it tries to restore a connection to a socket that doesn't actually exist anymore.

@nightstalker6669
Author

Sorry if I don't sound all that smart, as I'm still new to these things, but I have been able to shut down the luarocks server and bring it back up while my Minecraft server was still running, and my scripts would reconnect once they realized that they weren't connected anymore. I don't know if that is the same as what you are describing. I really wouldn't know the difference between the luarocks server shutting down and the Minecraft server shutting down, or why one would cause my OC computers to freeze and not the other. This is how I tested my scripts to make sure that they would still reconnect after being disconnected from luarocks.

@LizzyTrickster
Contributor

Guessing that the luarocks server is the external one: your scripts can still reconnect in that case because the Java connection object they're using still exists. Upon restarting Minecraft, that Java socket can't be persisted, so any operations on it fail.

@fnuecke might be able to make it re-create socket objects on restart, but I'm not sure how easy that would be to do.
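As an analogy for the point above (this is plain Python, not how OpenComputers or its persistence layer actually works), a live socket is an OS-level handle and cannot be serialized, while the information needed to reconnect can. The sketch below assumes a hypothetical `PeerLink` wrapper that saves only the address and leaves the socket to be re-created after a restore.

```python
import pickle
import socket


def can_persist(obj) -> bool:
    """Return True if obj survives a pickle round trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except (TypeError, pickle.PicklingError):
        # CPython raises TypeError when asked to pickle a socket object.
        return False


class PeerLink:
    """Hypothetical wrapper: persists the address, never the live socket."""

    def __init__(self, host: str, port: int) -> None:
        self.address = (host, port)
        self.sock = None  # live handle; must be re-created after a restore

    def snapshot(self) -> bytes:
        # Only plain data goes into the saved state.
        return pickle.dumps({"address": self.address})

    @classmethod
    def restore(cls, blob: bytes) -> "PeerLink":
        host, port = pickle.loads(blob)["address"]
        return cls(host, port)  # sock is None until the caller reconnects
```

A restored `PeerLink` knows where to connect but holds no socket, so code using it has to reconnect explicitly instead of operating on a handle that no longer exists.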

@fnuecke
Member

fnuecke commented Mar 27, 2015

Hmm, the part where it literally tries to persist userdata should be impossible, so that's definitely at least partially a bug in OC. I'll look into it.

@fnuecke
Member

fnuecke commented Mar 27, 2015

Could you please try setting verbosePersistenceErrors to true in the config and see how the message in the log looks then?

@nightstalker6669
Author

http://pastebin.com/MirJpxRZ

Here you go, sir, just as you requested. That log is for the 4 computers that are all set up to reconnect to TCPnet. It only shows the error when you right-click on the server rack/computer that is having the issue.

fnuecke added a commit that referenced this issue Mar 27, 2015
@fnuecke
Member

fnuecke commented Mar 27, 2015

Thanks, that's a good lead for a start. Looks like some upvalues aren't properly wrapped when the computers are being saved while they're in the middle of a synchronized call. Please give dev build 480 a try and see if that fixes it.

@nightstalker6669
Author

OK, I'll throw this on my test server and mess with it. Thanks again for the quick response!

@nightstalker6669
Author

@fnuecke I threw the dev build on my test server (a mirror of my live server, current to within the last 15 minutes) and it looks like that fixed everything. I can restart the server with my scripts running and they re-establish a connection after the restart, and my OC computers don't freeze up anymore. Unless you have any other need for this to stay open, I'm going to go ahead and close it. Thanks again, and keep up the outstanding work!

@fnuecke
Member

fnuecke commented Mar 27, 2015

Awesome, thanks for confirming!

Make sure to set verbosePersistenceErrors to false again, as it's quite the performance and memory hog while saving.

@nightstalker6669
Author

Yep, already have. The only thing I have noticed now is that my computers won't re-establish a connection until a player gets near the computer, even with a dimensional anchor (chunk loader) loading the area. Not a big deal, because there is always someone nearby, but if there is something that can be done about that, it wouldn't hurt. Other than that, I'm happy they just reconnect.

@fnuecke
Member

fnuecke commented Mar 27, 2015

That's... odd. As long as the chunk is loaded the computers should run; there's no code that depends on an actual player being nearby anywhere in the mod, at least not to my knowledge (aka if that's the case it's some Minecraft idiosyncrasy I'm not aware of).

@magik6k
Contributor

magik6k commented Mar 28, 2015

From my experience, this may be caused by Cauldron (assuming you use that).

4 participants