Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: TCP handling code is not posix compliant #9142

Closed
wants to merge 1 commit into from
Closed

Fix: TCP handling code is not posix compliant #9142

wants to merge 1 commit into from

Conversation

ccapitalK
Copy link

Motivation / Problem

When calling send or receive on a non-blocking socket, we only check for EWOULDBLOCK for partial completion, even though POSIX mandates that the call could return either EWOULDBLOCK or EAGAIN. This means that on certain operating systems that return EAGAIN from these functions (and don't define them as being the same error code), all of the TCP code would be broken.

Description

We check whether errno is set to EWOULDBLOCK or EAGAIN, instead of just EWOULDBLOCK.

Limitations

I believe I've fixed every bit of code that uses EWOULDBLOCK/EAGAIN in this project. General POSIX compatibility may still be unsolved.

Checklist for review

Some things are not automated, and forgotten often. This list is a reminder for the reviewers.

  • The bug fix is important enough to be backported? (label: 'backport requested')
  • This PR affects the save game format? (label 'savegame upgrade')
  • This PR affects the GS/AI API? (label 'needs review: Script API')
    • ai_changelog.hpp, gs_changelog.hpp need updating.
    • The compatibility wrappers (compat_*.nut) need updating.
  • This PR affects the NewGRF API? (label 'needs review: NewGRF')

When calling send or receive on a non-blocking socket, we only check for
EWOULDBLOCK for partial completion, even though POSIX mandates that the
call could return either EWOULDBLOCK or EAGAIN.
@TrueBrain
Copy link
Member

I expect this to fail on our Windows CIs, but it is running now, so we will see in a moment :)

@ccapitalK
Copy link
Author

I expect this to fail on our Windows CIs, but it is running now, so we will see in a moment :)

Oh, I wasn't aware that the windows code also ran that code path. Doesn't winsock define its own error codes though? I don't understand how EWOULDBLOCK would work on windows in that case.

@TrueBrain
Copy link
Member

TrueBrain commented Apr 30, 2021

@TrueBrain
Copy link
Member

TrueBrain commented Apr 30, 2021

Bit of cross-posting, but I asked in the upstream ticket if serenity only fires EAGAIN and never EWOULDBLOCK, as that would allow different solutions.

Historical, these 2 could be triggered for different circumstances, but in reality the OSes (and libcs) we support don't do this anymore for decades now. So it makes me wonder if we should change all our if statements, or just guard against them being different in os_abstraction.h, and in case of Serenity, map EWOULDBLOCK to EAGAIN. But that does depend on knowing that they didn't implement the old-way-of-doing-stuff and that both can be fired from recv() and co :) As that would mean the solution this PR presents is the only way to go :)

The main reason I was triggered by this, as because for Windows there is no need to check 2 values, so it all becomes a bit derpy.

The other solution would be to make it a function in os_abstraction.h that specifically returns if the socket was blocking, meaning each OS can pick their flavour for this.

I might also be overthinking this completely :P

EDIT: upstream indicates they only fire EAGAIN, never EWOULDBLOCK.

@ccapitalK
Copy link
Author

ccapitalK commented Apr 30, 2021

Oh, so it's #define'd to be cross-platform. I'm very surprised that the Windows CI task passed then, maybe the Windows' provided winsock header also cheats the same way? :)

Serenity will probably end up defining both of them to the same number anyway (to simplify other ports), so this PR wouldn't really be needed on our end, but I figure it might be worth merging anyway, since it's a POSIX thing and other operating systems may assume it works this way. We could do a similar trick as the EWOULDBLOCK windows hack to make sure the windows version works, just add #define EAGAIN WSAEWOULDBLOCK as well.

@TrueBrain
Copy link
Member

TrueBrain commented Apr 30, 2021

Yeah, Windows is annoying. It defines EWOULDBLOCK and EAGAIN, which works fine with errno. But .. sockets do not use errno. Instead, they use WSAGetLastError and WSAWOULDBLOCK. They give 0 guarantees that EWOULDBLOCK and WSAWOULDBLOCK map to the same value. So yeah .. we need some lovely glue.

As alternative to your PR, maybe something like this works too:
master...TrueBrain:egain-vs-ewouldblock

Follows the portability advise of POSIX, without us having to worry about it ever again. Not sure it is a better solution, but while I fight that out with other devs, would you mind testing if that in fact also does solve the issue? :D

Tnx!

@ccapitalK
Copy link
Author

ccapitalK commented Apr 30, 2021

I pulled your branch, it works on serenity. I think that is a better way to solve it, I'll leave this PR open for now until you guys have figured out which approach you like. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants