Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix LibC.accept segfault #4788

Merged
merged 1 commit into from Aug 4, 2017
Merged

Conversation

RX14
Copy link
Contributor

@RX14 RX14 commented Aug 4, 2017

Neither client_addr or client_addrlen are used in the following function, but out params were used. The man page says these two arguments should be NULL if they're unused. Once I made this change, compiling spec/std/http/client/client_spec.cr in release mode stopped segfaulting.

I'm not entirely sure why this segfaulted in the first place, but it's likely to do with how client_addrlen is used by the kernel as the length of the sockaddr*. I think out client_addrlen is sending a random number on the top of the stack to the kernel. @oprypin raised the point that this shouldn't matter, since we allocated sizeof(struct sockaddr) for client_addr, and surely the kernel wouldn't write more than that? Who knows, this commit fixes it at least.

@matiasgarciaisaia
Copy link
Member

The way the kernel has to know how big client_addr is is that you state it in client_addrlen. From Beej's Networking Guide:

addrlen is a local integer variable that should be set to sizeof(struct sockaddr_storage) before its address is passed to accept(). accept() will not put more than that many bytes into addr. If it puts fewer in, it'll change the value of addrlen to reflect that.

So if client_addrlen has a random, big number - then the kernel could try to write more bytes than actually allocated by the out prefix. I think its safe to assume the memory pointed by the out variable won't be initialized by Crystal. In fact, it's arguable that the client_addrlen parameter is an out one - it should be initialized before sending it to the syscall.

I'm not sure about the portability of passing NULL as the addrlen, anyway. I can see its valid in Linux man pages, but OSX ones doesn't state anything - not sure if its legal in every platform, or something platform-specific. Any idea on that?

Lastly, any link to a report of the issue would be great.

If we can get any idea of the portability of the fix, I'm 👍 with it.

@RX14
Copy link
Contributor Author

RX14 commented Aug 4, 2017

@matiasgarciaisaia I know that providing a random number as the length is wrong, but my point is that the kernel has no reason to write more than sizeof(struct addrlen) (which is what we allocated). Perhaps this is why we haven't been getting stack corruption on every HTTP and Socket server since 56dec21.

As for the portability: it's in posix.

Freebsd and openbsd man pages say exactly the same thing:

A null pointer may be specified for addr if the address information is not desired; in this case, addrlen is not used and should also be null.

@akzhan
Copy link
Contributor

akzhan commented Aug 4, 2017

Segfaulting is bad thing, but should be reviewed anywhere else.

It's OK to provide "Need no data" to accept.

LGTM.

@matiasgarciaisaia
Copy link
Member

Long shot here: is there any chance you're sometimes getting an IPv6 connection?

The "default" struct sockaddr is 16 bytes, as is the IPv4-specific struct sockaddr_in. But the IPv6-specific is struct sockaddr_in6 is in fact 28 bytes long (source), so that's a good reason for the kernel to write extra bytes in the struct. If it checks the addrlen and it says a gazillion bytes, that allows it to use the 28 it needs - but we had actually allocated just 16.

Since the NULL behaviour is POSIX-compliant, I'm merging the fix 👍

Please do share here a link to any specific instance of the bug, anyways.

@matiasgarciaisaia matiasgarciaisaia merged commit e85746f into crystal-lang:master Aug 4, 2017
@RX14
Copy link
Contributor Author

RX14 commented Aug 4, 2017

@matiasgarciaisaia the loopback interface is ipv6, that'd do it. The spec looks up localhost (it runs a test HTTP server) which is ::1 on my system.

@ysbaddaden ysbaddaden added this to the Next milestone Aug 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants