[WIP] Support http server request streaming #3395
Conversation
I think that we should get rid of `body : String`.
@RX14 I think for an HTTP client this is less dangerous. For example, if I want to consume GitHub's API to fetch a repo's info, the string will be relatively small, so that won't be a problem. In those cases getting the body as a string is pretty convenient. And parsing JSON from a small string is much faster than from an `IO`, mostly because you can create string subslices from the original string, but with an `IO` you can't (at least not easily) do that. For a server it is a bit more dangerous, because someone can send a big body and you can't control that. So in that case I agree that only `body_io` should be exposed.
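(A quick way to see the String-vs-IO parsing difference would be something like this sketch; the payload and setup are illustrative, not a real measurement:)

```crystal
require "benchmark"
require "json"

# Small, made-up payload standing in for a typical API response
json = %({"name": "crystal", "stargazers_count": 1234})

Benchmark.ips do |x|
  x.report("parse from String") { JSON.parse(json) }
  x.report("parse from IO") { JSON.parse(MemoryIO.new(json)) }
end
```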
Unfortunately then you have an inconsistency. I would argue that the body object of every {server,client} {request,response} should be of the same type, so that all types follow the same basic interface. Having wildly different interfaces for request and response depending on whether it's client or server violates the principle of least surprise. Maybe both Response types should include an `HTTP::Request` module which provides basic common methods. I would also argue that the body should be read-once, and that request and response types be immutable structs. This has performance advantages, but also makes the implementations much simpler, without having to synchronise many different fields with mutable caches.
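A rough sketch of that shared-module idea; the module name and methods here are invented for illustration, not actual stdlib API:

```crystal
# Hypothetical common interface for {server,client} {request,response}
module HTTP::Message
  # Single source of truth: a read-once IO, or nil when there is no body
  abstract def body_io : IO?

  # Convenience built on the abstract accessor; consuming the IO matches
  # the read-once semantics argued for above
  def body : String?
    body_io.try &.gets_to_end
  end
end
```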
@asterite I can confirm that the memory stays the same, even for really big files (1 GB) with multipart parsing. However, the parsing got 4x slower with this patch. Before: 6324.3 ms. WDYT?
@sdogruyol Cool! Parsing of what? :-)
@asterite parsing of a 1 GB file (ubuntu-14.04-desktop.img) upload.
Force-pushed from 59c35f3 to aa12884
@sdogruyol I think it's expected that it's going to be slower, because parsing from a string or `MemoryIO` is faster than parsing from an `IO` over a socket. However, it shouldn't be that much slower. I profiled the code and found a few places where I could optimize things. Could you try again? (Warning: I force-pushed.) I don't think it will be equally fast as before, but maybe 2x slower or similar...
Using a larger buffer size in IO.copy might help. 8KB is probably a good size to fit in L1 cache. |
Before: 26427.92 ms. More than 2x faster 🎉 But still 2 times slower than the previous one 😿
@sdogruyol Try using:

```crystal
# 8 KB stack-allocated buffer (a StaticArray, so no heap allocation per copy)
buffer = uninitialized UInt8[8192]
while (len = src.read(buffer.to_slice).to_i32) > 0
  dst.write buffer.to_slice[0, len]
end
```

instead of `IO.copy`.
BTW I do think that this is pretty good for 1.0. After that we can think of a way to speed it up more.
Closed in favor of #3406
(work in progress: do not merge)

Right now `HTTP::Server` has a problem: it consumes the whole request body into a string, every time. So if you post a big body to a server, memory consumption will go pretty high.

To solve this, this PR splits `HTTP::Request` into two classes: `HTTP::Client::Request` and `HTTP::Server::Request`. `HTTP::Server::Request` exposes two properties:

- `body_io : IO?`: an `IO` to the request's body. Will be `nil` if the request has no body (for example in a `GET` request).
- `body : String`: if you call this, it will consume the `body_io` and give you back a `String` (and memoize it).

The server will automatically read the request for you (without storing it into a string) if you write to the response. I tried posting big strings into the server: memory consumption would go up to a few gigabytes before, and now it remains stable at 1 MB.
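Here is a minimal sketch of what a streaming handler could look like on top of this PR. The handler body is illustrative (it just counts bytes); the point is that memory stays flat no matter how big the upload is:

```crystal
require "http/server"

server = HTTP::Server.new(8080) do |context|
  if body_io = context.request.body_io
    # Stream the body chunk by chunk instead of calling `body`
    size = 0_i64
    buffer = uninitialized UInt8[8192]
    while (len = body_io.read(buffer.to_slice)) > 0
      size += len
    end
    context.response.print "read #{size} bytes"
  else
    context.response.print "no body"
  end
end

server.listen
```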
This is a WIP because:

- I'm not sure about the `body` property, because invoking it can potentially bring back the old problem. Also, invoking `request.body_io.try &.gets_to_end` isn't very complex either, and it makes you think "Hmm... this could potentially load the whole thing into memory". But in most cases you could feed the `IO` directly to a parser, for example to parse JSON or multipart data.

Once we merge this, one last thing remains: supporting streaming HTTP client requests. That is, if you want to do a big POST and, instead of building a big string, you want to write directly into the socket, you can't do that right now. I imagine providing a callback to `HTTP::Client::Request` where you write to the `IO` would be OK. But this is the subject of another PR :-) (A purely hypothetical sketch of that idea follows below.)

This change is of course backwards incompatible, but necessary.
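For illustration only, here is what that deferred client-streaming callback could look like. This API does not exist; the block parameter and its semantics are purely hypothetical:

```crystal
# Purely hypothetical API: the block receives the request body IO,
# so a big upload streams straight to the socket instead of being
# built up as one big String first.
HTTP::Client.post("http://example.com/upload") do |body_io|
  File.open("ubuntu-14.04-desktop.img") do |file|
    IO.copy(file, body_io)
  end
end
```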