-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP Multipart/FormData support #3844
Conversation
src/http/common.cr
Outdated
@@ -223,6 +223,24 @@ module HTTP | |||
nil | |||
end | |||
|
|||
# Dequotes a RFC2616 quoted-string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe show example code?
src/http/multipart/generator.cr
Outdated
|
||
# Finite State Machine diagram: https://gist.github.com/RX14/221c1edfa98d1196711515d4b5c264eb | ||
|
||
# Generates a content type header with multipart subtype *subtype*, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "Returns"? I initially thought this was appended to the generated content (because it says "generates" and this is a Generator class)
src/http/multipart/generator.cr
Outdated
end | ||
|
||
# Yields an IO that can be used to write to a body part which is appended | ||
# to the multipart message with the given *headers*. Throws is `#finish` or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Throws if"
require "spec" | ||
require "http" | ||
|
||
def mp_parse(delim, data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private def
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was written before those existed!
src/http/multipart/parser.cr
Outdated
begin | ||
yield headers, body_io | ||
ensure | ||
# Read to end of body |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here you can use IO#skip_to_end
method that I added some days ago
(I'm happy to see more cases where this method is useful ^_^)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think i'll send a PR which increases the buffer size of skip_to_end
, gets_to_end
so they trigger IO::Buffered
to skip the buffer (>=4096).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I can commit that directly. What buffer size would be good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this as a commit to this PR, you can cherry-pick it to master if you want.
src/http/formdata.cr
Outdated
# form_data = "--aA40\r\nContent-Disposition: form-data; name=\"field1\"\r\n\r\nfield data\r\n--aA40--" | ||
# HTTP::FormData.parse(IO::Memory.new(form_data), "aA40") do |field, io| | ||
# field # => "field1" | ||
# data # => "field data" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io.gets_to_end # => "field data"
src/http/formdata.cr
Outdated
# | ||
# HTTP::FormData.parse(request) do |field, io| | ||
# field # => "field1" | ||
# data # => "field data" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io.gets_to_end # => "field data"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also missing how to access the meta
and headers
variables and what they are. Same above.
@RX14 Overall looks really good, I really like it! I don't know why I rejected it some time ago... sorry about that 😊 I made a few comments for fixes. After that I'll merge it, so we can have it for 0.20.4. Thank you for this! 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to look good overall. It's quite a large pull request, so I didn't delve too deep, yet noted a few things.
I would rename "generate" to "build", which would fit better in the stdlib, overall (e.g. String.build
, XML.build
, ...).
The public API looks good. We can quickly iterate the fields, values and have direct IO access. Though making the IO access required is maybe too much. Yielding many variables, too. Maybe a single Data
struct, with #headers
, #name
, #value
(aka io.gets_to_end
or something more efficient), #io
, and moving methods from Metadata
directly to this struct would be nicer to work with.
Last but no least, everything is documented. Yet, I'm missing some high level examples for everyday use cases:
- how do I build/send a multipart body with HTTP::Client (string, file, ...);
- how do I parse the multipart body of a HTTP::Request (string, file, ...).
src/http/multipart.cr
Outdated
end | ||
|
||
# Parses the multipart boundary from the value of the Content-Type header, | ||
# or nil if the boundary was not found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about?
Extracts the multipart boundary from the Content-Type header.
May returnnil
if the boundary was not found.
# | ||
# Please note that the IO object yielded by `#next` is only valid until the | ||
# block returns. | ||
class Parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about a Parser#each
iterator?
# Yields each part until the whole multipart message is consumed. See `#next` for details.
def each
while has_next?
self.next do |headers, io, meta, headers|
yield headers, io, meta, headers
end
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That usecase is meant to be HTTP::Multipart.parse
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's missing headers and meta?
src/http/multipart/generator.cr
Outdated
end | ||
|
||
# Finalizes the multipart message, this method must be called before the | ||
# generated multipart message written to the IO is considered valid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe?
this method must be called to properly end the multipart message.
src/http/formdata.cr
Outdated
# | ||
# HTTP::FormData.parse(request) do |field, io| | ||
# field # => "field1" | ||
# data # => "field data" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also missing how to access the meta
and headers
variables and what they are. Same above.
src/http/formdata.cr
Outdated
# io.to_s # => "HTTP/1.1 200 OK\r\nContent-Type: multipart/form-data; boundary=\"boundary\"\r\n ... | ||
# ``` | ||
# | ||
# See: `FormData::Generator` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- An example explaining how to create/send a multipart request with
HTTP::Client
would be more interesting (common) than a multipart response (quite uncommon). - The example could use
generator.boundary
(automatically generated) which is a better practice than the fixed"boundary"
string. The example may then use whatever boundary that would have been generated.
def self.parse(io, boundary) | ||
parser = Parser.new(io, boundary) | ||
while parser.has_next? | ||
parser.next { |headers, io| yield headers, io } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it missing meta
and headers
? If not, how are we supposed to access them? Same below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is plain multipart, not multipart/form-data. Field names and file metadata are specific to the multipart/form-data
mime-type.
src/http/multipart.cr
Outdated
|
||
# Yields a `Multipart::Generator` to the given block, returning the generated | ||
# message as a `String`. | ||
def self.generate(boundary = "--------------------------#{SecureRandom.urlsafe_base64(18)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about extracting the default boundary generation to a helper method, to avoid repeating it over and over?
I agree with all of @ysbaddaden comments. Let's postpone this PR for after 0.20.4 because I guess some changes are needed yet and I wanted to prepare a release today. |
dde4072
to
6a3abb1
Compare
Updated the PR with my work today, github is fairly accurate on which points I have addressed and which I haven't. I'll split commits from the PR out into separate PRs on request. |
There seems to be some confusion between The For that reason, |
Yes, I got confused. Thanks for the explanation. Could you detail this in the docs for the different classes? Mostly insist on what they each apply to, and maybe note in In that light, I don't think the module helpers in |
src/http/formdata/parser.cr
Outdated
|
||
struct Part | ||
getter name : String | ||
getter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like there's some code missing in here, I guess it's WIP :)
6a3abb1
to
1981831
Compare
Sorry for the super-extended delay replying to these issue comments. I've done quite a bit of work on the PR, and it's ready to be reviewed again. |
spec/std/http/http_spec.cr
Outdated
@@ -7,6 +7,11 @@ describe HTTP do | |||
HTTP.parse_time("Sun, 06 Nov 1994 08:49:37 GMT").should eq(time) | |||
end | |||
|
|||
it "parses RFC1123 without day name" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space between RFC
and its number would do nice ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
1981831
to
e3ffafb
Compare
I've rebased onto master so the CI should be fixed on OSX now. @ysbaddaden, another review would be appreciated :) |
It'd be great to have this at next Crystal release please 🙏 👍 |
@RX14 sorry, I've been quite busy lately, and haven't found the time to review this, yet. |
src/http/common.cr
Outdated
# ``` | ||
def self.quote_string(string, io) | ||
string.each_byte do |byte| | ||
case QUOTE_LOOKUP_TABLE[byte] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We think it's better here to just do case
on the byte, without a lookup table. The code will be clearer and it won't be unsafe (because it uses Pointer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. Do we know if LLVM will transform case statements to lookup tables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I going crazy or is this a compiler bug: https://gist.github.com/RX14/e2dd192c7cb0e7cfb2278c2c46f3bea3 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The when takes the value of the comparison, so its comparing against a bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it's like you are doing if byte == (byte <= 0x1F)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asterite I would have sworn that i've seen case used like that before. I guess I shouldn't code at 1am.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can definitely do that if you omit the case expression. Then it becomes like if elsif but with all the conditions aligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asterite committed the new version using case
, and it's even 4% faster!
src/http/formdata/generator.cr
Outdated
# builder.finish | ||
# io.to_s # => "--aA47\r\nContent-Disposition: form-data; name=\"name\"\r\n\r\njoe\r\n--aA47\r\nContent-Disposition: form-data; name=\"upload\"; filename=\"test.txt\"\r\n\r\nfile contents\r\n--aA47--" | ||
# ``` | ||
class Builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rename the file (and the specs) to be named "builder" instead of "generator", now that the types are also named "Builder"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry I thought I'd renamed them all! Thanks for the catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asterite Renamed them
e3ffafb
to
faec93a
Compare
faec93a
to
4fa73fb
Compare
@ysbaddaden next time please can you leave a PR review instead of a commit comment, the commit comments get a bit lost when I force-push. |
Interesting: this build failed due to lack of memory https://travis-ci.org/crystal-lang/crystal/jobs/201479637 |
Finally! 🎉 🎉 🎉 |
Lets try again!