Fix: HTTP::FormData parsing of multipart subparts #5683

stakach · 2018-02-07T04:30:51Z

For example post data see https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2
In a multi-file upload the files are uploaded as a subpart with Content-Disposition: file; filename="file1.txt" this patch allows parsing of subparts using the standard library.

References: https://tools.ietf.org/html/rfc2388

   either of the "content-disposition: form-data"
   header or, in the case of multiple files, in a "content-disposition:
   file" header of the subpart.

For example post data see https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2 In a multi-file upload the files are uploaded as a subpart with `Content-Disposition: file; filename="file1.txt"` this patch allows parsing of subparts using the standard library. References: https://tools.ietf.org/html/rfc2388 ``` either of the "content-disposition: form-data" header or, in the case of multiple files, in a "content-disposition: file" header of the subpart. ```

bew · 2018-02-07T04:36:46Z

src/http/formdata.cr

@@ -133,7 +133,7 @@ module HTTP::FormData

    parts = content_disposition.split(';')
    type = parts[0]
-    raise Error.new("Invalid Content-Disposition: not form-data") unless type == "form-data"
+  raise Error.new("Invalid Content-Disposition: not form-data or file") unless ["form-data", "file"].includes?(type)


I think you incorrectly removed 2 spaces at the beginning.

Also, you should use a Tuple instead of an array to avoid having to allocate the array, so something like: {"form-data", "file"}.includes? type.

thanks, will update

stakach · 2018-02-07T05:14:47Z

All good @bew ?
My use case:

bew · 2018-02-07T07:11:34Z

I don't have enough knowledge on how HTTP is implemented and how this might be used to validate the relevance & implementation of your PR, but language-wise it looks better yes 😃

jhass · 2018-02-07T08:42:19Z

Actually, an added spec or two would be nice!

straight-shoota · 2018-02-07T10:04:26Z

The error message actually already states that it should work with bot form-data and file yet the implementation only accepts the former.

The change is correct, but it should be supplemented by a spec.

RX14 · 2018-02-07T14:07:36Z

This is incorrect as per the RFC. Look at the example:

   Content-Type: multipart/form-data; boundary=AaB03x

   --AaB03x
   Content-Disposition: form-data; name="submit-name"

   Larry
   --AaB03x
   Content-Disposition: form-data; name="files"
   Content-Type: multipart/mixed; boundary=BbC04y

   --BbC04y
   Content-Disposition: file; filename="file1.txt"
   Content-Type: text/plain

   ... contents of file1.txt ...
   --BbC04y
   Content-Disposition: file; filename="file2.gif"
   Content-Type: image/gif
   Content-Transfer-Encoding: binary

   ...contents of file2.gif...
   --BbC04y--
   --AaB03x--

This is a multipart/mixed with each part containing a Content-Disposition: file header contained inside a multipart/form-data.

We should discuss how we want to handle this, but this is absolutely the incorrect fix and will not parse the above data.

RX14 · 2018-02-07T14:09:42Z

To make it clear what I mean, let me indent:

Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="submit-name"

Larry
--AaB03x
Content-Disposition: form-data; name="files"
Content-Type: multipart/mixed; boundary=BbC04y

   --BbC04y
   Content-Disposition: file; filename="file1.txt"
   Content-Type: text/plain

   ... contents of file1.txt ...
   --BbC04y
   Content-Disposition: file; filename="file2.gif"
   Content-Type: image/gif
   Content-Transfer-Encoding: binary

   ...contents of file2.gif...
   --BbC04y--
--AaB03x--

RX14

A spec would have shown this was the incorrect fix.

straight-shoota · 2018-02-07T15:24:31Z

@RX14 Good catch. From the initial description of the method and the error message it seemed obvious to be the intended behaviour. But when you look at the context of how it's used, it is clear that it won't work this way.

There needs to be extra effort to support multiple file payloads.

RX14 · 2018-02-07T15:25:42Z

I think most actual browsers don't do this, they just send multiple fields with the same name, instead of nesting multipart/*

stakach · 2018-02-07T20:03:33Z

@RX14 to your point, I am extracting the indented data and then wanting to parse it. Please see: https://github.com/spider-gazelle/action-controller/blob/master/src/action-controller/body_parser.cr#L61

So this minor modification solves the issue. Otherwise I have to have an independent copy of this parsing code for something covered by the specification.

stakach · 2018-02-07T20:23:22Z

i.e. even if in the future if effort is taken to support multiple file payloads you can use this parsing code to extract those files.

Supporting the two types of Content-Dispositions makes sense at this level in the process.

RX14 · 2018-02-07T20:38:48Z

This class is for parsing multipart/form-data though, you're trying to parse multipart/mixed so you should use HTTP::Multipart. So this error is correct: you're trying to parse the incorrect format.

We should have some way to parse a Content-Disposition: file header, I entirely agree. However I think the way we want to do that should be discusses in a new issue, and this PR closed.

RX14 · 2018-02-07T20:40:41Z

The HTTP::FormData parser is actually a tiny wrapper around HTTP::Multipart so it should be easy to switch.

stakach · 2018-02-07T21:16:30Z

Thanks for the tip @RX14 - I'll re-factor using HTTP::Multipart

RX14 · 2018-02-07T23:24:21Z

@stakach You'll still have to parse the Content-Disposition header yourself. This is absolutely non-ideal. Please would you submit a feature-request for a way to parse Content-Dispostion: file headers? Thanks.

bew reviewed Feb 7, 2018

View reviewed changes

use a tuple for Content-Disposition comparison

2905894

jhass approved these changes Feb 7, 2018

View reviewed changes

jhass requested a review from RX14 February 7, 2018 08:42

RX14 requested changes Feb 7, 2018

View reviewed changes

stakach closed this Feb 7, 2018

bew mentioned this pull request Feb 8, 2018

Support parsing Content-Disposition: file headers #5686

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix: HTTP::FormData parsing of multipart subparts #5683

Fix: HTTP::FormData parsing of multipart subparts #5683

stakach commented Feb 7, 2018

bew Feb 7, 2018

stakach Feb 7, 2018

stakach commented Feb 7, 2018 •

edited

Loading

bew commented Feb 7, 2018

jhass commented Feb 7, 2018

straight-shoota commented Feb 7, 2018

RX14 commented Feb 7, 2018 •

edited

Loading

RX14 commented Feb 7, 2018

RX14 left a comment

straight-shoota commented Feb 7, 2018

RX14 commented Feb 7, 2018

stakach commented Feb 7, 2018

stakach commented Feb 7, 2018

RX14 commented Feb 7, 2018 •

edited

Loading

RX14 commented Feb 7, 2018 •

edited

Loading

stakach commented Feb 7, 2018

RX14 commented Feb 7, 2018

Fix: HTTP::FormData parsing of multipart subparts #5683

Fix: HTTP::FormData parsing of multipart subparts #5683

Conversation

stakach commented Feb 7, 2018

bew Feb 7, 2018

Choose a reason for hiding this comment

stakach Feb 7, 2018

Choose a reason for hiding this comment

stakach commented Feb 7, 2018 • edited Loading

bew commented Feb 7, 2018

jhass commented Feb 7, 2018

straight-shoota commented Feb 7, 2018

RX14 commented Feb 7, 2018 • edited Loading

RX14 commented Feb 7, 2018

RX14 left a comment

Choose a reason for hiding this comment

straight-shoota commented Feb 7, 2018

RX14 commented Feb 7, 2018

stakach commented Feb 7, 2018

stakach commented Feb 7, 2018

RX14 commented Feb 7, 2018 • edited Loading

RX14 commented Feb 7, 2018 • edited Loading

stakach commented Feb 7, 2018

RX14 commented Feb 7, 2018

stakach commented Feb 7, 2018 •

edited

Loading

RX14 commented Feb 7, 2018 •

edited

Loading

RX14 commented Feb 7, 2018 •

edited

Loading

RX14 commented Feb 7, 2018 •

edited

Loading