Commit
improved performance of multipart/form-data parser a little
kraih committed May 31, 2013
1 parent 87e1711 commit 52253ce
Showing 5 changed files with 37 additions and 46 deletions.
4 changes: 2 additions & 2 deletions Changes
@@ -2496,7 +2496,7 @@
   - Added referrer method to Mojo::Headers. (esskar)
   - Added finish_cb callback to Mojo::Message.
   - Added render_data method to Mojolicious::Controller.
-  - Added formdata tests for multiple browsers. (koban)
+  - Added form-data tests for multiple browsers. (koban)
   - Changed Mojolicious default secret to a slightly more secure value.
     (xantus)
   - Allow parser errors to be handled by frameworks.
@@ -2749,7 +2749,7 @@
   - Fixed a few cases where exceptions and not found events would result in
     empty pages.
   - Fixed layouts with partial templates.
-  - Fixed encoding of non utf8 form data.
+  - Fixed encoding of non utf8 form-data.
   - Fixed body callbacks to get automatic buffering.
   - Fixed a case where Mojo::Server::Daemon and Mojo::Client were too
     defensive and made them in turn 20 times faster.
59 changes: 25 additions & 34 deletions lib/Mojo/Message.pm
@@ -47,10 +47,9 @@ sub body_params {
     $params->parse($self->content->asset->slurp);
   }

-  # "multipart/formdata"
+  # "multipart/form-data"
   elsif ($type =~ m!multipart/form-data!i) {
-    !defined($_->[1]) && $params->append(@$_[0, 2])
-      for @{$self->_parse_formdata};
+    $params->append(@$_[0, 1]) for @{$self->_parse_formdata};
   }

   return $params;
@@ -205,17 +204,12 @@ sub uploads {
   my $self = shift;

   my @uploads;
-  for my $data (@{$self->_parse_formdata}) {
-
-    # Just a form value
-    next unless defined $data->[1];
-
-    # Uploaded file
+  for my $data (@{$self->_parse_formdata(1)}) {
     my $upload = Mojo::Upload->new(
       name => $data->[0],
-      filename => $data->[1],
-      asset => $data->[2]->asset,
-      headers => $data->[2]->headers
+      filename => $data->[2],
+      asset => $data->[1]->asset,
+      headers => $data->[1]->headers
     );
     push @uploads, $upload;
   }
@@ -263,19 +257,17 @@ sub _limit {
 }

 sub _parse_formdata {
-  my $self = shift;
+  my ($self, $upload) = @_;

   # Check for multipart content
   my @formdata;
   my $content = $self->content;
   return \@formdata unless $content->is_multipart;
   my $charset = $content->charset || $self->default_charset;

-  # Check all parts for form data
+  # Check all parts recursively
   my @parts = ($content);
   while (my $part = shift @parts) {
-
-    # Nested multipart content
     if ($part->is_multipart) {
       unshift @parts, @{$part->parts};
       next;
@@ -285,19 +277,18 @@ sub _parse_formdata {
     next unless my $disposition = $part->headers->content_disposition;
     my ($name) = $disposition =~ /[; ]name="?([^";]+)"?/;
     my ($filename) = $disposition =~ /[; ]filename="?([^"]*)"?/;
+    next if ($upload && !defined $filename) || (!$upload && defined $filename);
     if ($charset) {
       $name = decode($charset, $name) // $name if $name;
       $filename = decode($charset, $filename) // $filename if $filename;
     }

-    # Check for file upload
-    my $value = $part;
-    unless (defined $filename) {
-      $value = $part->asset->slurp;
-      $value = decode($charset, $value) // $value if $charset;
+    unless ($upload) {
+      $part = $part->asset->slurp;
+      $part = decode($charset, $part) // $part if $charset;
     }

-    push @formdata, [$name, $filename, $value];
+    push @formdata, [$name, $part, $filename];
   }

   return \@formdata;
@@ -382,7 +373,7 @@ Message content, defaults to a L<Mojo::Content::Single> object.
   my $charset = $msg->default_charset;
   $msg = $msg->default_charset('UTF-8');

-Default charset used for form data parsing, defaults to C<UTF-8>.
+Default charset used for form-data parsing, defaults to C<UTF-8>.

 =head2 max_line_size
@@ -429,8 +420,8 @@ Slurp or replace C<content>.
 POST parameters extracted from C<application/x-www-form-urlencoded> or
 C<multipart/form-data> message body, usually a L<Mojo::Parameters> object.
 Note that this method caches all data, so it should not be called before the
-entire message body has been received. Also note that message content needs to
-be loaded into memory to parse POST parameters, so you have to make sure it is
+entire message body has been received. Parts of the message body need to be
+loaded into memory to parse POST parameters, so you have to make sure it is
 not excessively large.

   # Get POST parameter value
@@ -486,9 +477,9 @@ Access message cookies. Meant to be overloaded in a subclass.
 Turns message body into a L<Mojo::DOM> object and takes an optional selector
 to perform a C<find> on it right away, which returns a L<Mojo::Collection>
 object. Note that this method caches all data, so it should not be called
-before the entire message body has been received. Also note that message
-content needs to be loaded into memory to parse it, so you have to make sure
-it is not excessively large.
+before the entire message body has been received. The whole message body needs
+to be loaded into memory to parse it, so you have to make sure it is not
+excessively large.

   # Perform "find" right away
   say $msg->dom('h1, h2, h3')->pluck('text');
@@ -576,9 +567,9 @@ Check if message has exceeded C<max_line_size> or C<max_message_size>.
 Decode JSON message body directly using L<Mojo::JSON> if possible, returns
 C<undef> otherwise. An optional JSON Pointer can be used to extract a specific
 value with L<Mojo::JSON::Pointer>. Note that this method caches all data, so
-it should not be called before the entire message body has been received. Also
-note that message content needs to be loaded into memory to parse it, so you
-have to make sure it is not excessively large.
+it should not be called before the entire message body has been received.
+The whole message body needs to be loaded into memory to parse it, so you have
+to make sure it is not excessively large.

   # Extract JSON values
   say $msg->json->{foo}{bar}[23];
@@ -591,9 +582,9 @@ have to make sure it is not excessively large.
   my @foo = $msg->param('foo');

 Access POST parameters. Note that this method caches all data, so it should
-not be called before the entire message body has been received. Also note that
-message content needs to be loaded into memory to parse POST parameters, so
-you have to make sure it is not excessively large.
+not be called before the entire message body has been received. Parts of the
+message body need to be loaded into memory to parse POST parameters, so you
+have to make sure it is not excessively large.

 =head2 parse
12 changes: 6 additions & 6 deletions lib/Mojo/Message/Request.pm
@@ -386,19 +386,19 @@ Check C<X-Requested-With> header for C<XMLHttpRequest> value.
   my @foo = $req->param('foo');

 Access GET and POST parameters. Note that this method caches all data, so it
-should not be called before the entire request body has been received. Also
-note that request content needs to be loaded into memory to parse POST
-parameters, so you have to make sure it is not excessively large.
+should not be called before the entire request body has been received. Parts
+of the request body need to be loaded into memory to parse POST parameters, so
+you have to make sure it is not excessively large.

 =head2 params

   my $params = $req->params;

 All GET and POST parameters, usually a L<Mojo::Parameters> object. Note that
 this method caches all data, so it should not be called before the entire
-request body has been received. Also note that request content needs to be
-loaded into memory to parse POST parameters, so you have to make sure it is
-not excessively large.
+request body has been received. Parts of the request body need to be loaded
+into memory to parse POST parameters, so you have to make sure it is not
+excessively large.

   # Get parameter value
   say $req->params->param('foo');
2 changes: 1 addition & 1 deletion lib/Mojo/UserAgent/Transactor.pm
@@ -268,7 +268,7 @@ Mojo::UserAgent::Transactor - User agent transactor
   # PATCH request with "Do Not Track" header and content
   say $t->tx(PATCH => 'example.com' => {DNT => 1} => 'Hi!')->req->to_string;

-  # POST request with form data
+  # POST request with form-data
   say $t->tx(POST => 'example.com' => form => {a => 'b'})->req->to_string;

   # PUT request with JSON data
6 changes: 3 additions & 3 deletions lib/Mojolicious/Controller.pm
@@ -601,9 +601,9 @@ L<Mojo::Transaction::WebSocket> object.
 Access GET/POST parameters, file uploads and route placeholder values that are
 not reserved stash values. Note that this method is context sensitive in some
 cases and therefore needs to be used with care, there can always be multiple
-values, which might have unexpected consequences. Also note that request
-content needs to be loaded into memory to parse POST parameters, so you have
-to make sure it is not excessively large.
+values, which might have unexpected consequences. Parts of the request body
+need to be loaded into memory to parse POST parameters, so you have to make
+sure it is not excessively large.

   # List context is ambiguous and should be avoided
   my $hash = {foo => $self->param('foo')};
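
The filtering idea behind the new `$upload` argument to `_parse_formdata` can be sketched in isolation. This is only an illustration under stated assumptions: the plain hashes and the standalone `parse_formdata` function below are hypothetical stand-ins for the `Mojo::Content` parts and the private method in the diff, and charset decoding is left out. Only the two `Content-Disposition` regexes and the skip condition are taken from the commit.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for parsed multipart parts; the real code walks
# Mojo::Content objects rather than plain hashes.
my @parts = (
  {disposition => 'form-data; name="title"', body => 'Hello'},
  {disposition => 'form-data; name="file"; filename="a.txt"', body => 'data'},
);

# Return [name, value, filename] triples, keeping only uploads when $upload
# is true and only plain form values otherwise.
sub parse_formdata {
  my ($parts, $upload) = @_;

  my @formdata;
  for my $part (@$parts) {
    my $disposition = $part->{disposition} or next;
    my ($name)     = $disposition =~ /[; ]name="?([^";]+)"?/;
    my ($filename) = $disposition =~ /[; ]filename="?([^"]*)"?/;

    # Skip parts of the wrong kind before doing any further work on them
    next if ($upload && !defined $filename) || (!$upload && defined $filename);

    push @formdata, [$name, $part->{body}, $filename];
  }

  return \@formdata;
}

# Plain form values only
print "value: $_->[0]=$_->[1]\n" for @{parse_formdata(\@parts)};

# File uploads only
print "upload: $_->[0] ($_->[2])\n" for @{parse_formdata(\@parts, 1)};
```

Because each caller asks for only one kind of part, the sketch never touches the body of a part it is about to discard. That matches the intent of the commit, where `body_params` no longer slurps upload parts and `uploads` no longer slurps and decodes plain values.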
