Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time.parse significantly slower than MRI 2.1.1 #1662

Closed
nirvdrum opened this issue Apr 25, 2014 · 7 comments
Closed

Time.parse significantly slower than MRI 2.1.1 #1662

nirvdrum opened this issue Apr 25, 2014 · 7 comments

Comments

@nirvdrum
Copy link
Contributor

In one app, we do a lot of calls like Time.parse('3PM EDT'). Not an ideal format, but it's what I have to work with. Parsing here is ~4.4x slower than MRI 2.1.1. Here's the simple benchmark code I'm using. I compared Time.parse to Time.strptime, a version that does app-specific parsing, and one that uses Joda:

require 'benchmark'
require 'time'

CYCLES = 100_000
TIME = '12PM EDT'.freeze
TIME_FORMAT = '%l%p %Z'.freeze

HOUR = '12PM'.freeze
ZONE = 'EDT'.freeze

if defined?(JRUBY_VERSION)
  JODA_FORMATTER = org.joda.time.format::DateTimeFormat.for_pattern("ha z")
end

def my_time(hour, zone)
  normalized_hour = hour.to_i
  normalized_hour += 12 if hour.end_with?('PM')

  Time.send(:make_time, nil, nil, nil, normalized_hour, nil, nil, nil, zone, Time.now)
end

def joda_time(time)
  d = JODA_FORMATTER.parseLocalTime(time).to_date_time_today
  DateTime.new!(d)
end

Benchmark.bmbm do |x|
  x.report('parse') { CYCLES.times { Time.parse(TIME) } }
  x.report('strptime') { CYCLES.times { Time.strptime(TIME, TIME_FORMAT) } }
  x.report('no parse') { CYCLES.times { my_time(HOUR, ZONE) } }
  x.report('joda') { CYCLES.times { joda_time(TIME) } } if defined?(JRUBY_VERSION)
end

JRuby 1.7.12:

Rehearsal --------------------------------------------
parse     10.080000   0.040000  10.120000 (  9.788000)
strptime  10.720000   0.020000  10.740000 ( 10.576000)
no parse   1.660000   0.000000   1.660000 (  1.619000)
joda       1.220000   0.010000   1.230000 (  1.031000)
---------------------------------- total: 23.750000sec

               user     system      total        real
parse     10.050000   0.020000  10.070000 (  9.931000)
strptime  11.170000   0.010000  11.180000 ( 10.892000)
no parse   1.690000   0.000000   1.690000 (  1.639000)
joda       1.210000   0.010000   1.220000 (  1.053000)

MRI 2.1.1p76:

Rehearsal --------------------------------------------
parse      2.290000   0.230000   2.520000 (  2.517842)
strptime   1.380000   0.220000   1.600000 (  1.596426)
no parse   1.010000   0.140000   1.150000 (  1.147728)
----------------------------------- total: 5.270000sec

               user     system      total        real
parse      2.350000   0.200000   2.550000 (  2.555987)
strptime   1.450000   0.190000   1.640000 (  1.636918)
no parse   0.970000   0.180000   1.150000 (  1.154804)
@nirvdrum
Copy link
Contributor Author

I should probably note that the joda code doesn't seem to take the timezone into account. I'm not overly familiar with its API, so I got more of a PoC going rather than something that'd be an exact replacement.

@atambo
Copy link
Member

atambo commented Apr 26, 2014

This might also be the underlying cause of jruby/activerecord-jdbc-adapter#540.

@headius
Copy link
Member

headius commented Jun 2, 2014

@chrisseaton The methods involved here are heavy Ruby code, running regexps and chopping up strings. Might be a good candidate for Truffle experimentation?

@headius
Copy link
Member

headius commented Jun 2, 2014

I did some profiling passes for Time.parse, mainly looking for heavy allocation. I came up with two patches: https://gist.github.com/headius/ca4c8f5c4f70c794950a

The first patch improves the performance of Hash#delete_if and #reject by passing arguments straight through if possible, rather than creating a temporary array every time. I have not checked that this passes tests, but I believe it should.

The second patch allows the "Bag" class's accessor methods to fill out, avoiding creation of multiple String instances on every access. I have not investigated whether too many methods will be defined using this patch.

The remaining big ticket items in an allocation profile were all related to the many little regexp matches going on. Unless there's a way for us to reuse a Joni ByteCodeMachine, these will remain expensive or have to change to non-regexp logic.

First hundred entries of allocation profile here: https://gist.github.com/headius/534f5ec7be7821d21e9a

Before and after (Time.parse part of @nirvdrum's bench only) numbers are improved, but still not good enough.

BEFORE:

system ~/projects/jruby $ jruby ../rubybench/time/bench_time_parse.rb
 18.760000   0.300000  19.060000 (  8.266000)
  5.220000   0.040000   5.260000 (  4.978000)
  5.120000   0.030000   5.150000 (  4.892000)
  4.970000   0.030000   5.000000 (  4.913000)
  4.940000   0.020000   4.960000 (  4.897000)

AFTER:

system ~/projects/jruby $ jruby ../rubybench/time/bench_time_parse.rb
 18.340000   0.320000  18.660000 (  7.784000)
  4.170000   0.040000   4.210000 (  3.979000)
  4.000000   0.030000   4.030000 (  3.898000)
  3.970000   0.020000   3.990000 (  3.911000)
  3.940000   0.020000   3.960000 (  3.891000)

I'll add the bench to jruby/rubybench.

@headius
Copy link
Member

headius commented Jun 2, 2014

MRI 2.1.1 numbers on my system, for comparison:

system ~/projects/jruby $ rvm ruby-2.1.1 do ruby ../rubybench/time/bench_time_parse.rb
  3.200000   0.010000   3.210000 (  3.215548)
  3.250000   0.010000   3.260000 (  3.257069)
  3.160000   0.000000   3.160000 (  3.163706)
  3.170000   0.000000   3.170000 (  3.168295)
  3.220000   0.010000   3.230000 (  3.230793)

@chrisseaton
Copy link
Contributor

Thanks for the idea for looking at this with Truffle - missed your comment @headius up until now. @djoooooe is currently completing our implementation of the core class Time, so we can look at #parse and the rest of the stdlib version of Time when that's working.

@kares
Copy link
Member

kares commented Jan 17, 2017

closing this one in favour of #3640 (same underlying issue)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants