Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error messages for InvalidByteSequenceError #2814

Merged
merged 1 commit into from
Jun 12, 2016

Conversation

jhass
Copy link
Member

@jhass jhass commented Jun 12, 2016

No description provided.

@asterite
Copy link
Member

It's a good idea, but I'm not sure we should do this. Try this benchmark:

require "benchmark"

io = MemoryIO.new
io << "foo bar"
io << "今日"
string = io.to_s

time = Time.now
a = 0
20_000_000.times do
  string.each_char do |char|
    a += char.ord
  end
end
puts a
puts Time.now - time

Before: 00:00:01.7733200
After: 00:00:02.1635580

Basically, doing string interpolation and more work, even if an exception is not raised, makes the optimizer not inline some things and generate more, slower code.

That's also the reason why Array#[] just raises IndexError without too much information.

@jhass
Copy link
Member Author

jhass commented Jun 12, 2016

Not being helpful is a terrible solution though. What if we make Exception.new take a block?

@asterite
Copy link
Member

Hm, in fact, I tried adding @[AlwaysInline] in a few places in Char::Reader and the times improve. I get 0.85s now. However, without this change and also using @[AlwaysInline] I get 0.41s, so raising with more info is twice as slow. I'm not sure what should we do.

@jhass
Copy link
Member Author

jhass commented Jun 12, 2016

Actually I'm not sure I even understand why it's slower, I mean unless the error condition is actually reached the code shouldn't run?

@asterite
Copy link
Member

Never mind. With @[AlwaysInline] I now get the same times. It was slow because I also had the change of chr and unsafe_chr. Using unsafe_chr brings the performance back.

@jhass
Copy link
Member Author

jhass commented Jun 12, 2016

Cool so merge and add the @[AlwaysInline] annotations afterwards? :)

And we actually can make Array#[] more verbose too?

@asterite
Copy link
Member

You can try, yes, but do some benchmarks to see if the performance doesn't degrade.

@asterite
Copy link
Member

Let's merge this, then I'll rebase and merge #2816, and then add the attribute.

@asterite asterite merged commit 5626e12 into crystal-lang:master Jun 12, 2016
@jhass
Copy link
Member Author

jhass commented Jun 12, 2016

require "benchmark"

class Array
  @[AlwaysInline]
  def at_detail(index : Int)
    at(index) { raise IndexError.new("Index out of bounds: #{index} is bigger than #{size-1}") }
  end
end


a = [1, 2, 3]

Benchmark.ips do |x|
  x.report("Array#at(1)") { a.at(1) }
  x.report("Array#at_detail(1)") { a.at_detail(1) }
  x.report("Array#at(4)") { a.at(4) rescue nil }
  x.report("Array#at_detail(4)") { a.at_detail(4) rescue nil }
end
       Array#at(1) 219.24M (± 7.00%)    1.02× slower
Array#at_detail(1) 223.84M (±10.52%)         fastest
       Array#at(4) 132.95k (± 4.97%) 1683.59× slower
Array#at_detail(4) 121.14k (± 9.93%) 1847.74× slower

No significant difference in the average case, the verbose version is even a bit faster in this benchmark arrangement due to the better cache locality, the results are swapped if the order of the benchmark is swapped, all within error margin anyway.

Do I still have to benchmark the other IndexError cases in array? :P

@asterite
Copy link
Member

Looks really good! :-) 👍

@jhass jhass deleted the better_utf8_errors branch June 16, 2016 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants