Major performance degradation in crypto.createHash #5015
Comments
Just for reference - it's 1000ms with https://github.com/indutny/node/compare/feature-improve-crypto-performance
Okay, confirmed. v0.10 is on average 800% slower than v0.8 and it's because we spend roughly 10% of total running time in Buffer::New() now. For comparison, in v0.8 that's only 0.1%. On top of that, because we create lots of buffers, the GC has to do a lot more work: 15-20% is spent in the GC as opposed to ~3.5% in v0.8. I think it's fair to say that the buffer-ication of the crypto API is not a resounding success. I move we revert it before v0.10.1.
Reverting the stream API here would be pretty annoying, but losing the buffer-ization would be worse. Let's explore this in more detail than just reverting the whole thing to v0.8. If we can have it use buffers, but not do the stream API, then it's easy enough to move that bit to userland while we figure out a clearer API. This is a long-standing wart, and it'd be really annoying to just go back to the awful mess of an API that it was.
Was there a reason to change the API in the first place? I am +1 on removing all the new streams stuff and leaving things as they were.
I will add that I am also in favor of not using the Buffer class here too.
@shtylman Crypto deals with bytes. It makes sense for it to deal with Buffers.
If buffers can be just as fast, then OK. But I view the current perf loss …
Is anything happening with this?
The patch from @indutny just landed on the v0.10 branch, so I assume it'll be in the next v0.10 release.
Actually, I just checked again and there is now a v0.10.1-release branch and the change log includes the patch, so there is a large speed up on v0.10.1.
Thanks @dougwilson ! Looking forward to the v0.10.1 release.
@ronkorving @3rd-Eden @dougwilson That patch from @indutny makes things much less bad, but it's still about half the speed of v0.8. There'll be another patch in 0.10.2 to get us the rest of the way. Sorry for the inconvenience.
To clarify, the problem isn't that we're using Buffers by default, but that we're converting strings to Buffers in JS-land instead of moving that code into C++-land.
Um, I don't think that code does what you think it does.
EDIT: Note that my … Let's break down the benchmark (leaving out unnecessaries):

```js
var crypto = require('crypto');

var time = process.hrtime();

// test 1
for (var i = 0; i < 1e6; i++)
  crypto.createHash('md5').update(i + '');

// test 2
for (var i = 0; i < 1e6; i++)
  crypto.createHash('md5').update(i + '').digest();
```

Results (@isaacs, I know you'll love me for reporting them like this ;-):

As a quick fix for … Not surprisingly, that hack doesn't help much in the case of … The largest chunk of time in … I'm reporting this just as an FYI to everyone asking why the performance drop, and to say I'm confident we can get it in the sub-… range.
Yikes! This is not good for us heavy crypto users :-(
Just want to give a quick update. Based on the work in #4964 here are the updated times:
@trevnorris What test code are you using for these latest numbers?
@mscdex Using the test code above. Though I tweaked the number of iterations.
@trevnorris Test 1 or test 2?
Oh. Actually the test posted by @3rd-Eden.
I'd like to clarify some of the original post. First, here are the tests that were run (each test was run individually, and the result is the avg of several iterations):

```js
var crypto = require('crypto');
var iter = 1e7;

// test 1
var hash = crypto.createHash('md5');
var time = process.hrtime();
for (var i = 0; i < iter; i++)
  hash.update(i + '');

// test 2
var time = process.hrtime();
for (var i = 0; i < iter; i++)
  crypto.createHash('md5').update(i + '').digest();

time = process.hrtime(time);
console.log((time[0] * 1e9 + time[1]) / iter);
```

Here are the times on my machine:

Though these are for strings. Let's look at updating with a 128 byte Buffer instead:
I've spent a good amount of time tracking this down yesterday and today. The vast majority of this regression is due to the fact that HashDigest now returns a buffer (of the SlowBuffer variety), instead of a binary string, and the Persistent/MakeWeak stuff is killing us. What's worse, if you do … There are a few changes that would make this much better.
With these two changes, we out-perform v0.8 considerably. As an approach, I'm thinking:
Opinions? Bottom line: The regressed crypto stuff will be a bit faster if you use a string encoding (especially 'binary' or 'ascii'), but if you want to use Buffers, then they'll be a bit slower until we can make Buffers as fast as strings (maybe 0.12, maybe 1.0).
The API looks fine but making it a template means you have to do this everywhere:

```c++
if (encoding == FOO)
  data = DataFromString<FOO>(string, &len);
else if (encoding == BAR)
  data = DataFromString<BAR>(string, &len);
else if (encoding == QUUX)
  /* and so on */
```

You probably want to make the encoding a function parameter rather than a template parameter. Another suggestion is to change the prototype to something like this:

```c++
// Could also be done as a RAII class
void* DataFromString(Handle<String> string, enum EncodingEnum encoding, void* buf, size_t* len);
```

If buf != NULL, you try to write to it (and *len should be the length), else you new[] the required memory. If the input string is small, you use an on-stack buffer to avoid incurring the overhead of new[] and delete[]. Mixing heap and non-heap pointers is error-prone, so it would make sense to hide the implementation details in a small class.
@bnoordhuis Thanks, that's a good idea. I only suggested using a template function since that's how StreamWrap did it, but I agree that a small class is probably a better way to go. I'll whip something up soon.
Work progresses on my crypto-fix-perf branch. Current benchmarks: https://gist.github.com/isaacs/5506763
Will finish up into a pull req for review tomorrow.
👍
Update. Running the OP's code, I get the following:
Is this still an ongoing effort? Are there still opportunities for improvement to explore?
Yes and yes. There's still improvement that can be had, but since a large …
Any news on this? I noticed the node.js implementation of MD5 can't compete with the one from PHP. Only if I store …
@krnlde Instantiating a new Buffer object in 0.11 is now measurable in ns. So it has little to do w/ creating the Buffer itself. It's how the data is handled once ingested by crypto. It can still be improved. I'll assign this to myself and check it out on a rainy day.
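That claim is easy to spot-check with a micro-benchmark (a rough sketch; `Buffer.allocUnsafe` is the modern equivalent of the uninitialized `new Buffer(size)` of that era, and absolute numbers depend on the machine and Node version):

```js
'use strict';
const iter = 1e6;

let t = process.hrtime();
for (let i = 0; i < iter; i++)
  Buffer.allocUnsafe(64); // uninitialized 64-byte allocation
t = process.hrtime(t);

// Prints the average cost of a single small Buffer allocation.
console.log('ns per alloc:', (t[0] * 1e9 + t[1]) / iter);
```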
So it's the first ingest which takes a long time for each Hash object? Anyways, awesome! And thank you for your effort.
Hey @trevnorris Any news on this? I read that in 0.12 there will be many improvements in the crypto lib. reference here
@krnlde There's still improvements to be made. Though that's probably always the case. Additional performance improvements will be coming after the initial v0.12 release. Most of them will be safe to land in the v0.12 branch, so you can expect v0.12 to get progressively faster.
@trevnorris ... was this pursued further?
Running in v0.12 on osx...
Going to assume this is resolved. Closing. Can reopen if new information is presented. |
WTF. I don't even.. You run it on one version without comparing against the old node to see if it's actually resolved. How does that logic even work?!
```sh
$ ~/Code/voxer/voxnode/node -e "console.log(process.version);console.time('hash');crypto = require('crypto');for (var i = 0; i < 100000; i++) crypto.createHash('md5').update(i + '').digest();console.timeEnd('hash')"
v0.8.26
hash: 161ms

$ ./iojs -e "console.log(process.version);console.time('hash');crypto = require('crypto');for (var i = 0; i < 100000; i++) crypto.createHash('md5').update(i + '').digest();console.timeEnd('hash')"
v2.0.3
hash: 180ms

$ ~/.node/0.12/bin/node -e "console.log(process.version);console.time('hash');crypto = require('crypto');for (var i = 0; i < 100000; i++) crypto.createHash('md5').update(i + '').digest();console.timeEnd('hash')"
v0.12.0
hash: 386ms
```

So it is resolved in io.js and not resolved in node v0.12.
Additionally we should really be using …
@mscdex @indutny ... do you see reason to keep this particular issue open then? The original issue showed a degradation between v0.8 and v0.10. While there are still improvements that definitely can be made, the converged stream will bring along the io.js improvements. Unless you think there's something we need to backport into v0.12, I'm inclined to defer to the converged stream.
I don't see any point in having it open.
If there are at least two people willing to do the back-porting work (one to back-port, one to review), it might be reasonable to keep the issue open. I'm going to guess however that no one is volunteering.
A really simple snippet:
Node.js 0.8.22: 329ms
Node.js 0.10.0: 2291ms
These results were generated on Mac OSX 10.8.2