SecureRandom performance could be better #1405

Closed
headius opened this issue Jan 15, 2014 · 31 comments
Comments

@headius
Member

headius commented Jan 15, 2014

Our SecureRandom in 1.7.10 (from securerandom.rb) was implemented using Java integration calling out to java.security.SecureRandom. It constructed a new instance each time.

In 09b169a, I moved the meat of the logic (#random_bytes) into Java code, which roughly doubled performance on my machine, bringing it close to MRI.

@bbrowning and @nirvdrum discovered a JVM property that, when passed, sets SecureRandom to use /dev/urandom or /dev/random, which improves performance further: java.security.egd=/dev/random. On my system, this again doubles performance, but the format appears to vary across JVMs.

Finally, I made a modification that constructs a SecureRandom per thread, avoiding synchronization contention, and started to move other methods into Java starting with #hex. This brought about the best performance so far, but I have not committed this change because I'm unsure about saving the SecureRandom per thread and whether that does something "bad" as far as the security of its results.

We need to evaluate all these options and come up with a good balance of security and performance, since it seems that SecureRandom is hit very heavily by frameworks like Rails.

Performance with per-thread SecureRandom and the Java property across JVMs on OS X: https://gist.github.com/headius/8428930

Patch to use a SecureRandom per thread and make #hex native: https://gist.github.com/headius/8429206
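
For illustration only, here is a minimal sketch of the per-thread idea. It is not the code from the patch linked above: the module name `PerThreadRandom` and the key `:per_thread_random_source` are hypothetical, and the non-JRuby branch falls back to Ruby's `SecureRandom` module just so the sketch runs on any Ruby.

```ruby
require "securerandom"
require "java" if RUBY_PLATFORM == "java"

# Hypothetical helper, not the code from the patch: caches one random source
# per thread so threads do not contend on a single shared instance.
module PerThreadRandom
  KEY = :per_thread_random_source # hypothetical thread-variable key

  # Returns the calling thread's source, creating it on first use.
  def self.source
    thread = Thread.current
    src = thread.thread_variable_get(KEY)
    unless src
      src = build_source
      thread.thread_variable_set(KEY, src)
    end
    src
  end

  def self.build_source
    if RUBY_PLATFORM == "java"
      java.security.SecureRandom.new # JRuby: one JDK instance per thread
    else
      SecureRandom                   # other Rubies: the module is the source
    end
  end

  # Hex string built from n random bytes drawn from this thread's source.
  def self.hex(n = 16)
    src = source
    if RUBY_PLATFORM == "java"
      bytes = Java::byte[n].new
      src.next_bytes(bytes)
      String.from_java_bytes(bytes).unpack("H*").first
    else
      src.hex(n)
    end
  end
end

puts PerThreadRandom.hex(8)
puts Thread.new { PerThreadRandom.hex(8) }.value
```

Whether keeping an instance alive for the life of a thread is acceptable from a security standpoint is exactly the open question here; the sketch only shows the caching mechanics.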

@headius
Member Author

headius commented Jan 15, 2014

FWIW, the final results in that gist are (I believe) as much as 10x faster than they used to be.

@headius
Member Author

headius commented Jan 15, 2014

An additional discovery: It appears that the JDK spins up a thread to seed the random number generator (once) for each new thread that requests random data. This appears to happen with a new SecureRandom each time, a single shared SecureRandom, or a SecureRandom created once per thread. It may be unavoidable, but it should also be hidden from us and not matter a great deal unless people are spinning up a lot of new short-lived threads and requesting random data in each.

@nirvdrum
Contributor

@headius Another option is to specify the PRNG engine when creating the SecureRandom instance. @bbrowning reportedly saw the same performance increase using SecureRandom.getInstance("SHA1PRNG") as he did when setting the System property. That would allow us to support pluggable engines, much like MRI's SecureRandom purports to do. It looks like SHA1PRNG will hit /dev/random to seed and then just SHAs values thereafter.

@headius
Member Author

headius commented Jan 15, 2014

Including @emboss.

@tarcieri

I can circulate this PR around for comments

@headius
Member Author

headius commented Jan 16, 2014

@tarcieri That would be great. We're having trouble getting any consensus, even from folks we might consider experts in this area.

@tarcieri

If you're using Java's built-in SecureRandom once per thread, and they're all just backed by /dev/urandom, I think you should be totally fine.

@headius
Member Author

headius commented Jan 16, 2014

@tarcieri They're all backed by /dev/random at the beginning, but some continue to reseed from /dev/random and others are just algorithmic after that point. They can be configured to use urandom, which definitely speeds up throughput at the cost of good entropy.

MRI appears to try to use OpenSSL's RAND functions, and if that's not available it uses urandom or the equivalent calls on Windows. I do not know how RAND compares to random/urandom.

I'm going to go with a commit that uses per-thread SecureRandom for now and keep researching.

@tarcieri

I wouldn't place too much qualitative value on numbers coming from /dev/random vs /dev/urandom. The main difference is /dev/random is allowed to block. On many platforms they're the same.

Most security-conscious software I know of uses /dev/urandom

@bbrowning
Contributor

It would be nice if the JVM defaulted to using /dev/urandom instead of /dev/random. As it stands, the only way to use /dev/urandom is to set a JVM system property, which makes for an inconvenient out-of-the-box experience. My vote is to continue using the JVM default, but switch to SHA1PRNG instead of NativePRNG so that /dev/random is only accessed to seed each SecureRandom instance.

Then we have to decide how often to create SecureRandom instances. Is one per thread enough? Or should we create one that's valid for N uses, then throw that one away and create a new one for the next N uses? Or some combination of the above? With a default Rails app this gets hit on every request, so we want to find the right balance between security and performance.
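
As a rough sketch of the "valid for N uses" option, assuming a hypothetical `RotatingSource` wrapper (the class name and its methods are made up for illustration, and the wrapper itself is not thread-safe, so it would be used one per thread):

```ruby
require "securerandom"

# Hypothetical wrapper: hands out a source built by the given block and
# replaces it with a fresh one after `limit` uses.
class RotatingSource
  attr_reader :generation

  def initialize(limit, &factory)
    @limit = limit
    @factory = factory
    @uses = 0
    @generation = 0
    @current = nil
  end

  # Returns the current source, building a new one every `limit` calls.
  def take
    if @current.nil? || @uses >= @limit
      @current = @factory.call
      @uses = 0
      @generation += 1
    end
    @uses += 1
    @current
  end
end

# On JRuby the block could return java.security.SecureRandom.getInstance("SHA1PRNG");
# Ruby's SecureRandom module stands in here so the sketch runs anywhere.
pool = RotatingSource.new(1000) { SecureRandom }
2500.times { pool.take.hex(16) }
puts pool.generation # number of sources created so far
```

The right value of N is the open question; the sketch only shows that rotation can be layered on top of whatever source we pick.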

@headius
Member Author

headius commented Jan 17, 2014

My concern about hardcoding SHA1PRNG is that it never reseeds unless we reseed it. At least /dev/urandom will try to maintain a pool of entropy as it runs. I worry that hardcoding SHA1PRNG could lead to an attacker being able to guess random numbers over time.

I've tried to figure out how to get the JDK to pick the faster urandom source without specifying the property, but I have been unsuccessful. I will take another look.

@tarcieri

You will probably get the best security properties by reading from /dev/urandom directly if it's available, rather than using it to seed another PRNG (see e.g. what happened to Android)

Trusting the OS's PRNG is probably your best bet.
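
A minimal sketch of reading the OS device directly, assuming the device lives at /dev/urandom (it does not exist on every platform, so the hypothetical helper `os_random_bytes` falls back to Ruby's SecureRandom when it is missing):

```ruby
require "securerandom"

URANDOM = "/dev/urandom" # assumed device path; absent on some platforms

# Returns n random bytes read straight from the OS device when it exists,
# otherwise from Ruby's SecureRandom.
def os_random_bytes(n)
  if File.readable?(URANDOM)
    File.open(URANDOM, "rb") { |f| f.read(n) }
  else
    SecureRandom.random_bytes(n)
  end
end

puts os_random_bytes(16).unpack("H*").first
```

Opening the device on every call is the simplest form; a real implementation would probably keep the handle open.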

@headius
Member Author

headius commented Jan 17, 2014

Ok, I think I have finally figured out how OpenJDK sets up these PRNGs.

By default, on all systems I tested, it's using a native PRNG that always reads from /dev/random. This is not the fastest one, since it has extra levels of synchronization and buffering overhead to read from a globally-shared FileInputStream.

If you specify the /dev/random device in the egd property, or request the SHA1PRNG algorithm, a different algorithm is used that initially seeds from /dev/random and then uses an algorithmic RNG based on repeated SHA1 hashing of internal state. This appears to be the fastest option, but it only seeds once. And if I'm reading the code right, it seeds a global SHA1PRNG once, and then uses bytes from that SHA1PRNG as the seed for all future instances. HOWEVER, it does appear that if we request seed bytes from SecureRandom, it always goes back to the engine, which in this case will go to the native /dev/random device. See sun.security.provider.SecureRandom, sun.security.provider.SeedGenerator.

If you specify the /dev/urandom device in the egd property, most systems appear to use a NativePRNG that uses /dev/urandom. This appears to be a bit faster than the /dev/random NativePRNG, but not as fast as the SHA1PRNG. I have not been able to find a way to select this PRNG from within an already-running JVM.

For the NativePRNG logic, see sun.security.provider.NativePRNG under the solaris src tree in OpenJDK.

If you specify a device that does not exist, OpenJDK uses a SHA1PRNG that uses a background thread to generate entropy. You can see these threads running by passing --sample to JRuby or -Xprof to the JVM.

Interestingly, the Android fix appears to go all the way to forcing /dev/urandom for all random bytes. I'm not sure if they did that because it didn't seed at all, or because it was similar to the OpenJDK impl in that it only seeded once.

Given this analysis, I think this is where we stand:

  • Specifying SHA1PRNG alone is not enough. We would at least want to ensure it gets reseeded at creation from a known native RNG. Even then, unless we're reseeding, it's a purely algorithmic RNG.
  • Both NativePRNG variants use /dev/*random devices, but it's unclear whether this is necessary to match MRI. MRI uses OpenSSL's RAND logic by default, which appears (from documentation) to be initially seeded but algorithmic thereafter. This needs more research.

If it's acceptable for MRI to use an algorithmic RNG by default, it may be acceptable for us to do the same. But if they are at risk in that case, we would not do well to follow them.

@tarcieri

If you'd like to test your random number generation empirically, you could use a tool like Ent:

http://www.fourmilab.ch/random/

I'd suggest generating a LOT (tens of gigabytes+) of random data on a variety of platforms though. Ent will, in theory, tell you if you're getting good random numbers.
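
For a quick sanity check without installing anything, a few of the byte-level statistics Ent reports (entropy per byte, chi-square against a uniform distribution, arithmetic mean) can be approximated in plain Ruby. This is only a sketch: `byte_stats` is a hypothetical helper, not part of Ent, and it says nothing about unpredictability, only about how evenly byte values are distributed.

```ruby
require "securerandom"

# Computes Shannon entropy (bits per byte), chi-square against a uniform
# distribution over 256 byte values, and the arithmetic mean of the bytes.
def byte_stats(data)
  counts = Array.new(256, 0)
  data.each_byte { |b| counts[b] += 1 }
  n = data.bytesize.to_f

  entropy = counts.reject(&:zero?).sum do |c|
    p = c / n
    -p * Math.log2(p)
  end
  expected = n / 256
  chi_square = counts.sum { |c| (c - expected)**2 / expected }
  mean = counts.each_with_index.sum { |c, b| c * b } / n

  { entropy: entropy, chi_square: chi_square, mean: mean }
end

stats = byte_stats(SecureRandom.random_bytes(1_000_000))
printf("Entropy = %.6f bits per byte\n", stats[:entropy])
printf("Chi square = %.2f (255 degrees of freedom)\n", stats[:chi_square])
printf("Arithmetic mean = %.4f (127.5 = random)\n", stats[:mean])
```

A sample of 1 MB is far smaller than the tens of gigabytes suggested above, so treat the numbers as a smoke test at best.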

@headius
Member Author

headius commented Jan 17, 2014

Update on performance changes to date.

I have made the following changes, which are all now active on jruby-1_7:

  • Per-thread instance of SecureRandom, to avoid at least a little contention.
  • The random_bytes, hex, and uuid methods are now implemented natively, since they're hard-hit by frameworks like Rails.

I have not made any change to the default PRNG provider, so it should still be using /dev/random for all data on most platforms.

Given these changes, our hex and uuid methods now appear to be the fastest of the production Ruby impls. In addition to performing better, this should ideally have less contention, but I have not measured that yet. There's synchronization at various levels of the OpenJDK SecureRandom impl, so we may just hit a different one.

jruby-1_7 numbers are at the top.

system ~/projects/jruby-1.7 $ jruby -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.hex } } }"
  1.090000   0.190000   1.280000 (  0.551000)
  0.150000   0.170000   0.320000 (  0.287000)
  0.130000   0.170000   0.300000 (  0.282000)
  0.120000   0.160000   0.280000 (  0.280000)
  0.110000   0.160000   0.270000 (  0.267000)
  0.110000   0.160000   0.270000 (  0.270000)
  0.100000   0.160000   0.260000 (  0.264000)
  0.110000   0.160000   0.270000 (  0.274000)
  0.110000   0.160000   0.270000 (  0.275000)
  0.120000   0.160000   0.280000 (  0.272000)

system ~/projects/jruby-1.7 $ jruby -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.uuid } } }"
  1.120000   0.190000   1.310000 (  0.566000)
  0.140000   0.170000   0.310000 (  0.276000)
  0.130000   0.160000   0.290000 (  0.277000)
  0.120000   0.160000   0.280000 (  0.282000)
  0.110000   0.170000   0.280000 (  0.278000)
  0.120000   0.160000   0.280000 (  0.276000)
  0.110000   0.170000   0.280000 (  0.272000)
  0.120000   0.160000   0.280000 (  0.275000)
  0.110000   0.160000   0.270000 (  0.267000)
  0.110000   0.160000   0.270000 (  0.270000)

system ~/projects/jruby-1.7 $ rvm jruby-1.7.10 do jruby -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.hex } } }"
  3.600000   0.250000   3.850000 (  1.938000)
  0.730000   0.210000   0.940000 (  0.897000)
  0.700000   0.210000   0.910000 (  0.882000)
  0.680000   0.210000   0.890000 (  0.871000)
  0.700000   0.220000   0.920000 (  0.902000)
  0.670000   0.210000   0.880000 (  0.870000)
  0.680000   0.200000   0.880000 (  0.868000)
  0.690000   0.220000   0.910000 (  0.888000)
  0.680000   0.200000   0.880000 (  0.871000)
  0.670000   0.210000   0.880000 (  0.862000)

system ~/projects/jruby-1.7 $ rvm jruby-1.7.10 do jruby -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.uuid } } }"
  4.370000   0.270000   4.640000 (  2.391000)
  0.910000   0.220000   1.130000 (  1.082000)
  0.940000   0.220000   1.160000 (  1.095000)
  0.890000   0.210000   1.100000 (  1.081000)
  0.890000   0.220000   1.110000 (  1.076000)
  0.860000   0.210000   1.070000 (  1.053000)
  0.880000   0.220000   1.100000 (  1.069000)
  0.860000   0.210000   1.070000 (  1.052000)
  0.880000   0.220000   1.100000 (  1.067000)
  0.890000   0.220000   1.110000 (  1.077000)

system ~/projects/jruby-1.7 $ rvm ruby-2.1 do ruby -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.hex } } }"
  0.340000   0.000000   0.340000 (  0.341292)
  0.330000   0.000000   0.330000 (  0.328537)
  0.340000   0.000000   0.340000 (  0.337264)
  0.320000   0.000000   0.320000 (  0.327896)
  0.330000   0.000000   0.330000 (  0.333035)
  0.320000   0.000000   0.320000 (  0.326834)
  0.330000   0.000000   0.330000 (  0.327988)
  0.330000   0.000000   0.330000 (  0.325909)
  0.330000   0.000000   0.330000 (  0.333740)
  0.350000   0.000000   0.350000 (  0.346400)

system ~/projects/jruby-1.7 $ rvm ruby-2.1 do ruby -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.uuid } } }"
  0.810000   0.000000   0.810000 (  0.814548)
  0.820000   0.000000   0.820000 (  0.818537)
  0.790000   0.000000   0.790000 (  0.793935)
  0.800000   0.000000   0.800000 (  0.803840)
  0.800000   0.000000   0.800000 (  0.799368)
  0.800000   0.000000   0.800000 (  0.796820)
  0.800000   0.000000   0.800000 (  0.797602)
  0.800000   0.000000   0.800000 (  0.806064)
  0.800000   0.000000   0.800000 (  0.797657)
  0.810000   0.000000   0.810000 (  0.810454)

system ~/projects/jruby-1.7 $ ../rubinius/bin/rbx -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.hex } } }"
  1.525987   0.014703   1.540690 (  1.166196)
  1.041410   0.002650   1.044060 (  1.021311)
  0.976423   0.002144   0.978567 (  0.965685)
  0.984604   0.002316   0.986920 (  0.969260)
  0.988570   0.002427   0.990997 (  0.971793)
  0.997175   0.002571   0.999746 (  0.986981)
  0.982239   0.002584   0.984823 (  0.972383)
  0.991014   0.002301   0.993315 (  0.969333)
  0.989509   0.002349   0.991858 (  0.981449)
  0.990885   0.002232   0.993117 (  0.980794)

system ~/projects/jruby-1.7 $ ../rubinius/bin/rbx -rsecurerandom -rbenchmark -e "10.times { puts Benchmark.measure { 100_000.times { SecureRandom.uuid } } }"
  3.133920   0.021672   3.155592 (  2.401616)
  2.095951   0.006097   2.102048 (  2.057178)
  1.985541   0.005076   1.990617 (  1.976595)
  2.006290   0.005428   2.011718 (  1.998209)
  1.987184   0.004899   1.992083 (  1.977983)
  1.998369   0.005171   2.003540 (  1.989894)
  1.980422   0.004878   1.985300 (  1.971663)
  2.002532   0.004869   2.007401 (  1.993409)
  1.996810   0.005239   2.002049 (  1.988065)
  1.997692   0.005137   2.002829 (  1.989080)

@headius
Member Author

headius commented Jan 17, 2014

@tarcieri I gave ent a try with the sha1prng, and the numbers are somewhat surprising. It appears to be nearly as good as the random devices.

My /dev/random largely matched the Chi Squared on the ent site, so I won't post them. But here's the numbers for SHA1PRNG explicitly requested at SecureRandom construction time.

system ~/projects/jruby-1.7 $ ent sha1prng.bin 
Entropy = 7.999981 bits per byte.

Optimum compression would reduce the size
of this 10000000 byte file by 0 percent.

Chi square distribution for 10000000 samples is 261.04, and randomly
would exceed this value 38.40 percent of the times.

Arithmetic mean value of data bytes is 127.5170 (127.5 = random).
Monte Carlo value for Pi is 3.142021257 (error 0.01 percent).
Serial correlation coefficient is -0.000204 (totally uncorrelated = 0.0).

system ~/projects/jruby-1.7 $ jruby -rsecurerandom -e "File.open('sha1prng.bin', 'w') {|f| 10000.times { f.write(SecureRandom.random_bytes(100_000)) } }"

system ~/projects/jruby-1.7 $ ent sha1prng.bin 
Entropy = 8.000000 bits per byte.

Optimum compression would reduce the size
of this 1000000000 byte file by 0 percent.

Chi square distribution for 1000000000 samples is 215.93, and randomly
would exceed this value 96.39 percent of the times.

Arithmetic mean value of data bytes is 127.5018 (127.5 = random).
Monte Carlo value for Pi is 3.141563293 (error 0.00 percent).
Serial correlation coefficient is 0.000070 (totally uncorrelated = 0.0).

system ~/projects/jruby-1.7 $ jruby -rsecurerandom -e "File.open('sha1prng.bin', 'w') {|f| 10000.times { f.write(SecureRandom.random_bytes(100_000)) } }"

system ~/projects/jruby-1.7 $ ent sha1prng.bin 
Entropy = 8.000000 bits per byte.

Optimum compression would reduce the size
of this 1000000000 byte file by 0 percent.

Chi square distribution for 1000000000 samples is 230.02, and randomly
would exceed this value 86.75 percent of the times.

Arithmetic mean value of data bytes is 127.5001 (127.5 = random).
Monte Carlo value for Pi is 3.141387061 (error 0.01 percent).
Serial correlation coefficient is 0.000018 (totally uncorrelated = 0.0).

This is just one metric, but it's starting to look like OpenJDK's SHA1PRNG isn't too bad.

@headius
Member Author

headius commented Aug 1, 2016

Ok, waking this one up again...

There are still outstanding reports of blocking or slow behavior using SecureRandom, like nats-io/nats.rb#123 and this comment by @cheald.

It seems we're still not using whatever is equivalent to MRI, since as far as I know MRI does not get reports of entropy-starved blocking behaviors.

So...anyone still seeing this problem that can help us explore solutions? @cheald @trinode @digitalextremist @tarcieri @wallyqs

@cheald
Contributor

cheald commented Aug 1, 2016

I haven't seen it lately since I've added entropy daemons to my standard saltstack on new machines, but it might be possible to replicate it on a fresh VM. I'll run some experiments on AWS VMs.

@cheald
Contributor

cheald commented Aug 1, 2016

I'm unable to replicate the issue in a fresh VM here. Amazon Linux, no entropy daemon. I'm keeping the entropy pool drained:

# cat /dev/random > /dev/null &
# cat /proc/sys/kernel/random/entropy_avail
4

Then installed JRuby and benchmarked SecureRandom:

jruby-9.1.2.0 :009 > File.read("/proc/sys/kernel/random/entropy_avail").strip
 => "0"
jruby-9.1.2.0 :010 > Benchmark.bm {|x| x.report("uuid") { 1_000_000.times { SecureRandom.uuid } } }
       user     system      total        real
uuid  0.480000   0.020000   0.500000 (  0.507049)

Despite the system being entropy-starved, it had no problem creating UUIDs. My previous issue may have been something other than SecureRandom. I don't recall the circumstances other than trying to re-establish my dev environment after a fresh OS install.

@headius
Member Author

headius commented Aug 1, 2016

@cheald Ok, thanks for following up. Hopefully if there's still an issue, someone else can reproduce it for us. Of course, I hope there's no longer still an issue :-)

@tarcieri

tarcieri commented Aug 1, 2016

For what it's worth, here's what I netted out on for JRuby support in the sysrandom gem:

https://github.com/cryptosphere/sysrandom/blob/master/lib/sysrandom.rb#L17

@wallyqs

wallyqs commented Aug 1, 2016

Running an entropy daemon (haveged) as @cheald suggests does seem to reduce the time a lot locally and mitigate the issue. For me at least, the issue is more pronounced in a Travis build environment without one, where the first call to something like SecureRandom.hex(13) can take several minutes (subsequent calls are fine): https://travis-ci.org/wallyqs/ruby-nats/builds/148795654
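
One way to check whether a given environment shows this slow-first-call pattern is to time the first call separately from later ones. A minimal sketch, with `elapsed` as a hypothetical helper; it has to run in a fresh process, since only the first SecureRandom call in the process is of interest:

```ruby
require "securerandom"

# Wall-clock seconds taken by the given block, using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

first = elapsed { SecureRandom.hex(13) }
later = Array.new(5) { elapsed { SecureRandom.hex(13) } }

printf("first call: %.6fs\n", first)
printf("later calls (max of 5): %.6fs\n", later.max)
```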

@headius
Member Author

headius commented Aug 5, 2016

We had another report of problems today.

I have pushed to master a config property jruby.preferred.prng that can be used to select from the available PRNGs in the JDK. On my system (OS X, OracleJDK 8u92) and on the reporter's Linux, the available PRNGs are NativePRNG, SHA1PRNG, NativePRNGBlocking, NativePRNGNonBlocking. By default, we try to use SHA1PRNG since we believed it consumes the least entropy.

To get that list on your system, run this:

Java::sun.security.jca.Providers.getProviderList.providers.each do |p|
  p.services.each do |s|
    puts s.algorithm if s.type == "SecureRandom"
  end
end

@headius
Member Author

headius commented Aug 5, 2016

Note that NativePRNGNonBlocking is new in Java 8. That's what's being used for @tarcieri's workaround above.

@tarcieri That is working well for you?

@headius
Member Author

headius commented Aug 5, 2016

I didn't really show how to use the property... it's -Djruby.preferred.prng=<NAME> passed to Java or -Xpreferred.prng=<NAME> passed to the jruby command.

I've kicked off a build for 9.1.3.0-SNAPSHOT on http://ci.jruby.org. Should be ready soon.

@tarcieri

tarcieri commented Aug 5, 2016

Haven't had complaints yet, but I'm curious how it would work in some of the scenarios where people are having problems.

@headius
Member Author

headius commented Aug 5, 2016

Given the reports we're getting about SecureRandom blocking for sometimes many seconds, I've done another commit to try the following providers in turn:

  1. jruby.preferred.prng, which now defaults to NativePRNGNonBlocking
  2. SHA1PRNG
  3. Whatever JDK would have picked by default

I've kicked off another build for http://ci.jruby.org.
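
The fallback order in the list above can be sketched roughly as follows. This is not the committed Java code: `pick_prng`, `CANDIDATES`, and `JDK_FACTORY` are hypothetical names, and the JDK branch assumes it is running on JRuby. Off the JVM, a stand-in factory exercises the same selection logic.

```ruby
# Candidate order as described above; the first name would really come from
# the jruby.preferred.prng property.
CANDIDATES = ["NativePRNGNonBlocking", "SHA1PRNG"].freeze

# Returns [name, generator] for the first candidate the factory can build,
# or [nil, nil] so the caller can fall back to the platform default.
# `factory` is any callable that returns nil for an unavailable name.
def pick_prng(candidates, factory)
  candidates.each do |name|
    generator = factory.call(name)
    return [name, generator] if generator
  end
  [nil, nil]
end

# JRuby-only factory (assumption: java.security is reachable).
JDK_FACTORY = lambda do |name|
  begin
    java.security.SecureRandom.get_instance(name)
  rescue java.security.NoSuchAlgorithmException
    nil
  end
end

if RUBY_PLATFORM == "java"
  name, prng = pick_prng(CANDIDATES, JDK_FACTORY)
  prng ||= java.security.SecureRandom.new # step 3: whatever the JDK picks
  puts "using #{name || 'JDK default'}"
else
  # Stand-in factory so the selection logic can be exercised off the JVM.
  fake = ->(n) { n == "SHA1PRNG" ? :sha1prng_stub : nil }
  p pick_prng(CANDIDATES, fake)
end
```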

@headius headius added this to the JRuby 9.1.3.0 milestone Aug 5, 2016
@headius
Member Author

headius commented Aug 5, 2016

Another discussion of SecureRandom: https://tersesystems.com/2015/12/17/the-right-way-to-use-securerandom/

@y8

y8 commented Aug 9, 2016

@tarcieri We are running absolutely everything in Docker. A few months ago we noticed that our CI build times were increasing during busy hours. Initially we thought it was caused by memory constraints on the build agents, but a couple of days ago we suddenly hit an issue where our Rails apps failed to boot in production within the 5-minute initialization window defined by the healthcheck. We were unable to reproduce this problem on developer machines, so it was clear that something was wrong with the deployment environment.

After investigating, I found that the boot process was stuck on this call:

  def generate_manifest_path
      ".sprockets-manifest-#{SecureRandom.hex(16)}.json"
  end

We checked metrics from Prometheus node_exporter and found that our entropy pool was completely dry ;)

[image: graph of the hosts' entropy pool]

We've deployed haveged on all hosts (you can clearly see that moment on the graph), and our pull-request build time immediately went from 10-15 min to just 2-5 min and the full build time from 30 min to 10 min:

[image: graph]

I have no idea what is responsible for the pool exhaustion. I suspect it might have something to do with Rancher's IPSec networking, because it's the only thing common to all hosts, and you can see that we have pretty much the same exhaustion pattern on all hosts, even those that don't run anything jruby-ish.

So no wonder that @headius heard stories about 30-minute Rails boot times :)

@headius
Member Author

headius commented Aug 9, 2016

@y8 If you are able to test my change in production, hopefully the nonblocking PRNG will be the solution to all our problems :-) We are starting to stabilize for 9.1.3.0, so let us know when you're able to confirm.

@headius
Member Author

headius commented Aug 15, 2016

Well I'm going to optimistically call this good. On Java 8, we will attempt to use the NativePRNGNonBlocking. If that's not available, or on other platforms, we will try to use the SHA1PRNG. After that we just let the JDK decide.

If someone's still able to see entropy starvation with JRuby 9.1.3.0 (master right now), please let us know ASAP.

@headius headius closed this as completed Aug 15, 2016
headius added a commit to headius/jruby that referenced this issue Sep 12, 2016
headius added a commit to headius/jruby that referenced this issue Sep 12, 2016
I botched the previous patch a bit by unconditionally re-assigning
the secureRandom local to a default JDK new SecureRandom. This
could cause systems without the default preferred PRNG
(NativePRNGNonBlocking, Java 8+) to have slower thread startup and/or
random number generation.