Securerandom and KVM virtual machines #1896

Closed
trinode opened this issue Aug 10, 2014 · 12 comments

@trinode

trinode commented Aug 10, 2014

I've been using JRuby in development for a while, and I've now set up a virtual machine host here as the staging server; the production server is hosted at Linode (I mention that because it affects that server too, and they know what they're doing...).

Here's a profile I ran while trying to work out why various things were taking forever, including starting my tests and starting the site in Puma.

main profile results:
Total time: 384.40

     total        self    children       calls  method
----------------------------------------------------------------
    381.72        0.09      381.63        5692  Kernel.require
    379.42        0.21      379.21          54  Kernel.load
    378.98        0.14      378.84        9149  Array#each
    378.50        0.28      378.22         139  Kernel.eval
    377.02        0.00      377.02           1  Rake::Application#run
    377.02        0.00      377.02           3  Rake::Application#standard_exception_handling
    375.16        0.00      375.16        5620  Proc#call
    373.16        0.01      373.15         905  ActiveSupport::Dependencies::Loadable.load_dependency
    373.07        0.01      373.07         880  ActiveSupport::Dependencies::Loadable.require
    372.16        0.00      372.16           1  Rake::Application#top_level
    372.16        0.00      372.16           1  Rake::Application#run_with_threads
    372.16        0.00      372.16           1  Rake::Application#invoke_task
    372.16        0.00      372.16           3  Rake::Task#invoke
    372.16        0.00      372.16           7  Rake::Task#invoke_with_call_chain
    372.16        0.00      372.16          34  MonitorMixin.mon_synchronize
    372.16        0.00      372.16           5  Rake::Task#execute
    372.00        0.05      371.96        7498  Class#new
    371.95        0.00      371.95           7  TSort.tsort_each
    371.95        0.00      371.95           7  TSort.each_strongly_connected_component
    371.94        0.01      371.93         321  TSort.each_strongly_connected_component_from
    371.89        0.00      371.89           5  Rake::Task#invoke_prerequisites
    371.87        0.00      371.87           1  Rails::Application#require_environment!
    371.87        0.00      371.87           1  Rails::Application#initialize!
    371.87        0.00      371.87           1  Rails::Initializable.run_initializers
    371.83        0.02      371.81         102  BasicObject#instance_exec
    371.66        0.00      371.66          89  Rails::Initializable::Initializer#run
    370.71        0.01      370.70          51  BasicObject#instance_eval
    370.61        0.00      370.61          30  ActiveSupport.execute_hook
    370.58        0.00      370.58           7  ActiveSupport.run_load_hooks
    369.39        0.00      369.39          38  ActiveSupport.on_load
    369.35        0.00      369.35           1  Sprockets::Manifest#initialize
    369.35      369.35        0.00           2  SecureRandom.hex
      4.76        0.00        4.76           1  Rake::Application#load_rakefile
      4.76        0.00        4.76           1  Rake::Application#raw_load_rakefile
      4.76        0.00        4.76           1  Rake.load_rakefile
      4.10        0.00        4.10           1  Gem::ExecutableHooks.run
      3.53        0.00        3.53         106  Kernel.require
      2.88        0.00        2.88           1  Noexec#check
      2.88        0.00        2.88           1  Noexec#setup
      2.42        0.00        2.42           1  Noexec#candidate?
      1.87        0.02        1.84         144  Gem::Specification.load
      1.69        0.02        1.67           2  Bundler.load
      1.62        0.03        1.59           2  Bundler.definition
      1.55        0.09        1.45        1635  Array#map
      1.53        0.02        1.50           1  Bundler::Definition.build
      1.48        0.00        1.48           1  Bundler::Dsl.evaluate
      1.41        0.00        1.41           1  Bundler.require
      1.41        0.00        1.41           1  Bundler::Runtime#require
      1.32        0.00        1.32           2  Gem::Specification.each
      1.32        0.00        1.32           3  Gem::Specification._all

That's over 3 minutes per call to SecureRandom. It turns out that on virtual machines (my CI server, my self-hosted staging server, and my Linode-hosted production box) very little entropy is available: you can cat /dev/random and it runs out of data pretty fast, while /dev/urandom is not affected.
So I did some research and found I could force the JVM to use urandom (-J-Djava.security.egd=/dev/urandom); with that, the timing of SecureRandom becomes perfectly good.

One thing I noticed, though: if the machine was left alone for a while, the /dev/random buffer had enough entropy backed up to satisfy SecureRandom immediately for the first few calls.

I use SecureRandom a fair amount for generating UUIDs, so it's really crippling my app. I have the workaround of using urandom for now, but I bet a lot of people won't dig that far into it and will assume that their app (which can take up to 8 minutes to start in my case) is simply crashing on JRuby, or that the VM they use hasn't got the horsepower for JRuby.
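
For anyone trying to confirm the same symptom without a full profiler run, a minimal Ruby sketch along these lines may help. It only times two consecutive `SecureRandom.hex` calls; `timed_hex` is a hypothetical helper name introduced here, not something from the thread or from JRuby.

```ruby
require 'securerandom'
require 'benchmark'

# Hypothetical helper (not part of any library): time one SecureRandom.hex
# call and return [value, elapsed_seconds].
def timed_hex(bytes = 16)
  value = nil
  seconds = Benchmark.realtime { value = SecureRandom.hex(bytes) }
  [value, seconds]
end

# Comparing two calls shows whether the delay is a one-off or repeats.
first_value, first_seconds = timed_hex
_second_value, second_seconds = timed_hex

puts format('first call:  %.3fs', first_seconds)
puts format('second call: %.3fs', second_seconds)
```

If either call takes seconds rather than a fraction of a millisecond, a blocking seed source is a plausible suspect, though this sketch alone cannot prove it.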

@nirvdrum
Contributor

What version of JRuby are you running? 1.7.11 still starts by seeding from /dev/random, but massively reduces the number of times this is necessary. More details:

http://blog.mogotest.com/2014/03/11/faster-securerandom-in-jruby-1.7.11/

@trinode
Author

trinode commented Aug 21, 2014

I've been using 1.7.12. I think the problem is only apparent when /dev/random is depleted: it takes quite some time for available randomness to build up, so if it runs out you can see the 2 calls to SecureRandom.hex took over 3 minutes each... /dev/random really is that slow to provide data on KVM VMs.

@nirvdrum
Contributor

In that case, I think your only real option is to override the seed source as you've found. While seeding from /dev/urandom is an option (and apparently is a fallback in MRI), I think it was determined to be too risky to make such a sweeping change in a point release. There was also some disagreement as to how to interpret the Linux man page warning about /dev/urandom not being secure. @headius can probably provide a bit more insight there.

In the meanwhile, if the risk is appropriate for you, you do have a valid solution in front of you.
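
As a rough illustration of applying that override consistently, the Ruby sketch below builds a `JRUBY_OPTS` value that includes the seed-source flag unless one is already set. Several things here are assumptions rather than facts from the thread: `with_urandom_seed` and `EGD_FLAG` are hypothetical names, the sketch assumes the JRuby launcher honors `-J` flags placed in `JRUBY_OPTS`, and it uses the `file:` URL form of the property where the thread's author passed a bare path. Some older JDKs are reported to need `file:/dev/./urandom` instead, so check the behavior on your own JVM.

```ruby
# Assumption: the property value is written as a file: URL. The thread's
# author used the bare path /dev/urandom instead.
EGD_FLAG = '-J-Djava.security.egd=file:/dev/urandom'

# Hypothetical helper: append the seed-source override to an options string
# unless some java.security.egd setting is already present.
def with_urandom_seed(opts)
  opts = opts.to_s.strip
  return opts if opts.include?('java.security.egd')

  [opts, EGD_FLAG].reject(&:empty?).join(' ')
end

# Print the value you might export as JRUBY_OPTS before starting the app.
puts with_urandom_seed(ENV['JRUBY_OPTS'])
```

Leaving an existing `java.security.egd` setting untouched is a deliberate choice in this sketch, so that a host that has already decided on a seed source is not silently overridden.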

@trinode
Author

trinode commented Aug 21, 2014

I'm okay with using this workaround.

If, as you said, there was a deliberate decision to avoid changing to urandom right now, then it may be worth mentioning in the docs, for someone who's at a loss as to why their app takes 5 minutes to start or why actions they think are lightweight take equally long. It cost me a few days trying to work out what I was doing wrong.

Hosting on Linode and other KVM hosts has got to be fairly popular these days so I would say it's a common use case for JRuby.

@nirvdrum
Contributor

Ultimately, it's doing what Java would do. Arguably the host should ensure a source of entropy, given how many applications do use /dev/random. Having said that, adding it to the wiki sounds reasonable. (And FWIW, I'm not on the dev team . . . just a guy that spent a lot of time trying to speed up Rails in JRuby).

@nirvdrum
Contributor

Put another way, if /dev/random is causing you to block, someone else on that machine must be aggressively reading it. Sounds like a good DoS vector. I'd just think the host would have a way to guard against this, like with CPU steal.

@trinode
Author

trinode commented Aug 21, 2014

You almost had me for a minute there! ;)

KVM doesn't distribute the host's /dev/random data amongst the guest VMs; they have to make their own (which is a lot harder when you've not got physical devices with the associated noise). I validated this with my own KVM host: I shut down all the other guests and there was virtually no data coming through /dev/random, but when I logged in to the host, there was plenty.

@pjlegato

@nirvdrum, a lot of headless virtual servers don't have keyboards or mice and only rarely get network packets, which often leads to a depleted entropy pool even under light use.

There seems to be some controversy over whether /dev/urandom is an acceptable substitute for /dev/random in such cases. If /dev/urandom is not an option for whatever reason, there's a daemon called haveged that adds entropy to the pool by re-running the same code many times and drawing entropy from the minute variations in execution time. It can't hurt to have more entropy, in any case.

More info: https://www.digitalocean.com/community/tutorials/how-to-setup-additional-entropy-for-cloud-servers-using-haveged
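
To see whether the pool is actually low before and after installing something like haveged, a small Ruby sketch such as the following could be used. It reads the kernel's entropy estimate from `/proc/sys/kernel/random/entropy_avail`, which is Linux-specific; `entropy_available` and `ENTROPY_PATH` are hypothetical names introduced for this example.

```ruby
# Linux exposes the kernel's entropy estimate (in bits) at this path.
ENTROPY_PATH = '/proc/sys/kernel/random/entropy_avail'

# Hypothetical helper: return the estimate as an Integer, or nil when the
# file is missing or unreadable (non-Linux systems, some containers).
def entropy_available(path = ENTROPY_PATH)
  return nil unless File.readable?(path)

  Integer(File.read(path).strip)
rescue ArgumentError, SystemCallError
  nil
end

bits = entropy_available
puts bits ? "entropy_avail: #{bits} bits" : 'entropy_avail not available on this system'
```

Running it a few times on the guest and on the host gives a rough comparison; the number is only the kernel's estimate, so treat it as a hint rather than a measurement.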

@headius
Member

headius commented May 5, 2015

I never noticed this bug, but we had another user discover that an entropy server helped their startup time. This is fascinating!

I will be adding an entry to the wiki page on startup time. @trinode how much did your times improve once you set up an entropy server?

@trinode
Author

trinode commented May 5, 2015

@headius I managed to make do with /dev/urandom and never used an entropy server, but switching shaved that 6-minute delay off the startup.

I've had to go back to MRI for now (needing things like activerecord-postgis-adapter and other things which seem to not support JRuby in a timely fashion).

@digitalextremist
Contributor

I've experienced this many times with Celluloid trying to generate UUIDs, and installing haveged solved the issue.

On Debian-like Linux flavors such as Mint and Ubuntu, this gets that done:

sudo apt-get install haveged

@kares
Member

kares commented Jun 22, 2017

JRuby prefers the non-blocking variant when available + SecureRandom has been updated to re-use the same path for generating random values (in 9.1.11/12) ... closing
