'bad_record_mac' when uploading things through SSL under particular conditions #1331

Closed
simonask opened this issue Dec 12, 2013 · 19 comments

@simonask

JRuby version:
jruby 1.7.9 (1.9.3p392) 2013-12-06 87b108a on Java HotSpot(TM) 64-Bit Server VM 1.7.0_45-b18 [darwin-x86_64]

I'm occasionally getting an SSLError with the message "bad_record_mac" when trying to upload files to S3 through Fog. Since the error seems to happen for other people using other libraries (see rubygems/rubygems.org#615, for instance), it seems likely to be a bug in JRuby or even the JRE.

The issue seems to crop up when the data size goes beyond a certain limit — in my case, I could make it "work" by restricting file size to <100K, which obviously isn't a very good long-term solution. There is more info in the comment thread pertaining to the issue in RubyGems I referenced above.

Running JRuby with JRUBY_OPTS="-J-Djavax.net.debug=ssl" yielded a lot of output. I have tried to limit it to the lines leading up to the failure. Note that handshake seems to go fine, but the error happens during the transmission itself:

***
main, WRITE: TLSv1 Handshake, length = 48
main, READ: TLSv1 Change Cipher Spec, length = 1
main, READ: TLSv1 Handshake, length = 48
*** Finished
verify_data:  { 164, 218, 115, 72, 131, 182, 208, 68, 120, 45, 38, 218 }
***
%% Cached client session: [Session-2, TLS_RSA_WITH_AES_128_CBC_SHA]
main, WRITE: TLSv1 Application Data, length = 382
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 15846
main, WRITE: TLSv1 Application Data, length = 1
main, WRITE: TLSv1 Application Data, length = 14603
main, READ: TLSv1 Alert, length = 32
main, RECV TLSv1 ALERT:  fatal, bad_record_mac
main, fatal: engine already closed.  Rethrowing javax.net.ssl.SSLException: Received fatal alert: bad_record_mac
main, fatal: engine already closed.  Rethrowing javax.net.ssl.SSLException: Received fatal alert: bad_record_mac
main, called closeOutbound()
main, closeOutboundInternal()
main, SEND TLSv1 ALERT:  warning, description = close_notify
main, WRITE: TLSv1 Alert, length = 32
[Rollbar] Reporting exception: Received fatal alert: bad_record_mac (IOError)
[Rollbar] Exception not reported because Rollbar is disabled

Please let me know if there is anything I can do to provide more information. Unfortunately, it is difficult to provide a good reproduction, because access to our S3 bucket is restricted, and there may even be network errors in play. Notably, though, I cannot reproduce this with MRI 1.9.3 or 2.0.0, while it happens very consistently with JRuby (at least every time the file is larger than the threshold).
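
To see how much was written before the alert arrived, the debug output above can be tallied with a small script. This is only a sketch: the two regular expressions are assumptions based on the line shapes visible in this log (`main, WRITE: TLSv1 Application Data, length = N` and `RECV TLSv1 ALERT:  fatal, ...`), the function name is made up for illustration, and the lengths are taken as printed by the JSSE debug output without any claim about how they map to payload bytes.

```ruby
# Tally TLS records from `-J-Djavax.net.debug=ssl` output.
# Assumed line shapes (taken from the log in this report):
#   main, WRITE: TLSv1 Application Data, length = 15846
#   main, RECV TLSv1 ALERT:  fatal, bad_record_mac
RECORD_LINE = /\A\S+, (READ|WRITE): (\S+) (.+?), length = (\d+)\s*\z/
ALERT_LINE  = /RECV \S+ ALERT:\s+(\w+), (\w+)/

# Hypothetical helper: counts outgoing application-data records and
# stops at the first received alert.
def summarize_ssl_debug(lines)
  summary = { :records => 0, :bytes_written => 0, :alert => nil }
  lines.each do |line|
    if (m = RECORD_LINE.match(line))
      direction, _protocol, type, length = m.captures
      next unless direction == 'WRITE' && type == 'Application Data'
      summary[:records] += 1
      summary[:bytes_written] += length.to_i
    elsif (m = ALERT_LINE.match(line))
      summary[:alert] = m.captures.join(' ') # e.g. "fatal bad_record_mac"
      break
    end
  end
  summary
end

# Usage sketch (the file name is hypothetical):
#   p summarize_ssl_debug(File.foreach('ssl-debug.log'))
```

Comparing the totals from a failing and a succeeding upload may help narrow down the size threshold mentioned above.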

@simonask
Author

This seems related to #1080. Feel free to mark either as a duplicate if it turns out to be the same issue.

@simonask
Author

@aarti Correct me if I'm wrong, but it seems to me that you're getting a different error? "Broken pipe (IOError)" is not the same thing as "bad_record_mac (SSLError)" — the first indicates a transport layer error, while the latter indicates a protocol error (supposedly).

The "bad_record_mac" error is never supposed to happen for correct implementations of the SSL protocol, which is why I'm suspecting a bug in something deeper than a high-level gem like fog.

@gmanfunky

Another easy test-case.

jruby -ropen-uri -e "Thread.abort_on_exception = true ; 10.times { Thread.new { loop { open('https://rubygems.org/gems/berkshelf-3.1.1.gem').read; p :ok } } }; sleep"

Eventually fails on linux or ruby with jruby 1.7.11 and 1.7.12

This feels like it might be something related to jruby's re-implementation of openssl? I'm not sure how that works.

I'm not sure why it is named "openssl". Is it a rewrite of OpenSSL in Java, or am I missing something? I know the "openssl gem" would have to be reimplemented, but I see all the logic in Java instead of C, which makes me wonder about this rewrite.
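
Because the one-liner aborts on the first exception, it does not show how often the failure occurs. A possible variation is sketched below: it runs a block from several threads and counts outcomes by exception class. The `hammer` name and its defaults are invented for this example, and the network call is left commented out so the helper itself can be tried offline.

```ruby
require 'thread'

# Hypothetical helper: run the given block `iterations` times in each of
# `thread_count` threads and count the outcomes instead of aborting.
def hammer(thread_count = 10, iterations = 20)
  counts = Hash.new(0)
  lock = Mutex.new
  workers = Array.new(thread_count) do
    Thread.new do
      iterations.times do
        outcome = begin
          yield
          :ok
        rescue StandardError => e
          e.class.name # e.g. "OpenSSL::SSL::SSLError" or "IOError"
        end
        lock.synchronize { counts[outcome] += 1 }
      end
    end
  end
  workers.each(&:join)
  counts
end

# Usage sketch with the URL from the one-liner above (needs network access):
#   require 'open-uri'
#   p hammer { open('https://rubygems.org/gems/berkshelf-3.1.1.gem').read }
```

Positional arguments are used instead of keyword arguments so the sketch should also load in JRuby 1.7's 1.9 mode.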

@gmanfunky

I was able to reproduce the fail cases using nginx-1.0.15-5.el6.x86_64 ssl server on an old centos 6.2 image with openssl-1.0.0-20.el6_2.5.x86_64

@kgx

kgx commented May 3, 2014

@gmanfunky It appears that this issue is tied to the Oracle JDK. If you have the option to use OpenJDK in your production environment I would recommend that.

@gmanfunky

Yes, I agree that it does appear to be an issue when using the Oracle JDK. But it is a gnarly enough intermittent bug that I think JRuby might want to help identify a better test case and a possible workaround, and ultimately push for a fix upstream.

While I can set our JRuby to use OpenJDK in some cases, this bug will still manifest in an unrelated OracleJava-SSL-Client->nginx process as well.

@kgx

kgx commented May 3, 2014

Yeah, this is definitely a big problem. I still have intermittent failures on my Mac clients, as there is not yet a suitable OpenJDK build for Mac OS.

@mkristian
Member

I wanted to see if this got fixed by the current jruby-openssl refactoring, where we fix at least one Oracle JDK related issue, but I cannot reproduce it with jruby-1.7.11 or jruby-1.7.12 in combination with open-jdk-7, oracle-7-jdk, or oracle-8-jdk on the latest Ubuntu.

If someone is willing to check out the new jruby-openssl gem, I could set up a branch on github.com/jruby/jruby which builds a JRuby with it.

@kgx

kgx commented May 7, 2014

@mkristian @gmanfunky @headius Hooray! I can no longer recreate this problem using jruby-openssl-0.9.5.dev from latest jruby/jruby commit on master. I am running jruby-1.7.12 on Oracle JDK 7u55 on OS X, after upgrading everything to latest.

It is worth mentioning that I was still experiencing intermittent SSL failures with latest jruby-openssl until I upgraded both JRuby (from 1.7.8) and the JDK (from 1.7u45), so something must have also changed elsewhere.

@mkristian
Member

@kares FYI, your refactoring has already done some good ;)

@gmanfunky since you were seeing this error as well, could you also try the jruby-openssl from ext/openssl on jruby master?

@kares
Member

kares commented May 10, 2014

@mkristian oh yeah ... the "big OpenSSL" cleanup was definitely worth it :) ... thanks!

@jgwmaxwell

I'm still experiencing this on JRuby 1.7.13 and Oracle 7u65 64bit on Mavericks. I've tried pushing the version of JRuby OpenSSL to 0.9.6.dev, but this still errors on all uploads over 128KB. Has anyone else continued to experience this, or is this something environmental (in a more specific way than 'OS X')?

@rtyler

rtyler commented Jan 14, 2015

We're seeing what may be the same issue in JRuby 1.7.18 running on JDK 7u40. @mkristian I've created an internal ticket with some production stack traces and links if you'd like to take a look at that.

@rtyler rtyler added this to the JRuby 1.7.19 milestone Jan 14, 2015
@rtyler rtyler removed this from the JRuby 1.7.19 milestone Jan 14, 2015
@rtyler rtyler added the openssl label Jan 15, 2015
@headius
Member

headius commented Jan 26, 2015

@rtyler I'm disappointed to hear you might have the same issue. Please confirm that it's literally the same "bad_record_mac" bug. We have not had other reports on recent JDKs, so I start to twitch when I hear this might still be a bug.

Note also jruby/jruby-openssl#12, #1874, and other issues that involve updating our list of ciphers and default protocols.
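
When confirming whether a report is the same bug, it may help to attach the exact runtime combination, since the reports in this thread differ mainly by JRuby version and JDK update. The snippet below is a minimal sketch: `environment_report` and `java_update` are names made up here, `ENV_JAVA` is only available under JRuby (hence the guard), and the update parser assumes version strings shaped like the `1.7.0_45-b18` shown in the original report.

```ruby
# Extract the update number from a Java version string such as
# "1.7.0_45-b18" (assumed shape). Returns nil when it does not match.
def java_update(version)
  m = /\A1\.\d+\.\d+_(\d+)/.match(version.to_s)
  m && m[1].to_i
end

# Hypothetical helper: collect the versions relevant to this issue.
# Under JRuby, ENV_JAVA exposes the Java system properties; elsewhere
# an empty hash is used so the method still runs.
def environment_report(java_props = (defined?(ENV_JAVA) ? ENV_JAVA : {}))
  {
    'ruby_engine'  => defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby',
    'ruby_version' => RUBY_VERSION,
    'jruby'        => defined?(JRUBY_VERSION) ? JRUBY_VERSION : nil,
    'java_version' => java_props['java.version'],
    'java_vendor'  => java_props['java.vendor'],
    'java_update'  => java_update(java_props['java.version'])
  }
end

# Usage sketch:
#   environment_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```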

@mkristian
Member

It is seen with jdk-1.7u40 and not with jdk-1.7u72: the tests showed it is the same "bad_record_mac" error, and it CAN be fixed by updating the JDK to the latest version.

The state of jossl (jruby-openssl) does not look good, IMO:

{
  "given_cipher_suites": [
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_DSS_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_DSS_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
    "TLS_RSA_WITH_RC4_128_SHA",
    "TLS_RSA_WITH_RC4_128_MD5",
    "TLS_RSA_WITH_DES_CBC_SHA",
    "TLS_DHE_RSA_WITH_DES_CBC_SHA"
  ],
  "ephemeral_keys_supported": true,
  "session_ticket_supported": false,
  "tls_compression_supported": false,
  "unknown_cipher_suite_supported": false,
  "beast_vuln": true,
  "able_to_detect_n_minus_one_splitting": true,
  "insecure_cipher_suites": {
    "TLS_DHE_RSA_WITH_DES_CBC_SHA": [
      "uses keys smaller than 128 bits in its encryption"
    ],
    "TLS_RSA_WITH_DES_CBC_SHA": [
      "uses keys smaller than 128 bits in its encryption"
    ]
  },
  "tls_version": "TLS 1.0",
  "rating": "Bad"
}

in comparison to MRI:

{
  "given_cipher_suites": [
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384",
    "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_DSS_WITH_AES_256_GCM_SHA384",
    "TLS_DHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA256",
    "TLS_DHE_DSS_WITH_AES_256_CBC_SHA256",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_DSS_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA",
    "TLS_DHE_DSS_WITH_CAMELLIA_256_CBC_SHA",
    "TLS_ECDH_anon_WITH_AES_256_CBC_SHA",
    "TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384",
    "TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384",
    "TLS_ECDH_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_RSA_WITH_AES_256_CBC_SHA256",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_CAMELLIA_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
    "TLS_ECDH_anon_WITH_3DES_EDE_CBC_SHA",
    "TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_DSS_WITH_AES_128_GCM_SHA256",
    "TLS_DHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA256",
    "TLS_DHE_DSS_WITH_AES_128_CBC_SHA256",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_DSS_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA",
    "TLS_DHE_DSS_WITH_CAMELLIA_128_CBC_SHA",
    "TLS_ECDH_anon_WITH_AES_128_CBC_SHA",
    "TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256",
    "TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256",
    "TLS_ECDH_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_CAMELLIA_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_SEED_CBC_SHA",
    "TLS_DHE_DSS_WITH_SEED_CBC_SHA",
    "TLS_RSA_WITH_SEED_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_RC4_128_SHA",
    "TLS_ECDHE_ECDSA_WITH_RC4_128_SHA",
    "TLS_ECDH_anon_WITH_RC4_128_SHA",
    "TLS_ECDH_RSA_WITH_RC4_128_SHA",
    "TLS_ECDH_ECDSA_WITH_RC4_128_SHA",
    "TLS_RSA_WITH_RC4_128_SHA",
    "TLS_RSA_WITH_RC4_128_MD5",
    "TLS_DHE_RSA_WITH_DES_CBC_SHA",
    "TLS_DHE_DSS_WITH_DES_CBC_SHA",
    "TLS_RSA_WITH_DES_CBC_SHA",
    "TLS_EMPTY_RENEGOTIATION_INFO_SCSV"
  ],
  "ephemeral_keys_supported": true,
  "session_ticket_supported": true,
  "tls_compression_supported": false,
  "unknown_cipher_suite_supported": false,
  "beast_vuln": false,
  "able_to_detect_n_minus_one_splitting": false,
  "insecure_cipher_suites": {
    "TLS_DHE_DSS_WITH_DES_CBC_SHA": [
      "uses keys smaller than 128 bits in its encryption"
    ],
    "TLS_DHE_RSA_WITH_DES_CBC_SHA": [
      "uses keys smaller than 128 bits in its encryption"
    ],
    "TLS_ECDH_anon_WITH_3DES_EDE_CBC_SHA": [
      "is open to man-in-the-middle attacks because it does not authenticate the server"
    ],
    "TLS_ECDH_anon_WITH_AES_128_CBC_SHA": [
      "is open to man-in-the-middle attacks because it does not authenticate the server"
    ],
    "TLS_ECDH_anon_WITH_AES_256_CBC_SHA": [
      "is open to man-in-the-middle attacks because it does not authenticate the server"
    ],
    "TLS_ECDH_anon_WITH_RC4_128_SHA": [
      "is open to man-in-the-middle attacks because it does not authenticate the server"
    ],
    "TLS_RSA_WITH_DES_CBC_SHA": [
      "uses keys smaller than 128 bits in its encryption"
    ]
  },
  "tls_version": "TLS 1.2",
  "rating": "Bad"
}

produced by https://gist.github.com/cscotta/8302049
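
The two reports can also be compared mechanically. The sketch below assumes each report is available as a JSON string with a `given_cipher_suites` array, as shown above. The name-based patterns are a heuristic that happens to reproduce the `insecure_cipher_suites` flags in these two reports; they are not taken from the gist or the service behind it, and the function names are invented for this example.

```ruby
require 'json'

# Heuristic patterns; the reasons are copied from the reports above.
WEAK_PATTERNS = {
  /_anon_/    => 'is open to man-in-the-middle attacks because it does not authenticate the server',
  /_DES_CBC_/ => 'uses keys smaller than 128 bits in its encryption'
}

# Return { suite => [reasons] } for every suite whose name matches a pattern.
# Note that "_3DES_EDE_CBC_" does not contain "_DES_CBC_", so 3DES is not flagged.
def flag_insecure(suites)
  suites.each_with_object({}) do |suite, flagged|
    reasons = WEAK_PATTERNS.select { |pattern, _reason| suite =~ pattern }.values
    flagged[suite] = reasons unless reasons.empty?
  end
end

# Show which suites appear in only one of the two JSON reports.
def cipher_diff(jruby_json, mri_json)
  jruby = JSON.parse(jruby_json).fetch('given_cipher_suites')
  mri   = JSON.parse(mri_json).fetch('given_cipher_suites')
  { 'only_mri' => mri - jruby, 'only_jruby' => jruby - mri }
end

# Usage sketch (file names are hypothetical):
#   diff = cipher_diff(File.read('jruby.json'), File.read('mri.json'))
#   puts diff['only_mri']
```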

@headius
Member

headius commented Jan 27, 2015

@mkristian Definitely needs some love to bring it up to date :-(

@enebo
Member

enebo commented May 17, 2017

@mkristian Is this an issue in 9.x?

@mkristian
Member

@enebo First, it is more an issue with jruby-openssl: the script from above (https://gist.github.com/cscotta/8302049) still produces a rather small list of ciphers on current jruby master. The original bad_record_mac error was a JDK problem, and we can consider it fixed by now.
