Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output unreliable from spawned processes #779

Closed
dmarcotte opened this issue May 30, 2013 · 9 comments
Closed

Output unreliable from spawned processes #779

dmarcotte opened this issue May 30, 2013 · 9 comments

Comments

@dmarcotte
Copy link
Contributor

(This is probably a known issue, but I didn't see a issue for it. I wanted to document it here so that I can refer to in the tags I'm going to add to the specs it affects.)

There seems to be a race to capture the output of spawned processes in the tests. The race is lost much more often on Linux, but can also manifest itself on Mac.

Looking at the code, it smells like this might be due to how the output_to_fd matcher works in combination with the known race conditions which also lead to the inappropriate echild errors. Haven't been able to prove that yet though...

Anyhow, here's a summary of how exactly this manifests in the tests:

Here's a case from Travis where the output goes to the test's stdout rather than being consumed by the test
(from this log). Note how the foo the test is outputting is appended to the test output (element to a Stringfoo) rather than being received by the test matcher, which gets "":

[java] Kernel#spawn with a command array
[java] - does not subject the arguments to shell expansion
[java] - preserves whitespace in passed arguments
[java] - calls #to_str to convert the first element to a String
[java] - calls #to_str to convert the second element to a Stringfoo
[java]  (FAILED - 1)
    ---SNIP---
[java] 1)
[java] Kernel#spawn with a command array calls #to_str to convert the second element to a String FAILED
[java] Expected (STDOUT): "foo\n"
[java]           but got: ""

Alternatively, there are also cases where a previous test's output is still around to be consumed by the next test (from this log). Note how the directory listing which is the output of subjects the specified command to shell expansion (which passed) interferes with the foo that calls #to_str to convert the argument to a String is trying to validate:

[java] Process.spawn with a single argument
[java] - subjects the specified command to shell expansion
[java] - calls #to_str to convert the argument to a String (FAILED - 1)
    ---SNIP---
[java] 1)
[java] Process.spawn with a single argument calls #to_str to convert the argument to a String FAILED
[java] Expected (STDOUT): "foo\n"
[java]          but got: "COPYING LEGAL LICENSE.RUBY README.md Rakefile antlib bench bin build build.xml build_lib cext default.build.properties docs docs.iml findbugs.xml gems install ivy jruby-common.iml jruby-complete.iml jruby-core.iml jruby-dist.iml jruby-rake-plugin.iml jruby-stdlib.iml jruby.bnd.template jruby.iml lib man.iml maven nbproject netbeans-ant.xml pom.xml rakelib rubyspec_temp samples shared.iml spec spi src test tool\nfoo\n"

Hopefully that isn't too verbose and helps a bit... a pull tagging specs which are susceptible to this is on its way.

@atambo
Copy link
Member

atambo commented Aug 3, 2013

Now that rails runs CI tests against JRuby I see that the rails generator tests are hitting this problem. You can see the failures here:

https://travis-ci.org/rails/rails/jobs/9807848

When I run

output = capture(:stdout){ Rails::Generators.invoke :model, [] }
assert_match(/Description:/, output)

in irb things work as expected but when run in those tests some of the stdout is not captured and some is.

I'm guessing this is related because the generators seem to spawn processes

The implementation of capture can be found here:

https://github.com/rails/rails/blob/master/activesupport/lib/active_support/core_ext/kernel/reporting.rb#L67-L81

Not sure if there's maybe some change we can make to capture to be more reliable or if we need to actually fix the subprocess launching in jruby.

@BanzaiMan
Copy link
Member

#967 may also be related.

@BanzaiMan
Copy link
Member

Injecting a little sleep, as short as 50ms, seems effective. In the example in #781, on my MBA, it was able to run through 8000+ iterations without errors.

@recena
Copy link

recena commented Sep 7, 2013

I don't know when I should use Jira or Github. I began to write here.
This delay works fine for this example but not with others. For example, in Redmine (I use it with JRuby + Tomcat) use this method to invoke shell command:

def self.shellout(cmd, options = {}, &block)
  if logger && logger.debug?
    logger.debug "Shelling out: #{strip_credential(cmd)}"
    # Capture stderr in a log file
    if stderr_log_file
      cmd = "#{cmd} 2>>#{shell_quote(stderr_log_file)}"
    end
  end
  begin
    mode = "r+"
    IO.popen(cmd, mode) do |io|
      io.set_encoding("ASCII-8BIT") if io.respond_to?(:set_encoding)
      io.close_write unless options[:write_stdin]
      block.call(io) if block_given?
    end
  ## If scm command does not exist,
  ## Linux JRuby 1.6.2 (ruby-1.8.7-p330) raises java.io.IOException
  ## in production environment.
  # rescue Errno::ENOENT => e
  rescue Exception => e
    msg = strip_credential(e.message)
    # The command failed, log it and re-raise
    logmsg = "SCM command failed, "
    logmsg += "make sure that your SCM command (e.g. svn) is "
    logmsg += "in PATH (#{ENV['PATH']})\n"
    logmsg += "You can configure your scm commands in config/configuration.yml.\n"
    logmsg += "#{strip_credential(cmd)}\n"
    logmsg += "with: #{msg}"
    logger.error(logmsg)
    raise CommandFailed.new(msg)
  end
end

Subversion commands work fine, but with Git doesn't work normally "Bad File Descriptor".

@recena
Copy link

recena commented Sep 8, 2013

I have modified a bit this method:

def self.shellout(cmd, options = {}, &block)
  if logger && logger.debug?
    logger.debug "Shelling out: #{strip_credential(cmd)}"
    # Capture stderr in a log file
    if stderr_log_file
      cmd = "#{cmd} 2>>#{shell_quote(stderr_log_file)}"
    end
  end
  begin
    mode = "r+"
    if block_given?
      IO.popen(cmd, mode, &block)
    else
      `#{git_cmd}`.chomp
    end
  ## If scm command does not exist,
  ## Linux JRuby 1.6.2 (ruby-1.8.7-p330) raises java.io.IOException
  ## in production environment.
  # rescue Errno::ENOENT => e
  rescue Exception => e
    msg = strip_credential(e.message)
    # The command failed, log it and re-raise
    logmsg = "SCM command failed, "
    logmsg += "make sure that your SCM command (e.g. svn) is "
    logmsg += "in PATH (#{ENV['PATH']})\n"
    logmsg += "You can configure your scm commands in config/configuration.yml.\n"
    logmsg += "#{strip_credential(cmd)}\n"
    logmsg += "with: #{msg}"
    logger.error(logmsg)
    raise CommandFailed.new(msg)
  end
end

Everything works fine. I'm a newbie with Ruby but I'm studying the differences and his relation with the JRuby Subprocess.

jbbarth pushed a commit to redmine/redmine that referenced this issue Feb 11, 2014
SCM tests fail randomly due to IO.popen().

See:
jruby/jruby#779

git-svn-id: http://svn.redmine.org/redmine/trunk@12875 e93f8b46-1217-0410-a6f0-8f06a7374b81
@colinsurprenant
Copy link
Contributor

since #781 has been closed in reference to this issue: I have been bit with the #<Errno::EBADF: Bad file descriptor - Bad file descriptor> on close. Here's how to reproduce:

loop do
  puts("#{i}") if (i += 1) % 100 == 0

  f = IO.popen(["echo", "foobar"], "r+")
  f.readlines
  f.close rescue puts($!.inspect)
end

It typically generates a few exceptions every hundred invocations.

changing mode from "r+" to "r" solves the problem.

@matthewbloch
Copy link

This bug (thanks Java!) is the reason Redmine's SCM integration breaks when running under JRuby - http://www.redmine.org/boards/1/topics/38861 . @headius has written about this before but I'm not sure there's much that can help since he wrote http://blog.headius.com/2013/06/the-pain-of-broken-subprocess.html

@jsvd
Copy link
Contributor

jsvd commented Mar 3, 2015

Hit the issue on elasticsearch/logstash again using jruby 1.7.{17,19}.
9.0.0.0pre1 is not affected, from what I tested using @colinsurprenant' script above.
[edit] "a+" is also affected.

@headius
Copy link
Member

headius commented Mar 29, 2015

JRuby 9000 has completely rewritten process-spawning logic to match MRI (direct ports in many cases, but using posix_spawn). I would be surprised if it's not considerably more reliable now since we're routing around the JDK's nonsense.

Going to optimistically call this fixed in 9k.

@headius headius closed this as completed Mar 29, 2015
@headius headius added this to the 9.0.0.0.pre2 milestone Mar 29, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants