Java Integration conversions #3105

Closed
InfraRuby opened this issue Jul 5, 2015 · 9 comments

@InfraRuby

Java Integration converts java.lang.String <-> String and java.math.BigInteger <-> Integer.

These conversions violate Java and Ruby conventions for object identity:

a_string = "a ruby string"
Java::java.lang.System.identityHashCode(a_string) == Java::java.lang.System.identityHashCode(a_string) # => false

map = Java::java.util.IdentityHashMap.new
key = "a ruby string"
map.put(key, "value")
map.get(key) # => nil

m = Java::java.lang.Object[1].new
m[0] = "a ruby string"
m[0].equal?(m[0]) # => false
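
The identity loss above can be reproduced without JRuby. The following is a minimal pure-Ruby sketch, assuming each boundary crossing builds a fresh object; `convert_model` and `identity_lookup` are hypothetical names, and `dup` merely stands in for the real conversion:

```ruby
# Stand-in for the Ruby-to-Java string conversion: every call yields a new
# object with equal content (an assumption made for illustration only).
def convert_model(ruby_string)
  ruby_string.dup
end

# Stores and looks up a key in an identity-keyed Hash, optionally passing the
# key through the modelled conversion on each access.
def identity_lookup(key, converting:)
  map = {}.compare_by_identity
  stored_key = converting ? convert_model(key) : key
  lookup_key = converting ? convert_model(key) : key
  map[stored_key] = "value"
  map[lookup_key]
end

key = "a ruby string"
p identity_lookup(key, converting: true)   # a fresh copy per access, so the lookup misses
p identity_lookup(key, converting: false)  # the same object both times, so the lookup hits
```

`Hash#compare_by_identity` plays the role of `java.util.IdentityHashMap` here: both compare keys by object identity rather than by content.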

These conversions break some methods:

a_fixnum = 17
Java::java.math.BigInteger.valueOf(a_fixnum).class # => Bignum

These conversions lose information:

m = Java::java.lang.Object[2].new
m[0] = "a ruby string"
m[1] = "a java string".to_java
m[0].class # => String
m[1].class # => String

m = Java::java.lang.Object[2].new
m[0] = "a ruby string in utf-8"
m[1] = "a ruby string in us-ascii".force_encoding(Encoding::US_ASCII)
m[0].encoding # => Encoding::UTF_8
m[1].encoding # => Encoding::UTF_8
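
The encoding loss can be modelled in plain Ruby as well. This is a rough sketch, assuming the value passes through a UTF-16 representation (as Java strings do) and comes back as UTF-8; `round_trip_through_java_model` is a hypothetical helper, not a JRuby API:

```ruby
# Models a String crossing into Java and back: the original encoding tag is
# not carried along, so the result is always tagged UTF-8.
def round_trip_through_java_model(ruby_string)
  ruby_string.encode(Encoding::UTF_16BE).encode(Encoding::UTF_8)
end

ascii_string = "a ruby string in us-ascii".encode(Encoding::US_ASCII)

puts ascii_string.encoding                                 # encoding before the round trip
puts round_trip_through_java_model(ascii_string).encoding  # encoding after the round trip
```

The content survives the round trip, but the encoding tag does not.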

Would you consider a patch to disable these conversions on a per-source-file basis?

@headius
Member

headius commented Jul 7, 2015

We have always wanted to eliminate these conversions, instead making java.lang.String and friends duck type to Ruby String (by defining equivalent versions of all String methods). The problem, however, is that changing this behavior globally would immediately break any code calling java.lang.String-based APIs if we don't coerce automatically.

An alternative might be that we only coerce if we can't fit our RubyString object into the target type; so CharSequence, Object, Serializable argument types would go in as RubyString, but an argument type of java.lang.String would autocoerce. I worry this would be even more confusing, since it would only coerce sometimes.
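
A minimal sketch of that "coerce only when it doesn't fit" rule, written as plain Ruby over type names. The constant and method names are hypothetical, and the list of pass-through targets is taken only from the three types named above:

```ruby
# Target parameter types that could accept a RubyString object as-is,
# according to the comment above (assumed list, not taken from JRuby source).
PASS_THROUGH_TARGETS = %w[
  java.lang.Object
  java.lang.CharSequence
  java.io.Serializable
].freeze

# Decides what would happen to a Ruby String argument for a given Java
# parameter type under the proposed rule.
def string_argument_policy(target_type_name)
  if PASS_THROUGH_TARGETS.include?(target_type_name)
    :pass_through   # RubyString fits the target type, so no conversion
  elsif target_type_name == "java.lang.String"
    :coerce         # RubyString does not fit, so convert automatically
  else
    :incompatible   # not a string-like target in this model
  end
end

%w[java.lang.Object java.lang.String java.util.List].each do |type_name|
  puts "#{type_name}: #{string_argument_policy(type_name)}"
end
```

The same Ruby String ends up converted or not depending on the signature it is passed to, which is the source of confusion mentioned above.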

This is a tough question we've never found a suitable answer for. The large majority of users never notice that Ruby strings and Java strings are being converted automatically for them, because it makes the APIs work smoothly. But then you run into cases like this, where you have identity and information loss, and there's no good solution.

@InfraRuby
Author

We are planning an open-source rewrite of InfraRuby (a compiler and runtime for a statically-typed Ruby). This issue is one area we want to improve in the rewrite.

The compiler currently applies those conversions everywhere except when compiling the runtime. InfraRuby is designed so that InfraRuby code runs on Ruby interpreters without modification, so compiled code has to behave as it would under JRuby. The runtime is the exception: it doesn't need to run on Ruby interpreters, and those conversions would break a lot of it (for the reasons given earlier).

Unfortunately, outside the runtime, the compiler must apply conversions (because JRuby does) and, for example, we can't implement generic containers backed by java.lang.Object?[] (because we can't do that in JRuby).

As part of the rewrite, InfraRuby source files will use ".ir" as the file extension.

JRuby could use file extensions and/or magic comments to select the behavior for Java Integration in each source file. A command-line option could enable warnings for conversions applied in source files using the default behavior.

This approach would give JRuby developers a testing ground for alternative behaviors and a migration path if one alternative is preferred to the current behavior.

@InfraRuby
Author

@headius I'm closing this issue as another approach is possible, with a JRuby extension: the infraruby-java gem (MIT license) adds JAVA, which is similar to Java in JRuby, but without those conversions.

Thanks for JRuby; your work is excellent!

@kares
Member

kares commented Mar 17, 2016

@InfraRuby I might be interested in seeing some of your work and maybe doing something similar with a --flag (off by default) to get better JI as an experiment under JRuby. Is there a public source repo?

@headius
Member

headius commented Mar 17, 2016

Another little bit of information...

We have always had some concerns about doing this automatic translation, mostly on the Java-to-Ruby side of things, since often you really, really want the actual Java object. For Ruby-to-Java some of these conversions are necessary, like being able to pass a Ruby string for a Java string, but that's harder to avoid.

One thought that @enebo and I had for years was to see how far we could get just making the Java types duck-type as the Ruby versions. So in that case, when you call a Java method and get back a java.lang.String, it's really, truly the actual Java string object, but we add on methods that Ruby code expects to see. If it walks like a String and quacks like a String...
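
A rough pure-Ruby sketch of that duck-typing idea. `DuckJavaString` is a hypothetical stand-in for `java.lang.String` (it is not a JRuby class), and the module shows only a handful of the Ruby methods that would need equivalents:

```ruby
# Stand-in for java.lang.String, exposing Java-style method names only.
class DuckJavaString
  def initialize(text)
    @text = text.dup.freeze
  end

  def length
    @text.length
  end

  def charAt(index)
    @text[index]
  end

  def toUpperCase
    DuckJavaString.new(@text.upcase)
  end

  def toString
    @text
  end
end

# Ruby-flavoured methods layered on top, so the object quacks like a String
# while remaining the same underlying object (no conversion, no copy).
module RubyStringDuck
  def size
    length
  end

  def [](index)
    charAt(index)
  end

  def upcase
    toUpperCase
  end

  def to_s
    toString
  end

  # Implicit conversion hook, used by Ruby when a real String is required.
  def to_str
    toString
  end
end

DuckJavaString.include(RubyStringDuck)

java_like = DuckJavaString.new("quack")
puts java_like.size
puts java_like.upcase
puts "duck says " + java_like
```

Because nothing is converted, `java_like.equal?(java_like)` holds and the object can still be handed back to Java-side code unchanged.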

@headius
Member

headius commented Mar 17, 2016

Also, I think we'd all agree that preserving identity on numeric objects is a very risky proposition, so JRuby passing Fixnum by converting to one of the primitive Java integer types is pretty benign. The conversions that have always bothered me are the Java-to-Ruby "normal" objects like Strings (and I guess String is really the primary one...we don't convert many other things).

@InfraRuby
Author

@kares now at https://github.com/InfraRuby/infraruby-java

The extension also adds Byte Char Int16 Int32 Int64 Float32 Float64 classes to wrap primitive values, consumed and produced by JAVA methods (and arrays):

JAVA::java.lang.Integer.compare(7.i32, 8.i32).class # => Int32

m = JAVA::byte[1].new
m[0] = 3.byte
m[0].class # => Byte
m.length.class # => Int32

Taking the examples in the issue, we now have no identity loss:

a_string = "a ruby string"
JAVA::java.lang.System.identityHashCode(a_string) == JAVA::java.lang.System.identityHashCode(a_string) # => true

map = JAVA::java.util.IdentityHashMap.new
key = "a ruby string"
map.put(key, "value")
map.get(key) # => "value"

m = JAVA::java.lang.Object[1].new
m[0] = "a ruby string"
m[0].equal?(m[0]) # => true

... no unwanted conversions:

JAVA::java.math.BigInteger.valueOf(17.i64).class # => JAVA::java.math.BigInteger

... and no information loss:

m = JAVA::java.lang.Object[2].new
m[0] = "a ruby string"
m[1] = "a java string".j
m[0].class # => String
m[1].class # => JAVA::java.lang.String

m = JAVA::java.lang.Object[2].new
m[0] = "a ruby string in utf-8"
m[1] = "a ruby string in us-ascii".force_encoding(Encoding::US_ASCII)
m[0].encoding # => Encoding::UTF_8
m[1].encoding # => Encoding::US_ASCII

@InfraRuby
Author

@headius there are three options for Ruby-to-Java conversions:

  • JI should always convert, and we have identity and information loss.
  • JI should sometimes convert (when calling a method that takes a java.lang.String), and we have some confusion.
  • JI should never convert, and we have some clutter in the source to convert explicitly.

InfraRuby takes the third option: infraruby-java adds String#j (and String#to_j) to convert to a JAVA::java.lang.String if that's what you need/want.
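
To make the third option concrete, here is a small pure-Ruby model of "never convert". All names are hypothetical: `ModelJavaString` stands in for `JAVA::java.lang.String`, `to_model_java` plays the role of `String#j`, and `load_class_model` imitates a Java method whose parameter type is `java.lang.String`:

```ruby
# Stand-in for a Java string; stores the text in a UTF-16 form.
class ModelJavaString
  attr_reader :value

  def initialize(ruby_string)
    @value = ruby_string.encode(Encoding::UTF_16BE)
  end
end

# Explicit conversion, written out by the caller where a Java string is wanted.
def to_model_java(ruby_string)
  ModelJavaString.new(ruby_string)
end

# Under "never convert", a parameter typed as a Java string rejects a Ruby
# String instead of silently coercing it.
def load_class_model(name)
  unless name.is_a?(ModelJavaString)
    raise TypeError, "expected ModelJavaString, got #{name.class}"
  end
  "loaded #{name.value.encode(Encoding::UTF_8)}"
end

puts load_class_model(to_model_java("some.Class"))

begin
  load_class_model("some.Class")
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```

The cost is the extra conversion call at each call site; the benefit is that the type seen on the other side never depends on the method signature.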

@InfraRuby
Author

@headius also this:

class CL1 < Java::java.lang.ClassLoader
  def findClass(s)
    # is s a String or a Java::java.lang.String?
    puts s.class
    return nil
  end
end

cl1 = CL1.new
cl1.findClass("some.Class".to_java) # s is a Java::java.lang.String
cl1.loadClass("some.Class".to_java) # s is a String

infraruby-java is less confusing here:

class CL2 < JAVA::java.lang.ClassLoader
  def findClass(s)
    # is s a String or a JAVA::java.lang.String?
    puts s.class
    return nil
  end
end

cl2 = CL2.new
cl2.findClass("some.Class".j) # s is a JAVA::java.lang.String
cl2.loadClass("some.Class".j) # s is a JAVA::java.lang.String

but, on the other hand:

cl2.findClass("some.Class") # s is a String
cl2.loadClass("some.Class") # TypeError

@enebo added this to the Invalid or Duplicate milestone Apr 14, 2016