base repository: jruby/jruby
base: 4762cc4b417f
head repository: jruby/jruby
compare: b28adbfa0552
  • 3 commits
  • 9 files changed
  • 1 contributor

Commits on Mar 26, 2015

  1. 63f9178
  2. 20c6a62

Commits on Mar 27, 2015

  1. [Truffle] Moved String#encode out to Rubinius and pulled in String#encode! from Rubinius.
    
    This commit also fixes several issues in our encoding implementation that surfaced through greater usage of Encoding and Encoding::Converter.
    nirvdrum committed Mar 27, 2015
    b28adbf
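
For context, the behaviour this commit targets can be sketched in plain Ruby. The snippet below illustrates how `String#encode` and `String#encode!` are documented to behave in standard Ruby; it is not code from the commit, and the helper names `encode_copy_demo` and `encode_bang_demo` are made up for this example.

```ruby
# Illustration only: helper names are hypothetical, not part of the commit.

# String#encode returns a transcoded copy and leaves the receiver alone.
def encode_copy_demo
  original = "caf\u00e9"                  # UTF-8 because of the \u escape
  copy = original.encode("ISO-8859-1")
  { original: original.encoding, copy: copy.encoding, copy_bytes: copy.bytes }
end

# String#encode! transcodes the receiver in place and returns self.
def encode_bang_demo
  string = "caf\u00e9".dup                # dup so the string is mutable
  result = string.encode!("ISO-8859-1")
  { same_object: result.equal?(string), encoding: string.encoding }
end
```

This mirrors the Ruby implementation added later in the diff, where `encode` is defined as `dup.encode!`, so the copy and the in-place variant share one code path.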

This file was deleted.

This file was deleted.

65 changes: 1 addition & 64 deletions spec/truffle/tags/core/string/encode_tags.txt
@@ -1,79 +1,16 @@
fails:String#encode when passed no options transcodes a 7-bit String despite no generic converting being available
fails:String#encode when passed no options raises an Encoding::ConverterNotFoundError when no conversion is possible
fails:String#encode when passed to encoding transcodes a 7-bit String despite no generic converting being available
fails:String#encode when passed to encoding raises an Encoding::ConverterNotFoundError when no conversion is possible
fails:String#encode when passed to encoding raises an Encoding::ConverterNotFoundError for an invalid encoding
fails:String#encode when passed options does not process transcoding options if not transcoding
fails:String#encode when passed options calls #to_hash to convert the object
fails:String#encode when passed options transcodes to Encoding.default_internal when set
fails:String#encode when passed options raises an Encoding::ConverterNotFoundError when no conversion is possible despite ':invalid => :replace, :undef => :replace'
fails:String#encode when passed to, from transcodes between the encodings ignoring the String encoding
fails:String#encode when passed to, from calls #to_str to convert the from object to an Encoding
fails:String#encode when passed to, options replaces undefined characters in the destination encoding
fails:String#encode when passed to, options replaces invalid characters in the destination encoding
fails:String#encode when passed to, options calls #to_hash to convert the options object
fails:String#encode when passed to, from, options replaces undefined characters in the destination encoding
fails:String#encode when passed to, from, options replaces invalid characters in the destination encoding
fails:String#encode when passed to, from, options calls #to_str to convert the to object to an encoding
fails:String#encode when passed to, from, options calls #to_str to convert the from object to an encoding
fails:String#encode when passed to, from, options calls #to_hash to convert the options object
fails:String#encode given the :xml => :text option replaces all instances of '&' with '&amp;'
fails:String#encode given the :xml => :text option replaces all instances of '<' with '&lt;'
fails:String#encode given the :xml => :text option replaces all instances of '>' with '&gt;'
fails:String#encode given the :xml => :text option replaces undefined characters with their upper-case hexadecimal numeric character references
fails:String#encode given the :xml => :attr option surrounds the encoded text with double-quotes
fails:String#encode given the :xml => :attr option replaces all instances of '&' with '&amp;'
fails:String#encode given the :xml => :attr option replaces all instances of '<' with '&lt;'
fails:String#encode given the :xml => :attr option replaces all instances of '>' with '&gt;'
fails:String#encode given the :xml => :attr option replaces all instances of '"' with '&quot;'
fails:String#encode given the :xml => :attr option replaces undefined characters with their upper-case hexadecimal numeric character references
fails:String#encode when passed options returns a copy when Encoding.default_internal is nil
fails:String#encode when passed options normalizes newlines
fails:String#encode when passed to, from returns a copy when both encodings are the same
fails:String#encode when passed to, from returns the transcoded string
fails:String#encode when passed to, options returns a copy when the destination encoding is the same as the String encoding
fails:String#encode when passed to, from, options returns a copy when both encodings are the same
fails:String#encode! raises ArgumentError if the value of the :xml option is not :text or :attr
fails:String#encode! raises a RuntimeError when called on a frozen String
fails:String#encode! raises a RuntimeError when called on a frozen String when it's a no-op
fails:String#encode! when passed no options transcodes to Encoding.default_internal when set
fails:String#encode! when passed no options transcodes a 7-bit String despite no generic converting being available
fails:String#encode! when passed no options raises an Encoding::ConverterNotFoundError when no conversion is possible
fails:String#encode! when passed to encoding accepts a String argument
fails:String#encode! when passed to encoding calls #to_str to convert the object to an Encoding
fails:String#encode! when passed to encoding transcodes to the passed encoding
fails:String#encode! when passed to encoding transcodes Japanese multibyte characters
fails:String#encode! when passed to encoding transcodes a 7-bit String despite no generic converting being available
fails:String#encode! when passed to encoding raises an Encoding::ConverterNotFoundError when no conversion is possible
fails:String#encode! when passed to encoding raises an Encoding::ConverterNotFoundError for an invalid encoding
fails:String#encode! when passed options does not process transcoding options if not transcoding
fails:String#encode! when passed options calls #to_hash to convert the object
fails:String#encode! when passed options transcodes to Encoding.default_internal when set
fails:String#encode! when passed options raises an Encoding::ConverterNotFoundError when no conversion is possible despite ':invalid => :replace, :undef => :replace'
fails:String#encode! when passed to, from transcodes between the encodings ignoring the String encoding
fails:String#encode! when passed to, from calls #to_str to convert the from object to an Encoding
fails:String#encode! when passed to, options replaces undefined characters in the destination encoding
fails:String#encode! when passed to, options replaces invalid characters in the destination encoding
fails:String#encode! when passed to, options calls #to_hash to convert the options object
fails:String#encode! when passed to, from, options replaces undefined characters in the destination encoding
fails:String#encode! when passed to, from, options replaces invalid characters in the destination encoding
fails:String#encode! when passed to, from, options calls #to_str to convert the to object to an encoding
fails:String#encode! when passed to, from, options calls #to_str to convert the from object to an encoding
fails:String#encode! when passed to, from, options calls #to_hash to convert the options object
fails:String#encode! given the :xml => :text option replaces all instances of '&' with '&amp;'
fails:String#encode! given the :xml => :text option replaces all instances of '<' with '&lt;'
fails:String#encode! given the :xml => :text option replaces all instances of '>' with '&gt;'
fails:String#encode! given the :xml => :text option does not replace '"'
fails:String#encode! given the :xml => :text option replaces undefined characters with their upper-case hexadecimal numeric character references
fails:String#encode! given the :xml => :attr option surrounds the encoded text with double-quotes
fails:String#encode! given the :xml => :attr option replaces all instances of '&' with '&amp;'
fails:String#encode! given the :xml => :attr option replaces all instances of '<' with '&lt;'
fails:String#encode! given the :xml => :attr option replaces all instances of '>' with '&gt;'
fails:String#encode! given the :xml => :attr option replaces all instances of '"' with '&quot;'
fails:String#encode! given the :xml => :attr option replaces undefined characters with their upper-case hexadecimal numeric character references
fails:String#encode! when passed no options returns self when Encoding.default_internal is nil
fails:String#encode! when passed no options returns self for a ASCII-only String when Encoding.default_internal is nil
fails:String#encode! when passed options returns self for ASCII-only String when Encoding.default_internal is nil
fails:String#encode! when passed to encoding returns self
fails:String#encode! when passed to, from returns self
fails:String#encode raises ArgumentError if the value of the :xml option is not :text or :attr
fails:String#encode when passed to encoding transcodes Japanese multibyte characters
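
Several of the tags above cover the `:xml` option. As a rough sketch of what those specs describe, assuming standard Ruby semantics rather than anything specific to this commit, the following shows the `:text` and `:attr` escapes and the `ArgumentError` raised for any other value. The helper names `xml_escape_demo` and `invalid_xml_option?` are hypothetical.

```ruby
# Illustration only: helper names are hypothetical, not part of the commit.

# :text escapes &, < and >; :attr also escapes " and wraps the result in quotes.
def xml_escape_demo(text)
  {
    text: text.encode("UTF-8", xml: :text),
    attr: text.encode("UTF-8", xml: :attr)
  }
end

# Any :xml value other than :text or :attr is rejected with ArgumentError.
def invalid_xml_option?(text)
  text.encode("UTF-8", xml: :other)
  false
rescue ArgumentError
  true
end
```

Calling `xml_escape_demo('a < "b" & c')` leaves the double quotes alone under `:text` and turns them into `&quot;` under `:attr`, which is the distinction the `does not replace '"'` spec checks.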
@@ -15,6 +15,7 @@
import com.oracle.truffle.api.source.SourceSection;

import org.jcodings.Encoding;
import org.jcodings.EncodingDB;
import org.jcodings.transcode.EConv;
import org.jcodings.transcode.Transcoder;
import org.jcodings.transcode.TranscoderDB;
@@ -51,7 +52,7 @@ public InitializeNode(InitializeNode prev) {

@TruffleBoundary
@Specialization
public RubyNilClass initialize(RubyEncodingConverter self, RubyString source, RubyString destination, UndefinedPlaceholder options) {
public RubyNilClass initialize(RubyEncodingConverter self, Object source, Object destination, Object options) {
notDesignedForCompilation();

// Adapted from RubyConverter - see attribution there
@@ -62,38 +63,28 @@ public RubyNilClass initialize(RubyEncodingConverter self, RubyString source, Ru
int[] ecflags = {0};
IRubyObject[] ecopts = {runtime.getNil()};

EncodingUtils.econvArgs(runtime.getCurrentContext(), new IRubyObject[]{getContext().toJRuby(source), getContext().toJRuby(destination)}, encNames, encs, ecflags, ecopts);
final IRubyObject sourceAsJRubyObj = getContext().toJRuby(source);
final IRubyObject destinationAsJRubyObj = getContext().toJRuby(destination);

EncodingUtils.econvArgs(runtime.getCurrentContext(), new IRubyObject[]{sourceAsJRubyObj, destinationAsJRubyObj}, encNames, encs, ecflags, ecopts);
EConv econv = EncodingUtils.econvOpenOpts(runtime.getCurrentContext(), encNames[0], encNames[1], ecflags[0], ecopts[0]);

if (econv == null) {
throw new UnsupportedOperationException();
}

self.setEConv(econv);

return nil();
}

@TruffleBoundary
@Specialization
public RubyNilClass initialize(RubyEncodingConverter self, RubyString source, RubyString destination, RubyHash options) {
notDesignedForCompilation();

// Adapted from RubyConverter - see attribution there

Ruby runtime = getContext().getRuntime();
Encoding[] encs = {null, null};
byte[][] encNames = {null, null};
int[] ecflags = {0};
IRubyObject[] ecopts = {runtime.getNil()};

EncodingUtils.econvArgs(runtime.getCurrentContext(), new IRubyObject[]{getContext().toJRuby(source), getContext().toJRuby(destination)}, encNames, encs, ecflags, ecopts);
EConv econv = EncodingUtils.econvOpenOpts(runtime.getCurrentContext(), encNames[0], encNames[1], ecflags[0], ecopts[0]);

if (econv == null) {
throw new UnsupportedOperationException();
if (!EncodingUtils.DECORATOR_P(encNames[0], encNames[1])) {
if (encs[0] == null) {
encs[0] = EncodingDB.dummy(encNames[0]).getEncoding();
}
if (encs[1] == null) {
encs[1] = EncodingDB.dummy(encNames[1]).getEncoding();
}
}

econv.sourceEncoding = encs[0];
econv.destinationEncoding = encs[1];

self.setEConv(econv);

return nil();
@@ -105,17 +96,23 @@ public RubyNilClass initialize(RubyEncodingConverter self, RubyString source, Ru
@CoreMethod(names = "transcoding_map", onSingleton = true)
public abstract static class TranscodingMapNode extends CoreMethodNode {

@Child private CallDispatchHeadNode upcaseNode;
@Child private CallDispatchHeadNode toSymNode;
@Child private CallDispatchHeadNode newLookupTableNode;
@Child private CallDispatchHeadNode lookupTableWriteNode;

public TranscodingMapNode(RubyContext context, SourceSection sourceSection) {
super(context, sourceSection);
upcaseNode = DispatchHeadNodeFactory.createMethodCall(context);
toSymNode = DispatchHeadNodeFactory.createMethodCall(context);
newLookupTableNode = DispatchHeadNodeFactory.createMethodCall(context);
lookupTableWriteNode = DispatchHeadNodeFactory.createMethodCall(context);
}

public TranscodingMapNode(TranscodingMapNode prev) {
super(prev);
upcaseNode = prev.upcaseNode;
toSymNode = prev.toSymNode;
newLookupTableNode = prev.newLookupTableNode;
lookupTableWriteNode = prev.lookupTableWriteNode;
}
@@ -125,7 +122,8 @@ public RubyHash transcodingMap(VirtualFrame frame) {
List<KeyValue> entries = new ArrayList<>();

for (RubyEncoding e : RubyEncoding.cloneEncodingList()) {
final RubySymbol key = getContext().newSymbol(e.getName());
final Object upcased = upcaseNode.call(frame, getContext().makeString(e.getName()), "upcase", null);
final Object key = toSymNode.call(frame, upcased, "to_sym", null);
final Object value = newLookupTableNode.call(frame, getContext().getCoreLibrary().getLookupTableClass(), "new", null);

final Object tupleValues = new Object[2];
@@ -1124,77 +1124,6 @@ public boolean empty(RubyString string) {
}
}

@CoreMethod(names = "encode", optional = 2)
public abstract static class EncodeNode extends CoreMethodNode {

@Child private ToStrNode toStrNode;
@Child private CallDispatchHeadNode defaultInternalNode;

public EncodeNode(RubyContext context, SourceSection sourceSection) {
super(context, sourceSection);
}

public EncodeNode(EncodeNode prev) {
super(prev);
}

@TruffleBoundary
@Specialization
public RubyString encode(RubyString string, RubyString encoding, @SuppressWarnings("unused") UndefinedPlaceholder options) {
final org.jruby.RubyString jrubyString = getContext().toJRuby(string);
final org.jruby.RubyString jrubyEncodingString = getContext().toJRuby(encoding);
final org.jruby.RubyString jrubyTranscoded = (org.jruby.RubyString) jrubyString.encode(getContext().getRuntime().getCurrentContext(), jrubyEncodingString);

return getContext().toTruffle(jrubyTranscoded);
}

@Specialization
public RubyString encode(RubyString string, RubyString encoding, @SuppressWarnings("unused") RubyHash options) {

// TODO (nirvdrum 20-Feb-15) We need to do something with the options hash. I'm stubbing this out just to get the jUnit mspec formatter running.
return encode(string, encoding, UndefinedPlaceholder.INSTANCE);
}

@TruffleBoundary
@Specialization
public RubyString encode(RubyString string, RubyEncoding encoding, @SuppressWarnings("unused") UndefinedPlaceholder options) {

final org.jruby.RubyString jrubyString = getContext().toJRuby(string);
final org.jruby.RubyString jrubyEncodingString = getContext().toJRuby(getContext().makeString(encoding.getName()));
final org.jruby.RubyString jrubyTranscoded = (org.jruby.RubyString) jrubyString.encode(getContext().getRuntime().getCurrentContext(), jrubyEncodingString);

return getContext().toTruffle(jrubyTranscoded);
}

@Specialization(guards = { "!isRubyString(arguments[1])", "!isRubyEncoding(arguments[1])", "!isUndefinedPlaceholder(arguments[1])" })
public RubyString encode(VirtualFrame frame, RubyString string, Object encoding, UndefinedPlaceholder options) {

if (toStrNode == null) {
CompilerDirectives.transferToInterpreter();
toStrNode = insert(ToStrNodeFactory.create(getContext(), getSourceSection(), null));
}

return encode(string, toStrNode.executeRubyString(frame, encoding), options);
}

@Specialization
public RubyString encode(VirtualFrame frame, RubyString string, @SuppressWarnings("unused") UndefinedPlaceholder encoding, @SuppressWarnings("unused") UndefinedPlaceholder options) {

if (defaultInternalNode == null) {
CompilerDirectives.transferToInterpreter();
defaultInternalNode = insert(DispatchHeadNodeFactory.createMethodCall(getContext()));
}

final Object defaultInternalEncoding = defaultInternalNode.call(frame, getContext().getCoreLibrary().getEncodingClass(), "default_internal", null);

if (defaultInternalEncoding == nil()) {
return encode(string, RubyEncoding.getEncoding("UTF-8"), UndefinedPlaceholder.INSTANCE);
}

return encode(string, (RubyEncoding) defaultInternalEncoding, UndefinedPlaceholder.INSTANCE);
}
}

@CoreMethod(names = "encoding")
public abstract static class EncodingNode extends CoreMethodNode {

@@ -6,20 +6,26 @@
* Eclipse Public License version 1.0
* GNU General Public License version 2
* GNU Lesser General Public License version 2.1
*
* Contains code modified from JRuby's RubyConverter.java
*/
package org.jruby.truffle.nodes.rubinius;

import com.oracle.truffle.api.dsl.Specialization;
import com.oracle.truffle.api.source.SourceSection;
import org.jcodings.Encoding;
import org.jcodings.Ptr;
import org.jcodings.transcode.EConv;
import org.jcodings.transcode.EConvResult;
import org.jruby.Ruby;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.truffle.runtime.RubyContext;
import org.jruby.truffle.runtime.control.RaiseException;
import org.jruby.truffle.runtime.core.RubyArray;
import org.jruby.truffle.runtime.core.RubyBasicObject;
import org.jruby.truffle.runtime.core.RubyEncoding;
import org.jruby.truffle.runtime.core.RubyEncodingConverter;
import org.jruby.truffle.runtime.core.RubyException;
import org.jruby.truffle.runtime.core.RubyHash;
import org.jruby.truffle.runtime.core.RubyString;
import org.jruby.util.ByteList;
@@ -60,11 +66,93 @@ public EncodingConverterPrimitiveConvertNode(EncodingConverterPrimitiveConvertNo
}

@Specialization
public Object encodingConverterPrimitiveConvert(RubyBasicObject encodingConverter, RubyString source,
public Object encodingConverterPrimitiveConvert(RubyEncodingConverter encodingConverter, RubyString source,
RubyString target, int offset, int size, RubyHash options) {
throw new UnsupportedOperationException("not implemented");
}

@Specialization
public Object encodingConverterPrimitiveConvert(RubyEncodingConverter encodingConverter, RubyString source,
RubyString target, int offset, int size, int options) {

// Taken from org.jruby.RubyConverter#primitive_convert.

source.modify();
source.clearCodeRange();

target.modify();
target.clearCodeRange();

final ByteList inBytes = source.getByteList();
final ByteList outBytes = target.getByteList();

final Ptr inPtr = new Ptr();
final Ptr outPtr = new Ptr();

final EConv ec = encodingConverter.getEConv();

final boolean changeOffset = (offset == 0);
final boolean growOutputBuffer = (size == -1);

if (size == -1) {
size = 16; // in MRI, this is RSTRING_EMBED_LEN_MAX

if (size < source.getByteList().getRealSize()) {
size = source.getByteList().getRealSize();
}
}

while (true) {

if (changeOffset) {
offset = outBytes.getRealSize();
}

if (outBytes.getRealSize() < offset) {
throw new RaiseException(
getContext().getCoreLibrary().argumentError("output offset too big", this)
);
}

long outputByteEnd = offset + size;

if (outputByteEnd > Integer.MAX_VALUE) {
// overflow check
throw new RaiseException(
getContext().getCoreLibrary().argumentError("output offset + bytesize too big", this)
);
}

outBytes.ensure((int)outputByteEnd);

inPtr.p = inBytes.getBegin();
outPtr.p = outBytes.getBegin() + offset;
int os = outPtr.p + size;
EConvResult res = ec.convert(inBytes.getUnsafeBytes(), inPtr, inBytes.getRealSize() + inPtr.p, outBytes.getUnsafeBytes(), outPtr, os, options);

outBytes.setRealSize(outPtr.p - outBytes.begin());

source.getByteList().setRealSize(inBytes.getRealSize() - (inPtr.p - inBytes.getBegin()));
source.getByteList().setBegin(inPtr.p);

if (growOutputBuffer && res == EConvResult.DestinationBufferFull) {
if (Integer.MAX_VALUE / 2 < size) {
throw new RaiseException(
getContext().getCoreLibrary().argumentError("too long conversion result", this)
);
}
size *= 2;
continue;
}

if (ec.destinationEncoding != null) {
outBytes.setEncoding(ec.destinationEncoding);
}

return getContext().newSymbol(res.symbolicName());
}
}

}

@RubiniusPrimitive(name = "encoding_converter_putback")
@@ -97,8 +185,16 @@ public EncodingConverterLastErrorNode(EncodingConverterLastErrorNode prev) {
}

@Specialization
public Object encodingConverterLastError(RubyBasicObject encodingConverter) {
throw new UnsupportedOperationException("not implemented");
public Object encodingConverterLastError(RubyEncodingConverter encodingConverter) {
notDesignedForCompilation();

final org.jruby.exceptions.RaiseException e = EncodingUtils.makeEconvException(getContext().getRuntime(), encodingConverter.getEConv());

if (e == null) {
return nil();
}

return getContext().toTruffle(e.getException());
}

}
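
The two primitives above back `Encoding::Converter#primitive_convert` and `#last_error`. A minimal Ruby sketch of how those methods are documented to behave in standard Ruby follows; it is an illustration under that assumption, not a test of the Truffle implementation, and the helper names `convert_demo` and `failed_convert_demo` are hypothetical.

```ruby
# Illustration only: helper names are hypothetical, not part of the commit.

# A successful conversion consumes the source buffer and appends to the
# destination; the status comes back as a symbol.
def convert_demo
  converter = Encoding::Converter.new("UTF-8", "ISO-8859-1")
  source = "caf\u00e9".dup
  destination = String.new
  status = converter.primitive_convert(source, destination)
  { status: status, remaining: source, bytes: destination.bytes }
end

# A character with no ISO-8859-1 mapping stops the conversion;
# last_error then describes the failure as an exception object.
def failed_convert_demo
  converter = Encoding::Converter.new("UTF-8", "ISO-8859-1")
  status = converter.primitive_convert("\u3042".dup, String.new)
  { status: status, error_class: converter.last_error.class }
end
```

The `encode!` implementation later in the diff relies on exactly this pairing: it raises `ec.last_error` whenever the status is anything other than `:finished`.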
@@ -331,8 +331,10 @@ public IRubyObject toJRuby(Object object) {
return toJRuby((RubyString) object);
} else if (object instanceof RubyArray) {
return toJRuby((RubyArray) object);
} else if (object instanceof RubyEncoding) {
return toJRuby((RubyEncoding) object);
} else {
throw getRuntime().newRuntimeError("cannot pass " + object + " to JRuby");
throw getRuntime().newRuntimeError("cannot pass " + object + " (" + object.getClass().getName() + ") to JRuby");
}
}

@@ -349,6 +351,10 @@ public org.jruby.RubyArray toJRuby(RubyArray array) {
return runtime.newArray(store);
}

public IRubyObject toJRuby(RubyEncoding encoding) {
return runtime.getEncodingService().rubyEncodingFromObject(runtime.newString(encoding.getName()));
}

public org.jruby.RubyString toJRuby(RubyString string) {
final org.jruby.RubyString jrubyString = runtime.newString(string.getBytes().dup());

@@ -54,7 +54,9 @@ public void set(ByteList bytes) {
}

public void forceEncoding(Encoding encoding) {
this.bytes.setEncoding(encoding);
modify();
clearCodeRange();
StringSupport.associateEncoding(this, encoding);
clearCodeRange();
}

92 changes: 92 additions & 0 deletions truffle/src/main/ruby/core/rubinius/common/string.rb
@@ -527,6 +527,98 @@ def codepoints
end
end

def encode!(to=undefined, from=undefined, options=undefined)
Rubinius.check_frozen

case to
when Encoding
to_enc = to
when Hash
options = to
to_enc = Encoding.default_internal
when undefined
to_enc = Encoding.default_internal
return self unless to_enc
else
opts = Rubinius::Type::check_convert_type to, Hash, :to_hash

if opts
options = opts
to_enc = Encoding.default_internal
else
to_enc = Rubinius::Type.try_convert_to_encoding to
end
end

case from
when undefined
from_enc = encoding
when Encoding
from_enc = from
when Hash
options = from
from_enc = encoding
else
opts = Rubinius::Type::check_convert_type from, Hash, :to_hash

if opts
options = opts
from_enc = encoding
else
from_enc = Rubinius::Type.coerce_to_encoding from
end
end

if undefined.equal? from_enc or undefined.equal? to_enc
raise Encoding::ConverterNotFoundError, "undefined code converter (#{from} to #{to})"
end

case options
when undefined
options = 0
when Hash
# do nothing
else
options = Rubinius::Type.coerce_to options, Hash, :to_hash
end

if ascii_only? and from_enc.ascii_compatible? and to_enc and to_enc.ascii_compatible?
force_encoding to_enc
elsif to_enc and from_enc != to_enc
ec = Encoding::Converter.new from_enc, to_enc, options
dest = ""
status = ec.primitive_convert self.dup, dest, nil, nil, ec.options
raise ec.last_error unless status == :finished
replace dest
end

# TODO: replace this hack with transcoders
if options.kind_of? Hash
case xml = options[:xml]
when :text
gsub!(/[&><]/, '&' => '&amp;', '>' => '&gt;', '<' => '&lt;')
when :attr
gsub!(/[&><"]/, '&' => '&amp;', '>' => '&gt;', '<' => '&lt;', '"' => '&quot;')
insert(0, '"')
insert(-1, '"')
when nil
# nothing
else
raise ArgumentError, "unexpected value for xml option: #{xml.inspect}"
end

if options[:universal_newline]
gsub!(/\r\n|\r/, "\r\n" => "\n", "\r" => "\n")
end
end

self
end

def encode(to=undefined, from=undefined, options=undefined)
dup.encode! to, from, options
end

def end_with?(*suffixes)
suffixes.each do |original_suffix|
suffix = Rubinius::Type.check_convert_type original_suffix, String, :to_str